Welcome to BLTsville

BLTsville's API is defined in the BLTsville header files. A client must include bltsville.h to access the implementations. This header includes the remaining headers (including ocd.h).

NOTE: The bvinternal.h header is for implementations only and should not be used by clients.

BLTsville has both user mode and kernel mode interfaces. The kernel mode interface is quite similar to (and compatible with) the user mode interface, but due to minor differences and license issues, there are two different sets of header files.

BLTsville was based on a previous closed interface, which had a few implementations and shipped on a few devices. That interface represented the 1.x versions. A lot was learned from that work, and these lessons were used in the founding of BLTsville.

This was the initial release of the user mode interface. This version is not compatible with the 1.x versions. Several minor updates were posted, but the API itself did not change, so no changes to the client or implementation were required.

This is a minor update to the API, and it adds the kernel mode interface. Some additions to the API have been made. Details of the changes are below with their compatibility matrices.

Implementations may be software (CPU) or 2-D hardware, and many may coexist. Each implementation will have an individual entry point, so it can be directly addressed. But there will also be a more general interface for each of these two types of implementations so that system integrators can choose the most appropriate implementation. In other words, the system integrator will choose one software and one 2-D hardware implementation to be the "default" used when a client does not need to choose a specific implementation.

Clients use the standard names below to access the default implementations. The client then imports the pointers to the functions. (The specific name decoration and import method will be dictated by the host Operating System (O/S).) Some examples:

Usually these entry points will be symbolic links (either explicit in systems like Linux which support them, or implicit using a thin wrapper) to the specific implementation. This allows system integrators to connect the client with the most capable implementation available in the system. For example, bltsville_hw2d might be a symbolic link to bltsville_gc2d.

In addition, there may be more implementations co-existing in a given system. These will have additional unique names as determined by the vendors. For example:

In general, each O/S has the ability to manually load a library. This in turn causes a function in the library to be called so the library can perform initialization. Unfortunately, not all O/Ss allow this initialization function to return an error if the initialization fails. Equally unfortunately, it may be necessary for the initialization to be performed in that function. To accommodate this, BLTsville defers the specific initialization to the O/S environment.

The client will call dlopen() to open the library. It will then import the bv_*() functions, and call them as desired. Initialization will occur in association with one or more of these activities. If the initialization fails, the bv_*() functions will return the BVERR_RSRC error, indicating that a required resource was not obtained.
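On Linux, the sequence just described might look like the sketch below. It is only an illustration: the exported symbol names "bv_map", "bv_blt", "bv_unmap", and "bv_cache" are assumed to match the function names used in this document, the pointer types are simplified stand-ins for the ones declared in bltsville.h, and struct demo_bv_entry and demo_bv_load() are hypothetical helpers.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Opaque stand-ins; real clients get these from bltsville.h. */
struct bvbuffdesc;
struct bvbltparams;
struct bvcopparams;

/* Simplified pointer types (assumption: the return type is shown as int). */
typedef int (*demo_map_fn)(struct bvbuffdesc *);
typedef int (*demo_blt_fn)(struct bvbltparams *);
typedef int (*demo_cache_fn)(struct bvcopparams *);

struct demo_bv_entry {
    void *handle;        /* returned by dlopen() */
    demo_map_fn map;
    demo_blt_fn blt;
    demo_map_fn unmap;
    demo_cache_fn cache; /* optional, may remain NULL */
};

/* Opens the library and imports the entry points.
 * Returns 0 on success, -1 if the library or a required symbol is missing. */
int demo_bv_load(const char *libname, struct demo_bv_entry *e)
{
    e->map = NULL;
    e->blt = NULL;
    e->unmap = NULL;
    e->cache = NULL;

    e->handle = dlopen(libname, RTLD_LAZY);
    if (e->handle == NULL)
        return -1;

    /* POSIX idiom for storing a dlsym() result in a function pointer. */
    *(void **)(&e->map) = dlsym(e->handle, "bv_map");
    *(void **)(&e->blt) = dlsym(e->handle, "bv_blt");
    *(void **)(&e->unmap) = dlsym(e->handle, "bv_unmap");
    *(void **)(&e->cache) = dlsym(e->handle, "bv_cache");

    if (e->map == NULL || e->blt == NULL || e->unmap == NULL) {
        dlclose(e->handle);
        e->handle = NULL;
        return -1;
    }
    return 0;
}
```

A successful load says nothing about whether the library initialized; that is only known once a bv_*() call returns something other than BVERR_RSRC.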

Implementations Only

If the library has designated a function with __attribute__ ((constructor)), that function will be called when the library is loaded. Linux implementations may use this function to perform initialization (including opening an interface to an associated kernel module). However, this function cannot return an error, so if the initialization fails, the failure must be recorded. Then, when the client calls any of the bv_*() functions, they should immediately return the BVERR_RSRC error, indicating that the library was unable to initialize (obtain a necessary resource).

Linux implementations may also choose to initialize on the first call to a bv_*() function. Failure is likewise indicated by returning the BVERR_RSRC error.

NOTE: Be careful not to repeatedly attempt initialization when a failure is encountered. Some initializations, and especially initialization failures, can take a long time. This means clients trying to call bv_*() functions (presumably before falling back to alternatives) will be repeatedly penalized if the library can't initialize. Instead, attempt initialization once, and from then on return BVERR_RSRC.
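One way an implementation might record the outcome so that initialization is attempted only once is sketched below. The names are hypothetical: DEMO_ERR_RSRC stands in for BVERR_RSRC, demo_expensive_init() stands in for whatever the library really does (such as opening a kernel module interface), and demo_init_attempts exists only to make the behavior observable. The sketch is not thread-safe; a real library would need locking or something like pthread_once().

```c
#include <stdbool.h>

enum demo_err { DEMO_OK = 0, DEMO_ERR_RSRC = 1 }; /* stand-in for BVERR_RSRC */

enum demo_init_state { DEMO_INIT_UNTRIED, DEMO_INIT_OK, DEMO_INIT_FAILED };

static enum demo_init_state demo_state = DEMO_INIT_UNTRIED;

int demo_init_attempts = 0;       /* counts how often the slow path ran */
bool demo_device_present = false; /* simulates a missing device */

/* Stand-in for a slow initialization that may fail. */
static bool demo_expensive_init(void)
{
    demo_init_attempts++;
    return demo_device_present;
}

/* Attempts initialization on the first call only; later calls reuse the
 * recorded result, whether it was success or failure. */
static enum demo_err demo_ensure_init(void)
{
    if (demo_state == DEMO_INIT_UNTRIED)
        demo_state = demo_expensive_init() ? DEMO_INIT_OK : DEMO_INIT_FAILED;
    return demo_state == DEMO_INIT_OK ? DEMO_OK : DEMO_ERR_RSRC;
}

/* Shape of an entry point: fail fast if the library never initialized. */
enum demo_err demo_bv_blt(void)
{
    enum demo_err err = demo_ensure_init();
    if (err != DEMO_OK)
        return err;
    /* ... the actual BLT would be performed here ... */
    return DEMO_OK;
}
```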

For most kernel space BLTsville clients, only a 2-D hardware implementation will be used. However, both types of implementations are supported. Clients use the standard names below to access the default implementations and obtain pointers to the functions. (The specific method of obtaining the interface will be dictated by the host Operating System (O/S).) Some examples:

These entry points may represent the implementations themselves, but more likely they will link the client to the implementations using more specific names. For example, bv2d_entry() may link the client to gcbv_entry().

In addition, there may be more implementations co-existing in the kernel. These will require additional unique names as determined by the vendors. For example:

BLTsville's interface consists of three or four functions per implementation, which must be imported by the client at run time:

NOTE: If the library failed to initialize, these functions will return BVERR_RSRC, indicating that a required resource was not obtained.

BLTsville does not allocate buffers. Clients must describe a buffer in BLTsville using the bvbuffdesc structure so a given implementation can access the buffer.

bv_map() is used to provide the implementation an opportunity to associate hardware resources with the specified buffer. Most hardware requires this type of mapping, and there is usually appreciable overhead associated with it. By providing a separate call for this operation, BLTsville allows the client to move this overhead to the most appropriate time in its execution.

For a given buffer, the client can call the bv_map() function imported from each implementation to establish the mapping immediately. But this is not required.

As a special bonus, BLTsville clients can call any implementation's bv_map(). This is sufficient to indicate that the client can be trusted to make the corresponding call to bv_unmap() upon destruction of the buffer. Then when a client calls an implementation's bv_blt(), if the mapping needs to be done, it's done at that time. But the mapping is maintained, so that the overhead is avoided on subsequent bv_blt() calls. This lets implementations use lazy mapping only as necessary. If an implementation is not called, the mapping is not done.

Normally, the lowest overhead bv_map() call will be in the CPU-based implementation. So most clients will want to make a single, low overhead bv_map() call to the bltsville_cpu implementation to avoid the mapping/unmapping overhead on each bv_blt() call, while avoiding the mapping overhead when possible.

Calling bv_map() is actually optional prior to calling bv_blt(). However, if it is not called at least once for a given buffer, it must be assumed that bv_unmap() will not be called. So the mapping must be done when bv_blt() is called, and unmapping done when it is complete. This means the overhead will be incurred for every bv_blt() call which uses that buffer.
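The cost difference between the two usage patterns can be illustrated with a toy model. This is not how any real implementation tracks mappings; struct demo_buffer and the demo_* functions are hypothetical, and the model covers a single hardware implementation plus a cheap bv_map() made elsewhere (for example, on the CPU implementation).

```c
#include <stdbool.h>

/* Toy model of one buffer as seen by one hardware implementation. */
struct demo_buffer {
    bool client_mapped; /* bv_map() was called on some implementation */
    bool hw_mapped;     /* this implementation currently holds a mapping */
    int map_ops;        /* number of expensive mapping operations */
};

/* Models a low-overhead bv_map() on another implementation: it only records
 * that the client has promised to call bv_unmap() later. */
void demo_bv_map_cpu(struct demo_buffer *b)
{
    b->client_mapped = true;
}

/* Models bv_blt(): map lazily, and keep the mapping only if the client has
 * promised a later bv_unmap(). */
void demo_bv_blt(struct demo_buffer *b)
{
    if (!b->hw_mapped) {
        b->map_ops++;
        b->hw_mapped = true;
    }
    /* ... the BLT itself would happen here ... */
    if (!b->client_mapped)
        b->hw_mapped = false; /* no promise: must unmap immediately */
}

/* Models bv_unmap(): releases everything; extra calls are harmless. */
void demo_bv_unmap(struct demo_buffer *b)
{
    b->hw_mapped = false;
    b->client_mapped = false;
}
```

In this model, three BLTs on a never-mapped buffer pay for three mappings, while three BLTs after a single demo_bv_map_cpu() pay for one.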

NOTE: Obviously, no API can add capabilities beyond those of an implementation. So, for example, if an implementation requires memory to be allocated from a special pool of memory, that responsibility falls upon the client. The bv_map() function for that implementation will need to check the characteristics of the memory and return an error if it does not meet the necessary criteria.

To clarify, here are some function sequences and the operations associated with them:

The main function of BLTsville is bv_blt(). A bvbltparams structure is passed into bv_blt() to trigger the desired 2-D operation.

bv_unmap() is used to free implementation resources associated with a buffer. Normally, if bv_map() was called for a given buffer, bv_unmap() should be called as well.

For convenience, only one bv_unmap() needs to be called for each buffer, regardless of how many implementations were used, including multiple calls to bv_map().

Also for convenience, bv_unmap() may be called multiple times on the same buffer. Note that only the first call will actually free (all) the associated resources. See the Function Sequences under bv_map() for more details.

Implementations Only

Implementations must ensure that unmapping of buffers which are in use by asynchronous BLTs is appropriately delayed to avoid improper access.

bv_cache() provides manual CPU cache control to maintain cache coherence of surfaces between the CPU and other hardware. The bvcopparams structure provides the information needed to properly manipulate the CPU cache.

This function is optional. If this function fails to import, it means the implementation does not provide it, but bv_map(), bv_blt(), and bv_unmap() may still be used.

In general, this function will be provided with BLTsville implementations which utilize 2-D hardware, even though it manipulates the CPU cache. This is because most systems require a kernel module to manipulate the cache, and this is not always practical to include with a user-mode CPU implementation.

BEWARE: Manipulation of the CPU cache is tricky. Moreover, different CPUs behave differently, so cache manipulation that works on one device may fail on another. Also, mismanaged operation of the cache can have significant impact on overall system performance. And incorrect manipulation of the cache can cause instability or crashes. Please read and understand all of the discussions below before using this function.

bvbltparams is the central structure in BLTsville. This structure holds the details of the BLT being requested by the client.

This member is used to allow backwards and forwards compatibility between versions of BLTsville. It should be set to the sizeof() of the structure by the client or implementation, whichever allocated the structure.

BLTsville is designed to be forwards and backwards compatible between client and library versions. But this compatibility would be eliminated if clients chose to check for a specific version of the BLTsville implementations and fail if the specific version requested was not in place. So, instead of exporting a version number, BLTsville structures use the structsize member to indicate the number of bytes in the structure. This is used to communicate between the client and implementation which portions of the structure exist. This effectively bypasses the concept of a version and focuses on the specifics of what changes need to be considered to maintain compatibility.

If structsize is set to a value that is too small for an implementation, it may return a BVERR_BLTPARAMS_VERS error.
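As an illustration of how a size-based check can replace a version number, the sketch below uses a made-up structure, struct demo_params, rather than the real bvbltparams. The newfield member is hypothetical and represents something appended in a later revision; DEMO_ERR_VERS stands in for BVERR_BLTPARAMS_VERS.

```c
#include <stddef.h>

enum demo_err { DEMO_OK = 0, DEMO_ERR_VERS = 1 }; /* stand-in error codes */

/* Made-up structure: newfield was appended after the first release. */
struct demo_params {
    unsigned int structsize; /* set by whoever allocated the structure */
    unsigned int flags;      /* present in every layout */
    unsigned int newfield;   /* present only in newer layouts */
};

/* True when the caller's structure is large enough to contain the field. */
#define DEMO_HAS_FIELD(p, field) \
    ((p)->structsize >= offsetof(struct demo_params, field) + sizeof((p)->field))

/* Reads the parameters without touching memory the caller did not provide.
 * Older callers get a default of 0 for the newer field. */
enum demo_err demo_read_params(const struct demo_params *p,
                               unsigned int *newfield_out)
{
    if (!DEMO_HAS_FIELD(p, flags))
        return DEMO_ERR_VERS; /* too small to be usable at all */

    *newfield_out = DEMO_HAS_FIELD(p, newfield) ? p->newfield : 0u;
    return DEMO_OK;
}
```

The same check works in both directions: a newer client talking to an older implementation simply supplies bytes the implementation never reads.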

errdesc is optionally used by implementations to pass a 0-terminated string with additional debugging information back to clients for debugging purposes. errdesc is not localized or otherwise meant to provide information that is displayed to users.

Multiple implementations of BLTsville can be combined under managers which can distribute the BLT requests to the implementations based on whatever criteria the manager chooses. This might include availability of the operation, performance, loading, or power state. In such a scenario, the client may need to override or augment the choice made by the manager. This field allows that control.

Note that this feature is extremely complicated, and more detailed documentation needs to be created to allow creation of managers and smooth integration by a client. There are serious issues that must be understood before any manager can be put into place, such as CPU cache coherence and multiple implementation operation interdependence. For now, this field should be set to 0 by clients.

If the implementation cannot respond to the implementation flags set, it may return a BVERR_IMPLEMENTATION error.

The flags member provides the baseline of information to bv_blt() about the type of BLT being requested.

If the flags set are not supported by the implementation, it may return BVERR_FLAGS, or a more specific error code.

The op field of the flags member specifies the type of BLT operation to perform. Currently there are three types of BLT operations defined:

The BVFLAG_KEY_SRC and BVFLAG_KEY_DST enable source and destination color keying, respectively. When either flag is set, the colorkey member of bvbltparams is used.

When BVFLAG_CLIP is set, the cliprect member of bvbltparams is used by the implementation as a limiting rectangle on data written to the destination. See cliprect for details.

Normally, the mask is applied at the destination, after all scaling has been completed (including scaling the mask if necessary). But some environments require that the mask be applied at the sources, before scaling occurs. The BVFLAG_SRCMASK flag requests that the implementation use this method if supported.

Normally, when a source's size does not match the destination, the source is scaled to fill the destination. But when the corresponding BVFLAG_TILE_* flag is set, this behavior is modified.

First, the source's size specifies a tile (or pattern, or brush) to be used to fill the destination. This tile is replicated instead of scaled.

The origin of the source's rectangle is used to locate the tile within a larger surface.

Second, a bvbuffdesc object is no longer supplied by the client in the bvbltparams structure. In its place is a bvtileparams object.
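The replication described above amounts to wrapping destination offsets around the tile size. The sketch below assumes the tile is aligned to the origin of the destination rectangle, which this document does not state; struct demo_rect and demo_tile_coord() are hypothetical names, and the tile width and height are assumed to be non-zero.

```c
/* Same shape as a BLTsville rectangle, redeclared so the sketch stands alone. */
struct demo_rect {
    int left;
    int top;
    unsigned int width;
    unsigned int height;
};

/* For a destination pixel at offset (dx, dy) from the destination rectangle's
 * origin, computes which pixel of the larger source surface is read.
 * The tile rectangle locates the tile within that surface. */
void demo_tile_coord(const struct demo_rect *tile,
                     unsigned int dx, unsigned int dy,
                     int *sx, int *sy)
{
    *sx = tile->left + (int)(dx % tile->width);
    *sy = tile->top + (int)(dy % tile->height);
}
```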

These flags indicate that the corresponding image is flipped horizontally or vertically as it is used by the operation.

The scale and dither types can be specified with an implicit type. The implementation will then convert that internally to an explicit scale or dither type. These flags request that the implementation return the explicit type chosen to the client in the corresponding bvbltparams.scalemode and bvbltparams.dithermode members.

This flag allows the client to inform the implementation that it can queue the requested BLT and return from bv_blt() before it has completed. If this bit is not set, when the bv_blt() returns, the operation is complete.

NOTE: Asynchronous BLTs are performed in the order in which they are submitted within an implementation. This was done to provide a simple dependency mechanism. However, synchronization between implementations must be handled by the client, using the callback mechanism.

NOTE: Since asynchronous BLTs are performed in the order in which they are submitted, it follows that a synchronized BLT after a set of asynchronous BLTs may be used as synchronization as well.

NOTE: Certain situations may require manual synchronization without an associated BLT. Rather than introduce an additional BLTsville function call, the method of handling this will be via a NOP BLT. To accomplish a NOP BLT, the client should issue a BLT using the bvbltparams.op.rop code of 0xAAAA (copy destination to destination), and with the BVFLAG_ASYNC flag not set. Alternatively, the NOP BLT may set the BVFLAG_ASYNC and provide a bvbltparams.callbackfn. To facilitate implementations, a valid destination surface should be specified.

Implementations Only

In general, this BLTsville specification has avoided placing any requirement on implementations for specific operations. However, in support of this special case, support for these NOP BLTs will need to be an implementation requirement.

These flags are used to indicate that the bvbltparams.src2auxdstrect and bvbltparams.maskauxdstrect are to be used. See these entries below for details. These flags are likely to be ignored except for the special case explained below, so they should be used only when necessary.

When BVFLAG_ROP is set in the bvbltparams.flags member, the bvbltparams.op union is treated as rop. Raster OPerations are binary operations performed on the bits of the inputs:

BLTsville's rop element is used to specify a ROP4, but anything from ROP1 up to ROP4 can be defined using this member:

NOTE: By far the most common ROP used will be 0xCCCC, which indicates a simple copy from source 1 to the destination.
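A ROP code can be read as a 16-entry truth table: for each bit position, the input bits form an index, and the bit of the ROP code at that index is the output. The sketch below assumes the index is built as (mask, source 2, source 1, destination) from most to least significant bit. The positions of source 1 and the destination follow from the codes quoted in this document (0xCCCC and 0xAAAA); the positions of source 2 and the mask are assumptions. demo_rop4() is a hypothetical helper operating on 32 bits at a time.

```c
#include <stdint.h>

/* Evaluates a ROP4 code bit by bit on 32-bit words.
 * Index layout (assumed): bit 3 = mask, bit 2 = source 2,
 * bit 1 = source 1, bit 0 = destination. */
uint32_t demo_rop4(uint16_t rop, uint32_t mask, uint32_t src2,
                   uint32_t src1, uint32_t dst)
{
    uint32_t out = 0;
    int bit;

    for (bit = 0; bit < 32; bit++) {
        unsigned int idx = (((mask >> bit) & 1u) << 3) |
                           (((src2 >> bit) & 1u) << 2) |
                           (((src1 >> bit) & 1u) << 1) |
                           ((dst >> bit) & 1u);
        out |= (uint32_t)((rop >> idx) & 1u) << bit;
    }
    return out;
}
```

With this layout, 0xCCCC returns source 1 unchanged, 0xAAAA returns the destination unchanged, 0x8888 is source 1 AND destination, and 0x6666 is source 1 XOR destination.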

bvblend is an enumeration assembled from sets of fields. The values specified may be extended beyond those that are explicitly defined using the definitions in the bvblend.h header file.

The first 4 bits are the format. Currently two format groups are defined, but others can be added. The remainder of the bits are used as defined by the individual format:

To specify the filter, the client fills in filter with one of the bvfilter values.

The format of this pixel matches the surface being keyed, i.e. src1geom.format is the format of the color key if BVFLAG_KEY_SRC is set, or dstgeom.format is the format of the color key if BVFLAG_KEY_DST is set.

When BVFLAG_BLEND is set in the bvbltparams.flags, and when the blend chosen requires it, globalalpha is used to provide an alpha blending value for the entire operation. The type is also dependent on the blend chosen.

For the BVBLENDDEF_FORMAT_CLASSIC blend types, if the BVBLENDDEF_GLOBAL_MASK field is not 0, this field is used. Currently BVBLENDDEF_FORMAT_CLASSIC provides for an 8-bit (unsigned character / byte) format designated by BVBLENDDEF_GLOBAL_UCHAR as well as a 32-bit floating point format designated by BVBLENDDEF_GLOBAL_FLOAT.

This member allows the client to specify the type of scaling to be used. The enumeration begins with 8 bits indicating the vendor. The remaining bits are defined by the vendor. BVSCALEDEF_VENDOR_ALL and BVSCALEDEF_VENDOR_GENERAL are shared by all implementations.

BVSCALEDEF_VENDOR_ALL can be used to specify an implicit scale type. This type is converted to an explicit type by the implementation:

If the client wants to know the explicit type chosen by a given implementation, it can set BVFLAG_SCALE_RETURN in the bvbltparams.flags member, and the explicit scale type is returned in the scalemode member.

NOTE: Extending the BVSCALEDEF_VENDOR_GENERAL scale types or obtaining a vendor ID can be accomplished by submitting a patch.

This member allows the client to specify the type of dithering to be used, when the output format has fewer bits of depth than the internal calculation. The enumeration begins with 8 bits indicating the vendor. The remaining bits are defined by the vendor. BVDITHERDEF_VENDOR_ALL and BVDITHERDEF_VENDOR_GENERAL are shared by all implementations.

BVDITHERDEF_VENDOR_ALL can be used to specify an implicit dither type. This type is converted to an explicit type by the implementation:

If the client wants to know the explicit type chosen by a given implementation, it can set BVFLAG_DITHER_RETURN in the bvbltparams.flags member, and the explicit dither type is returned in the dithermode member.

NOTE: Extending the BVDITHERDEF_VENDOR_GENERAL dither types or obtaining a vendor ID can be accomplished by submitting a patch.

dstdesc is used to specify the destination buffer. If the buffer has not been mapped with a call to bv_map(), bv_blt() will map the buffer as necessary to perform the BLT and then unmap afterwards. See bvbuffdesc for details.

dstgeom is used to specify the geometry of the surface contained in the destination buffer. See bvsurfgeom for details.

These members are used to identify the buffer for the source1, source2, and mask surfaces when the associated BVFLAG_TILE_* flag is not set. The buffer is the memory in which the surface lies. See the bvbltparams.src1/src2/maskgeom for the format and layout/geometry of the surface.

NOTE WELL: Clients should never change the value of a bvbuffdesc structure while a buffer is mapped.

These members are used to identify the buffer for the source1, source2, and mask surfaces when the associated BVFLAG_TILE_* flag is set. The buffer is the memory in which the surface lies. This differs from the src1/src2/mask.desc identity by providing more information needed for tiling and by not requiring mapping (for hardware implementations that support tiling, the tile data is usually moved into an on-chip cache).

These members describe the format and layout/geometry of their respective surfaces. Separating bvsurfgeom from the bvbuffdesc allows easy use of buffers for multiple geometries without remapping. See bvsurfgeom and bvbuffdesc for details.

These members specify the rectangle from which data is read for the BLT. These rectangles are clipped by a scaled version of the bvbltparams.cliprect (scaling is based on the relationship between them and the bvbltparams.dstrect) when BVFLAG_CLIP is set in the bvbltparams.flags member.

This approach allows fractional clipping at the source using a method which is simpler to implement than fractional coordinates.

NOTE: In BLTsville, reading outside the source rectangle is forbidden. So scaling algorithms which require pixels around a particular source pixel must utilize boundary techniques (mirror, repeat, clamp, etc.) at the edges of the source rectangle. However, if the clipping rectangle, when translated back to the source rectangle, leaves space between it and the source rectangle, pixels outside the clipped region may be accessed by the implementation.

cliprect is used to specify a rectangle that limits what region of the destination is written. This is most useful for scaling operations, where the necessary scaling factor will not allow translation of the destination rectangle back to the source on an integer pixel boundary.

For example, if the goal is to show a 640 x 480 video on a 1920 x 1080 screen, the video would be stretched to 1440 x 1080 to maintain the proper aspect ratio. So the relevant rectangles would be:

However, to handle a 640 x 480 pop-up window that appears centered on the screen, in front of the video, the single BLT may be broken into four smaller BLTs pieced around the popup. These rectangles would need to be:

Since this is a scaling factor of 2.25x, translating the required destination rectangles back to the source results in non-integer coordinates and dimensions, as illustrated above. And adjusting the source rectangles to the nearest integer values will result in visible discontinuities at the boundaries between the rectangles.
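The arithmetic can be checked with a small helper. The sketch assumes the 1440 x 1080 video is centered on the screen (so it starts at x = 240) and the pop-up is centered too (so it starts at x = 640, y = 300); demo_maps_to_whole_pixel() is a hypothetical name. An offset within the destination rectangle maps back to a whole source pixel only when offset * source size is evenly divisible by the destination size.

```c
#include <stdbool.h>

/* True when a destination offset (measured from the destination rectangle's
 * origin) translates back to an integer source coordinate. */
bool demo_maps_to_whole_pixel(unsigned int dst_offset,
                              unsigned int src_size,
                              unsigned int dst_size)
{
    return ((unsigned long long)dst_offset * src_size) % dst_size == 0;
}
```

For the pop-up's top edge, 300 * 480 / 1080 = 133.33..., and for its left edge, (640 - 240) * 640 / 1440 = 177.77..., so neither boundary falls on a whole source pixel.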

batchflags are used by the client as a hint to indicate to the implementation which parameters are changing between successive BLTs of a batch. The flags may be used when the bvbltparams.flags has BVFLAG_BATCH_CONTINUE or BVFLAG_BATCH_END set.

NOTE: These flags are hints, and may be used or not by a BLTsville implementation. So if bvbltparams members are changed between BLTs in a batch, but the bvbltparams.batchflags member is not correctly updated, the resulting behavior on different implementations will not be consistent.

This member is used as a batch handle, so that multiple batches can be under construction at the same time.

This member is a pointer to a client-supplied function which is called by the implementation when BVFLAG_ASYNC is set and the BLT is complete. If this member is NULL, no callback is performed. When there is no error, the err parameter will be set to 0.

This member is used as the parameter passed back by the bvbltparams.callbackfn. This can be anything from an identifying index to a pointer used by the client.

These two members are used only when the associated BVFLAG_SRC2/MASK_AUXDSTRECT flags are set. They are only necessary (and should only be used) in the case where scaling of the inputs differs and the entire source images are not being used. bvbltparams.dstrect is always used to specify the destination of source 1 image. When the associated flags are set, these two members are used to specify the destination of the source 2 and mask images, instead of bvbltparams.dstrect.

These flags must be used with the BVFLAG_CLIP flag. And if the resulting clipped destination does not include all enabled destination rectangles, the results are undefined.

Example: We have two images that we want to merge and view on an 854x480 LCD panel. One image is a small background image with 16:9 (64x36) aspect ratio that we want to stretch to fill the screen. The other is a standard definition 720x480 (4:3 aspect ratio) image with transparency we want to blend on top of our background.

(shown actual size)

(shown 1/2x; not adjusted for aspect ratio)

We want to blend the second image onto the center of the first, scaling both, so that it looks like this:

(shown 1/2x)

The screen is effectively a 16:9 aspect ratio (we can ignore the fraction of a pixel here), which matches our background image. So the background image just needs to be scaled from 64x36 to 854x480.

However, since the second image has a 4:3 aspect ratio, it will not cover the entire background image if we want to maintain its aspect ratio. Our second image is not as wide as our 16:9 image, which means its height will match the screen height, but the width will be smaller. Since the screen is 480 lines (pixels) high, to maintain our 4:3 aspect ratio, our second image will need to be 640 pixels wide (4 * 480 / 3). So it will need to be scaled from 720x480 to 640x480.

As we mentioned, we would like to center the 640 pixel image on our 854 pixel wide screen. That means the left edge of the image will be at pixel 107 ( (854 - 640) / 2 ). So the leftmost 107 columns of pixels will just be a copy of the left portion of the background image. Likewise, the rightmost 107 columns will be a copy of the right portion of the background image. Only the middle section should be blended.
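The three regions can be derived from the screen size and the target aspect ratio alone. The helper below is a hypothetical convenience (struct demo_layout and demo_center() are not part of BLTsville) that reproduces the numbers used in this example with integer arithmetic.

```c
/* Horizontal layout of a centered, full-height foreground image. */
struct demo_layout {
    unsigned int fg_width;    /* width of the foreground after scaling */
    unsigned int middle_left; /* x where the blended middle section starts */
    unsigned int right_left;  /* x where the right background strip starts */
    unsigned int right_width; /* width of the right background strip */
};

/* Fits an aspect_w:aspect_h image to the full screen height and centers it. */
struct demo_layout demo_center(unsigned int screen_w, unsigned int screen_h,
                               unsigned int aspect_w, unsigned int aspect_h)
{
    struct demo_layout l;

    l.fg_width = aspect_w * screen_h / aspect_h;
    l.middle_left = (screen_w - l.fg_width) / 2;
    l.right_left = l.middle_left + l.fg_width;
    l.right_width = screen_w - l.right_left;
    return l;
}
```

For an 854x480 screen and a 4:3 image this gives a 640-pixel-wide middle section starting at x = 107 and a 107-pixel-wide right strip starting at x = 747.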

(shown 1/2x)

The side two BLTs are quite easy with BLTsville, by using the clipping rectangle:

However, if we try the same approach with the middle BLT, we run into problems:

bvbltparams.flags = BVFLAG_BLEND | BVFLAG_CLIP;
bvbltparams.op.blend = BVBLEND_SRC1OVER;

bvbltparams.src1.desc = foregnddesc;
bvbltparams.src1geom = foregndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 720;
bvbltparams.src1rect.height = 480;

bvbltparams.src2.desc = bkgnddesc;
bvbltparams.src2geom = bkgndgeom;
bvbltparams.src2rect.left = 0;
bvbltparams.src2rect.top = 0;
bvbltparams.src2rect.width = 64;
bvbltparams.src2rect.height = 36;

bvbltparams.cliprect.left = 107;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 640;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);

(shown 1/2x)

The result is that the foreground image is stretched horizontally. That's because the scaling factor is derived from the source (1) rectangle and the destination rectangle, which is the full width of the screen. Since we were also scaling the background, we set the destination rectangle to cover the screen, as we did in the previous two BLTs.

The edges of our foreground image are also cropped, since we were only modifying the middle of the screen.

What if we change the destination rectangle?

bvbltparams.dstrect.left = 107;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 640;
bvbltparams.dstrect.height = 480;

bv_blt(&bvbltparams);

(shown 1/2x)

Here we get the proper scaling of the foreground image, but the background image is scaled improperly.

What if we adjust the source rectangles? For our purposes, we want all of the foreground image, but we only need the middle of the background image. So we can manually specify the middle of the background image by modifying the source 2 rectangle:

bvbltparams.src2rect.left = 107 * 64 / 854;
bvbltparams.src2rect.width = 640 * 64 / 854;

Nice, but what are those values?

107 * 64 / 854 = 8.0187...
640 * 64 / 854 = 47.9625...

In BLTsville, all rectangle parameters are expressed in integers (this also allows BLTsville to be used in the kernels where floating point variables are not allowed). The clipping rectangle then handles introducing the necessary source pixel subdivision (by translating the clipping rectangle back to the source rectangle in the implementation). So what happens if we actually do use these values as integers?

bvbltparams.src2rect.left = 8;
bvbltparams.src2rect.top = 0;
bvbltparams.src2rect.width = 47;
bvbltparams.src2rect.height = 36;

bv_blt(&bvbltparams);

And this is what we get:

(shown 1/2x)

Closer, but not quite. Rounding the values above to integers still results in visible errors at the boundaries between the middle and the side BLTs (the one on the right is a bit more visible at this reduced size, but if you view the full image, you'll see the left one as well), because the left edge and scaling (and right edge as a result) don't match the alignment and scaling done for the BLTs on the side.
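The size of the mismatch can be estimated with integer arithmetic. Under the side BLTs' scaling (64 source columns onto 854 destination pixels), source column c lands at destination x = c * 854 / 64. The sketch below compares that position with where the middle BLT actually places the same column, working in 1/64ths of a destination pixel to stay in integers; the function name is hypothetical.

```c
/* Difference, in 1/src_w of a destination pixel, between where the middle
 * BLT places a background column (middle_dst_x) and where the side BLTs'
 * scaling (src_w columns onto dst_w pixels) would place it. */
int demo_edge_error(unsigned int src_col, unsigned int middle_dst_x,
                    unsigned int src_w, unsigned int dst_w)
{
    return (int)(middle_dst_x * src_w) - (int)(src_col * dst_w);
}
```

With the rounded values above, the left edge (source column 8 placed at x = 107) is off by 16/64 = 0.25 pixel, while the right edge (source column 8 + 47 = 55 placed at x = 747) is off by 838/64, roughly 13 pixels, which is consistent with the right-hand seam being the more visible one.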

NOTE: This artifact is not always obvious in still images. The images here were chosen to make the artifacts obvious in this documentation. But even if the static images appear correct, movement of the images (e.g. moving the foreground image across the background image) or changes in the blending (e.g. fading the foreground image out and finally removing it), will show these less obvious discrepancies.

This is actually what the clipping rectangle is for. It's meant to allow us to always specify the source and destination rectangles the same, but move the clipping window around on the destination to get just the pixels we want. That way the scaling and alignment are always the same. Unfortunately, for this special case, we really need a way to specify different scaling factors for the different inputs. The src2auxdstrect (and maskauxdstrect, when needed) have been added to provide this capability.

Here is how this set of BLTs can be done:

bvbltparams.flags = BVFLAG_ROP | BVFLAG_CLIP;
bvbltparams.op.rop = 0xCCCC;

bvbltparams.src1.desc = bkgnddesc;
bvbltparams.src1geom = bkgndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 64;
bvbltparams.src1rect.height = 36;

bvbltparams.dstdesc = screendesc;
bvbltparams.dstgeom = screengeom;
bvbltparams.dstrect.left = 0;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 854;
bvbltparams.dstrect.height = 480;

bvbltparams.cliprect.left = 0;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 107;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);

bvbltparams.cliprect.left = 107 + 640;  /* right strip begins at x = 747 */
bv_blt(&bvbltparams);

bvbltparams.flags = BVFLAG_BLEND | BVFLAG_CLIP | BVFLAG_SRC2_AUXDSTRECT;
bvbltparams.op.blend = BVBLEND_SRC1OVER;

bvbltparams.src1.desc = foregnddesc;
bvbltparams.src1geom = foregndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 720;
bvbltparams.src1rect.height = 480;

bvbltparams.dstrect.left = 107;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 640;
bvbltparams.dstrect.height = 480;

bvbltparams.src2.desc = bkgnddesc;
bvbltparams.src2geom = bkgndgeom;
bvbltparams.src2rect.left = 0;
bvbltparams.src2rect.top = 0;
bvbltparams.src2rect.width = 64;
bvbltparams.src2rect.height = 36;

bvbltparams.src2auxdstrect.left = 0;
bvbltparams.src2auxdstrect.top = 0;
bvbltparams.src2auxdstrect.width = 854;
bvbltparams.src2auxdstrect.height = 480;

bvbltparams.cliprect.left = 107;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 640;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);

Using this approach, we get the desired output:

(shown 1/2x)

It may also be clear that in that last BLT, the clip rectangle isn't really necessary. This is good, because it frees up the clipping rectangle to be used to further subdivide the image if necessary (e.g. if partially occluded).

struct bvrect {
    int left;
    int top;
    unsigned int width;
    unsigned int height;
};

This member indicates the left edge of the rectangle, measured in pixels from the left edge of the surface. Note that this value can be negative, indicating that the rectangle begins before the left edge of the surface. However, this is only allowed when a rectangle is clipped to the surface. If, after clipping, the left edge of the rectangle is still negative, this is an error.

This member indicates the top edge of the rectangle, measured in lines of bvsurfgeom.virtstride bytes from the top edge of the surface. Note that this value can be negative, indicating that the rectangle begins before the top edge of the surface. However, this is only allowed when a rectangle is clipped to the surface. If, after clipping, the top edge of the rectangle is still negative, this is an error.

This member indicates the width of the rectangle, measured in pixels. Note that this value cannot be negative. (Horizontal flipping is indicated using the BVFLAG_HORZ_FLIP_* flags.) The value of this member may exceed the width of the associated surface. However, this is only allowed when a rectangle is clipped to the surface. If, after clipping, the right edge of the rectangle still exceeds the width of the surface, this is an error.

This member indicates the height of the rectangle, measured in lines of bvsurfgeom.virtstride bytes. Note that this value cannot be negative. (Vertical flipping is indicated using the BVFLAG_VERT_FLIP_* flags.) The value of this member may exceed the height of the associated surface. However, this is only allowed when a rectangle is clipped to the surface. If, after clipping, the bottom edge of the rectangle still exceeds the height of the surface, this is an error.
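The four rules above reduce to one check once clipping has been applied: the rectangle must lie entirely inside the surface. The following is a minimal sketch of that check, not part of BLTsville. The helper name rect_within_surface() is hypothetical, and the struct is a local copy of the bvrect layout so that the fragment compiles on its own.

```c
#include <stdbool.h>

/* Local copy of the bvrect layout, so the sketch is self-contained. */
struct bvrect {
    int left;
    int top;
    unsigned int width;
    unsigned int height;
};

/* Hypothetical helper (not a BLTsville function): returns true when the
 * rectangle, as it stands after clipping, lies entirely inside a surface
 * of surf_width x surf_height pixels. */
static bool rect_within_surface(const struct bvrect *r,
                                unsigned int surf_width,
                                unsigned int surf_height)
{
    /* A left or top edge that is still negative after clipping is an error. */
    if (r->left < 0 || r->top < 0)
        return false;

    /* Use a wider type so that left + width cannot wrap around. */
    unsigned long long right = (unsigned long long)r->left + r->width;
    unsigned long long bottom = (unsigned long long)r->top + r->height;

    return right <= surf_width && bottom <= surf_height;
}
```

An implementation might run a check like this on the clipped rectangle and report a parameter error when it fails.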

bvcopparams is used to define the cache operation to be performed by bv_cache().

This member is used for compatibility between BLTsville versions. (See bvbltparams.structsize for an explanation.)

This member points to the bvbuffdesc of the surface for which the cache is being manipulated. This buffer should have been mapped with a call to bv_map().

NOTE: Implementations may choose to dynamically map the surface as with bv_blt(), however in many systems, this will not function properly due to dynamic paging which can occur when a surface is not locked.

This member points to the bvsurfgeom of the surface for which the cache is being manipulated.

This member points to the bvrect describing the rectangle of the surface which is being manipulated.

This member specifies the cache operation to be performed. It is an enumeration from the following list:

This structure is used in conjunction with a bvsurfgeom structure to specify the characteristics of a graphic surface. This structure specifies the memory buffer itself.

This member is used for compatibility between BLTsville versions. (See bvbltparams.structsize for an explanation.)

This member is used to indicate the CPU virtual address of the start of the buffer. This value must be provided unless the auxtype/auxptr members below are used. In that case, this member is optional, and the auxptr usually has higher priority than this member.

Implementations Only

Note that this is always the beginning of the buffer. This means that if the bvsurfgeom.virtstride is negative, or the bvsurfgeom.orientation does not normalize to 0º (i.e. orientation % 360 != 0), implementations may need to use a modified version of virtaddr internally to operate correctly.

This member is used by the implementations and should NEVER be manipulated by the client. When the bvbuffdesc structure is created, this member should be set to 0, indicating that no implementations have mapped the buffer. After a buffer has been mapped using a call to bv_map(), this member should be left as-is by clients. (The implementation will set this back to 0 before returning from bv_unmap().)

Implementations Only

This member points to a linked list of bvbuffmap structures associated with the buffer. Each bvbuffmap is added to the list as the buffer is mapped by a given implementation. This may be done with an explicit call to bv_map(), or implicitly with a call to bv_blt(), after a call to bv_map() from a different implementation.

Implementations should not assume that the first entry in the list is their bvbuffmap. Instead, implementations should compare the bv_unmap() pointer in the structure to their own function address.
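The lookup described above can be sketched as a simple list walk. This is an illustration only: struct buffmap_node is a simplified stand-in for bvbuffmap, the link member name next and the unmap_fn typedef are assumptions, and unmap_a()/unmap_b() are dummy functions standing in for two implementations' bv_unmap() entry points.

```c
#include <stddef.h>

struct bvbuffdesc;  /* only used through a pointer in this sketch */

/* Hypothetical stand-in for the bv_unmap() function pointer type. */
typedef int (*unmap_fn)(struct bvbuffdesc *);

/* Simplified stand-in for bvbuffmap; member names are assumptions. */
struct buffmap_node {
    unmap_fn bv_unmap;          /* identifies the owning implementation */
    unsigned long handle;       /* implementation-private data */
    struct buffmap_node *next;  /* next mapping for the same buffer */
};

/* Dummy entry points standing in for two different implementations. */
static int unmap_a(struct bvbuffdesc *desc) { (void)desc; return 0; }
static int unmap_b(struct bvbuffdesc *desc) { (void)desc; return 0; }

/* Walk the list and return the node whose bv_unmap pointer matches the
 * caller's own function, or NULL if this implementation has no mapping. */
static struct buffmap_node *find_own_map(struct buffmap_node *head,
                                         unmap_fn mine)
{
    for (struct buffmap_node *m = head; m != NULL; m = m->next)
        if (m->bv_unmap == mine)
            return m;
    return NULL;
}
```

A NULL result would tell the implementation that it has not mapped the buffer yet and must map it implicitly.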

This member is used to identify the type of additional information about the buffer provided by auxptr. Currently no values are defined for the user mode interface, so it should be initialized to 0 or BVAT_NONE. See the Kernel Mode Interface for details on the values defined for the kernel mode interface.

This member is used to point to additional information about the buffer. The type of this pointer is determined by the auxtype value. Currently there are no types defined for the user mode interface, so this member is ignored. See the Kernel Mode Interface for details on the types defined for the kernel mode interface.

This structure is used in conjunction with a bvbuffdesc structure to specify the characteristics of a graphic surface. This structure specifies the surface geometric characteristics.

NOTE: This structure was separated from bvbuffdesc to afford much flexibility to the client. Using the same bvbuffdesc structure with different bvsurfgeom structures or using the same bvsurfgeom structure with different bvbuffdesc structures may be of benefit. See the examples at the bottom of this section.

struct bvsurfgeom {
        unsigned int structsize;
        enum ocdformat format;
        unsigned int width;
        unsigned int height;
        int orientation;
        long virtstride;
        enum ocdformat paletteformat;
        void *palette;
};

This member is used for compatibility between BLTsville versions. (See bvbltparams.structsize for an explanation.)

This member specifies the width of the surface in pixels. This size does not have to be equivalent to the virtstride size.

Implementations Only

Implementations should never assume that width is equivalent to virtstride.

This member specifies the orientation or angle of the surface in degrees. Since BLTsville is designed only to specify orthogonal rectangles, this value must be a multiple of 90º. This value may be negative. (Extending BLTsville to handle non-orthogonal rectangles may be considered if there is sufficient interest.)

Implementations Only

Implementations should normalize orientation angles. For example, a client that sets the orientation to -450º should behave as if the value of 270º were specified.
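A minimal sketch of the normalization, assuming the angle is held in an int as in the structure above. The function names are hypothetical. C's % operator keeps the sign of the dividend, so negative remainders have to be shifted back into range.

```c
#include <stdbool.h>

/* Hypothetical helper: true when the angle is a multiple of 90 degrees. */
static bool orientation_is_valid(int degrees)
{
    return degrees % 90 == 0;
}

/* Hypothetical helper: map any angle onto the range 0..359. */
static int normalize_orientation(int degrees)
{
    int r = degrees % 360;  /* may be negative for negative input */
    if (r < 0)
        r += 360;
    return r;
}
```

With these definitions, normalize_orientation(-450) yields 270, matching the example above.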

This member specifies the horizontal stride of the surface in bytes for an unrotated surface. The stride represents the number of bytes needed to move from one pixel to the pixel immediately below it. This value may be negative.

NOTE: This means the orientation does not affect the virtstride. However, rotating a surface usually results in a different configuration (i.e. width), which will affect the virtstride. For example, a 320 x 240 x 32 bpp 0º surface might have a virtstride of 1280 bytes (320 pixels/line * 32 bits/pixel / 8 bits/byte). When the orientation is set to 180º, the virtstride would be the same. But when the orientation is set to 90º (or 270º), the virtstride would most likely need to be set to 960 bytes (240 pixels/line * 32 bits/pixel / 8 bits/byte).
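The arithmetic in the note can be written as a small helper. This is a sketch under one assumption: the surface is tightly packed, with no padding at the end of each line. Real surfaces may use a larger stride, which is why width must never be assumed to be equivalent to virtstride. The function name is hypothetical.

```c
/* Hypothetical helper: smallest stride, in bytes, that holds one line of
 * width_pixels pixels at bits_per_pixel, rounded up to a whole byte. */
static long min_virtstride(unsigned int width_pixels,
                           unsigned int bits_per_pixel)
{
    return (long)(((unsigned long)width_pixels * bits_per_pixel + 7) / 8);
}
```

For the 32 bpp example, min_virtstride(320, 32) gives 1280 bytes for the 0º surface and min_virtstride(240, 32) gives 960 bytes for the 90º surface.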

Implementations Only

Implementations that do not support a negative virtstride must compensate using whatever mechanism is appropriate for the implementation. For example, using a vertical flipping/mirroring setting.

NOTE: The virtstride name must be maintained for backwards compatibility. However, no situation should arise where the client would need to provide two different strides for the virtual and physical views of a surface (there are situations where a physical stride will need to be available within the implementation, but the client will not be the one to supply it), so physstride will most likely never be needed. However, when a client provides a physical description of the buffer (see the Kernel Mode Interface section below), the virtstride entry should be used to provide the physical stride.

This member points to a palette used for palettized formats. The format of the palette is specified by the paletteformat member. Palettes are packed based on their container size:

Mixing and matching bvbuffdesc and bvsurfgeom structures provides maximum flexibility for a client.

This structure is used to define the parameters necessary to use a small image as a tile or block that will be repeated when used as a source. This structure is used in conjunction with the associated bvsurfgeom and the associated bvrect to determine the operation that is performed.

struct bvtileparams {
        unsigned int structsize;
        unsigned long flags;
        void *virtaddr;
        int dstleft;
        int dsttop;
        unsigned int srcwidth;
        unsigned int srcheight;
};

This member is used for compatibility between BLTsville versions. (See bvbltparams.structsize for an explanation.)

This member specifies some additional information for the tiling operation. It can be composed as the binary OR of one selection for each edge (left, top, right, and bottom) from the following flags:

This member is used to indicate the CPU virtual address of the start of the buffer.

This member is used to designate the left edge of the location of the tile in the destination for alignment purposes (alignment location). Note that the bvrect of the destination specifies the region which is filled by the tile.

This member is used to designate the top edge of the location of the tile in the destination for alignment purposes (alignment location). Note that the bvrect of the destination specifies the region which is filled by the tile.

This member is used to designate the width of the source for purposes of scaling. The relationship between this field and the bvrect.width of the associated source surface determines the horizontal scaling factor.

This member is used to designate the height of the source for purposes of scaling. The relationship between this field and the bvrect.height of the associated source surface determines the vertical scaling factor.

This structure is used to provide error information to the client of a BLT that failed within an asynchronous operation. The errors will be limited to those that occur within the implementation.

NOTE: Parameter errors should never be returned in this structure. These should have been returned to the client before the BLT was ever initiated.

This member is used for compatibility between BLTsville versions. (See bvbltparams.structsize for an explanation.)

This member is used to indicate the error encountered. In general, these will be errors like these:

Batching is the single most powerful feature in BLTsville. It is used for two major purposes:

NOTE: It is important to realize that BLTs batched together may be done in any order, and in fact may not even be done in the way specified. This includes the BLTs being done as they are submitted, or no operations performed until the batch submission is completed with BVFLAG_BATCH_END. This means the client must not rely on intermediate results within a batch.

NOTE: Because BLTs can be performed in a variety of ways, callbacks for individual BLTs would have no consistent meaning. So, when batching is mixed with BVFLAG_ASYNC, only the callback for the last BLT occurs.

NOTE: Since implementations can perform batched BLTs in a variety of ways, even synchronous batched BLTs can be effectively asynchronous. Therefore, only the last BLT determines the synchronicity of the entire batch. i.e. the BVFLAG_ASYNC flag is only heeded when combined with BVFLAG_BATCH_END.

NOTE: Failure during the performance of a batch (different from an error on submission--indicated by the contents of the bvcallbackerror structure) will result in an unknown state for all destination buffers. Do not assume that a given implementation's state in this case represents the state which will be encountered for a different implementation.

NOTE: Because of the indeterminate nature of the execution of a batch of BLTs, a "batch abort" would not result in a known state either. As stated above, a given implementation may have already performed earlier BLTs in a batch as the batch is submitted. So errors encountered during the submission of a batch must be handled by the client, and then the batch must be terminated normally using BVFLAG_BATCH_END.

Often, groups of similar BLTs are performed, with changes to only a few parameters. Some implementations have the ability to re-use previous settings, coupled with these changes, to perform new BLTs.

One good example of this is in rendering text, similar to that you are reading now. In most systems, a glyph cache is maintained to hold the characters of a given font, rasterized with the specific characteristics desired (e.g. bold, italics, etc.). Each font in the glyph cache is normally created from a vector-based font using a font rasterization engine, such as FreeType. This technology allows fonts to be described in terms of curves and lines instead of pixels, which means they can be created as needed, in any size desirable.

Then, when a character needs to be rendered, it is copied from the pre-rendered glyph cache. This is much more efficient than performing the font rasterization from the vector description each time a character is used.

With some hardware implementations, the setup to trigger the copy of these characters from the glyph cache to the target surface can be quite significant, when compared to the number of pixels actually affected. For example, each character might consist of something on the order of 10 x 14, or about 140 pixels. Programming a typical hardware BLTer may require tens of commands for each character.

But note that each of these BLTs differs by only a few parameters. Specifically, once the source and destination surfaces have been specified, and the operation described, only the source and destination rectangles change between BLTs. To alleviate much of this overhead, most implementations will allow the configuration of a previous BLT to be used again, with only those parameters which change provided for the subsequent BLTs.

For rendering a word using a monospaced font like this, the client might construct the batch like this:

struct bvbuffdesc screendesc = {sizeof(struct bvbuffdesc), 0};
struct bvsurfgeom screengeom = {sizeof(struct bvsurfgeom), 0};
struct bvbuffdesc glyphcachedesc = {sizeof(struct bvbuffdesc), 0};
struct bvsurfgeom glyphcachegeom = {sizeof(struct bvsurfgeom), 0};
struct bvtileparams solidcolortileparams = {sizeof(struct bvtileparams), 0};
struct bvsurfgeom solidcolorgeom = {sizeof(struct bvsurfgeom), 0};

struct bvbltparams bltparams = {sizeof(struct bvbltparams), 0};

int charsperline = 32;
int fontwidth = 10;
int fontheight = 14;
int i = 0;

screendesc.virtaddr = screenaddr;
screendesc.length = screenstride * screenheight;
screengeom.format = OCDFMT_RGB24;
screengeom.width = screenwidth;
screengeom.height = screenheight;
screengeom.virtstride = screenstride;

glyphcachedesc.virtaddr = glyphcacheaddr;
glyphcachedesc.length = glyphcachestride * glyphcacheheight;
glyphcachegeom.format = OCDFMT_ALPHA8;
glyphcachegeom.width = glyphcachewidth;
glyphcachegeom.height = glyphcacheheight;
glyphcachegeom.virtstride = glyphcachestride;

solidcolortileparams.virtaddr = &solidcolor;
solidcolortileparams.srcwidth = 1;
solidcolortileparams.srcheight = 1;
solidcolorgeom.format = OCDFMT_RGB24;

bltparams.flags = BVFLAG_BLEND | BVFLAG_SRC1_TILED | BVFLAG_BATCH_BEGIN;
bltparams.op.blend = BVBLEND_SRC1OVER + BVBLENDDEF_REMOTE;
bltparams.dstdesc = &screendesc;
bltparams.dstgeom = &screengeom;
bltparams.src1.tileparams = &solidcolortileparams;
bltparams.src1geom = &solidcolorgeom;
bltparams.src2.desc = &screendesc;
bltparams.src2geom = &screengeom;
bltparams.mask.desc = &glyphcachedesc;
bltparams.maskgeom = &glyphcachegeom;

bltparams.dstrect.left = bltparams.src2rect.left = screenrect.left;
bltparams.dstrect.top = bltparams.src2rect.top = screenrect.top;

bltparams.maskrect.width = bltparams.dstrect.width = bltparams.src2rect.width = fontwidth;
bltparams.maskrect.height = bltparams.dstrect.height = bltparams.src2rect.height = fontheight;

bltparams.maskrect.left = ((text[i] - ' ') % charsperline) * fontwidth;
bltparams.maskrect.top = ((text[i] - ' ') / charsperline) * fontheight;

bv_blt(&bltparams);

i++;
if(i < textlen)
{
    bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_CONTINUE;
    bltparams.batchflags = BVBATCH_DSTRECT_ORIGIN | BVBATCH_SRC2RECT_ORIGIN | BVBATCH_MASKRECT_ORIGIN;

    do
    {
        bltparams.dstrect.left += fontwidth;
        bltparams.src2rect.left = bltparams.dstrect.left;

        bltparams.maskrect.left = ((text[i] - ' ') % charsperline) * fontwidth;
        bltparams.maskrect.top = ((text[i] - ' ') / charsperline) * fontheight;

        bv_blt(&bltparams);

        i++;
    }while(i < textlen);
}

bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_END;
bltparams.batchflags = BVBATCH_ENDNOP;

bv_blt(&bltparams);

NOTE: bvbltparams.batchflags is just a hint. Not all implementations support deltas in batching, so clients must not change the values of members of bvbltparams (or of structures it references) that are not flagged as changed between BLTs. An implementation may still read and use those values.
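The mask rectangle origin in the batch above comes from a simple grid lookup into the glyph cache. The sketch below isolates that arithmetic. It assumes, as the listing does, that the cache is a grid of fixed-size cells in character-code order starting at the space character. The struct glyph_origin and the function glyph_cell() are hypothetical names used only for illustration.

```c
/* Hypothetical result type: top-left corner of a glyph cell, in pixels. */
struct glyph_origin {
    int left;
    int top;
};

/* Hypothetical helper: locate the cell for character c in a glyph cache
 * laid out as a grid of charsperline cells per row, starting at ' '. */
static struct glyph_origin glyph_cell(char c, int charsperline,
                                      int fontwidth, int fontheight)
{
    int index = c - ' ';  /* position of the glyph within the cache */
    struct glyph_origin o;

    o.left = (index % charsperline) * fontwidth;   /* column within the row */
    o.top = (index / charsperline) * fontheight;   /* row within the cache */
    return o;
}
```

With 32 characters per line and a 10 x 14 font, the glyph for 'A' (index 33) sits at left 10, top 14.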

Enabling special features of some implementations is a special challenge. But BLTsville is up to the task.

For example, perhaps an implementation is capable of blending four layers at the same time. But BLTsville only allows blending to be specified using two layers at a time. How can this be accomplished?

The most prevalent blending reference used is the Porter-Duff whitepaper, which specifies blending of two sources (A and B). So any N-source blend (N > 2) would require the blends to be specified as a grouping of N - 1 two-source blends in order to utilize the Porter-Duff equations. That's how such a blend is specified in BLTsville:

bltparams.dstrect.width = bltparams.src1rect.width = bltparams.src2rect.width = dstgeom.width;
bltparams.dstrect.height = bltparams.src1rect.height = bltparams.src2rect.height = dstgeom.height;

bltparams.flags = BVFLAG_BLEND | BVFLAG_BATCH_BEGIN;
bltparams.op.blend = BVBLEND_SRC1OVER;
bltparams.dstdesc = &dstdesc;
bltparams.dstgeom = &dstgeom;
bltparams.src1.desc = &src1desc;
bltparams.src1geom = &src1geom;
bltparams.src2.desc = &src2desc;
bltparams.src2geom = &src2geom;

bv_blt(&bltparams);

bltparams.src1.desc = &src3desc;
bltparams.src1geom = &src3geom;
bltparams.src2.desc = &dstdesc;
bltparams.src2geom = &dstgeom;

bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_CONTINUE;
bltparams.batchflags = BVBATCH_SRC1 | BVBATCH_SRC2;

bv_blt(&bltparams);

bltparams.src1.desc = &src4desc;
bltparams.src1geom = &src4geom;

bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_END;
bltparams.batchflags = BVBATCH_SRC1;

bv_blt(&bltparams);

The driver for an implementation that can perform this pair of operations as one BLT would be tasked with recognizing that the batch contained BLTs which can be combined.

The fantastic thing about this approach is that an implementation without the ability to blend N sources in one pass would perform the blends separately, but the result would be identical. Moreover, implementations with the ability to combine different numbers of operations would likewise produce the same results, even if they used a different number of internal steps. Here's an example:

NOTE: As mentioned above a batch of BLTs may be serviced in any number of ways. In this example, the destination buffer may be used for intermediate results, so it is important that this buffer not be used during the batch--i.e. as a displayed buffer.
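The idea that an N-source blend is a chain of N - 1 two-source blends can be sketched on a single pixel in plain C. This illustrates the arithmetic only, not how any implementation works: it assumes premultiplied alpha, floating point components in the range 0 to 1, and the Porter-Duff "over" operator. The names struct pixel, src_over() and blend_layers() are hypothetical.

```c
/* Hypothetical pixel type: premultiplied alpha, components in [0, 1]. */
struct pixel {
    float r, g, b, a;
};

/* Porter-Duff "over" for premultiplied pixels: top over bottom. */
static struct pixel src_over(struct pixel top, struct pixel bottom)
{
    float k = 1.0f - top.a;  /* how much of the bottom layer shows through */
    struct pixel out = {
        top.r + bottom.r * k,
        top.g + bottom.g * k,
        top.b + bottom.b * k,
        top.a + bottom.a * k
    };
    return out;
}

/* Blend n layers, bottom-most first, using n - 1 two-source blends.
 * Each step reuses the running result as the lower layer, just as the
 * batch reuses the destination as a source in the later BLTs. */
static struct pixel blend_layers(const struct pixel *layers, int n)
{
    struct pixel dst = layers[0];
    for (int i = 1; i < n; i++)
        dst = src_over(layers[i], dst);
    return dst;
}
```

Whether the steps run as separate passes or are fused into one, the per-pixel result is the same chain of two-source blends.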

1. Clients begin by opening one or more BLTsville implementations dynamically. The specific method of doing this is dependent on the operating system. For example, Linux might do this like this:

#include <dlfcn.h>
#include <bltsville.h>

struct bltsvillelib
{
    const char* name;
    void* handle;
    BVFN_MAP bv_map;
    BVFN_BLT bv_blt;
    BVFN_UNMAP bv_unmap;
};

struct bltsvillelib bvlib[] =
{
    { "libbltsville_cpu.so", 0 },
    { "libbltsville_2d.so", 0 }
};
const int NUMBVLIBS = sizeof(bvlib) / sizeof(struct bltsvillelib);

for(int i = 0; i < NUMBVLIBS; i++)
{
    bvlib[i].handle = dlopen(bvlib[i].name, RTLD_LOCAL | RTLD_LAZY);
    if(!bvlib[i].handle)
        continue; /* library not available; its function pointers stay 0 */

    /* Import the entry points by name. */
    bvlib[i].bv_map = (BVFN_MAP)dlsym(bvlib[i].handle, "bv_map");
    bvlib[i].bv_blt = (BVFN_BLT)dlsym(bvlib[i].handle, "bv_blt");
    bvlib[i].bv_unmap = (BVFN_UNMAP)dlsym(bvlib[i].handle, "bv_unmap");
}

2. Clients then need to create a bvbuffdesc object for each buffer to be accessed in BLTsville:

Note that the client must ensure that the map element and any additional members in bvbuffdesc are initialized to 0.
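One way to meet this requirement is to clear the whole structure before filling in the members the client owns. The sketch below is an assumption-laden illustration: struct bvbuffdesc_sketch is a stand-in that only borrows the member names used in this document (the real layout comes from bltsville.h), auxtype is shown as a plain int, and init_buffdesc() is a hypothetical helper.

```c
#include <string.h>

struct bvbuffmap;  /* opaque to clients */

/* Stand-in for bvbuffdesc; the real definition comes from bltsville.h. */
struct bvbuffdesc_sketch {
    unsigned int structsize;
    void *virtaddr;
    unsigned long length;
    struct bvbuffmap *map;  /* owned by the implementations */
    int auxtype;            /* shown as int here; an assumption */
    void *auxptr;
};

/* Hypothetical helper: describe a buffer of length bytes starting at addr. */
static void init_buffdesc(struct bvbuffdesc_sketch *d, void *addr,
                          unsigned long length)
{
    memset(d, 0, sizeof(*d));    /* map, auxtype and auxptr all start at 0 */
    d->structsize = sizeof(*d);  /* lets implementations detect the version */
    d->virtaddr = addr;
    d->length = length;
}
```

Clearing the structure first also covers members added by later header versions, as long as the client is recompiled against those headers.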

3. Next the buffer can be mapped to give the hardware implementations a chance to associate any necessary resources with the buffer:

4. Next the client must create bvsurfgeom objects for each way in which a buffer will be accessed. Often, there is only one way in which a buffer is accessed, so there will be the same number of buffers, bvbuffdesc, and bvsurfgeom objects. If that's the case, it may be convenient for the client to combine them into a parent structure. It may even be possible to share a single bvsurfgeom structure among buffers. Or there will be times when it is necessary to treat a buffer in different ways for different BLTs. Having these two structures separated allows all of these combinations.

Note that the client must ensure that any additional members in bvsurfgeom are initialized to 0 for future compatibility.

5. Now the client is ready to fill in a bvbltparams structure to specify the type of BLT requested. Here is an example of a simple copy from the lower right corner of a surface to the upper left:

struct bvbltparams bltparams = {sizeof(struct bvbltparams), 0};

bltparams.flags = BVFLAG_ROP;
bltparams.op.rop = 0xCCCC; /* SRCCOPY */
bltparams.dstdesc = &buff;
bltparams.dstgeom = &geom;
bltparams.dstrect.left = 0;
bltparams.dstrect.top = 0;
bltparams.dstrect.width = width / 2;
bltparams.dstrect.height = height / 2;
bltparams.src1.desc = &buff;
bltparams.src1geom = &geom;
bltparams.src1rect.left = width / 2;
bltparams.src1rect.top = height / 2;
bltparams.src1rect.width = width / 2;
bltparams.src1rect.height = height / 2;

If the implementation cannot complete the requested BLT, it returns a bverror indicating the issue.

The kernel mode interface differs only slightly from the user mode interface. Currently there are two differences in the general kernel interface, and one in the Linux/Android interface:

The method of describing the buffer using physical addresses is not exposed in user mode for security reasons.

struct bvphysdesc {
        unsigned int structsize;
        unsigned long pagesize;
        unsigned long *pagearray;
        unsigned int pagecount;
        unsigned long pageoffset;
};

This member is used for compatibility between BLTsville versions. (See bvbltparams.structsize for an explanation.)

This member indicates the size of the physical pages containing the buffer. BVAT_PHYSDESC/bvphysdesc does not support buffers which reside in pages that are not all the same size. bvphysdesc.pagesize is used to indicate the length of the pages in the bvphysdesc.pagearray as well as the expected alignment of those pages. If this value is 0, the default page size of the system is assumed.

NOTE: When used with physically contiguous buffers, this member should be set to the length of the buffer, which is the same as the value in bvbuffdesc.length.

This member is an array of unsigned longs holding the physical addresses of the pages holding the buffer. The array contains pagecount entries. The specific format of the physical addresses is O/S dependent. However, BVAT_PHYSDESC/bvphysdesc only supports 32-bit physical addresses.

Addresses in this array must be aligned on bvphysdesc.pagesize boundaries. Use the bvphysdesc.pageoffset member to indicate the offset from the start of the first page to the beginning of the buffer.

NOTE: When used with physically contiguous buffers, the first (only) address in this array should be aligned on the system default page boundary, and the bvphysdesc.pageoffset member should be used to indicate the offset from that address to the beginning of the buffer.

This member indicates the number of pages in the array pointed to by bvphysdesc.pagearray.

NOTE: When used with physically contiguous buffers, this member should be set to 1.

This member indicates the number of bytes from the start of the first page (*pagearray) to the start of the buffer. The value must be less than bvphysdesc.pagesize.
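The notes above on physically contiguous buffers can be combined into one sketch. It is an illustration under stated assumptions: the struct is a local copy of the layout shown above, describe_contiguous() is a hypothetical helper, and the system page size is assumed to be a power of two so that the address can be split with a bit mask.

```c
/* Local copy of the bvphysdesc layout, so the sketch is self-contained. */
struct bvphysdesc {
    unsigned int structsize;
    unsigned long pagesize;
    unsigned long *pagearray;
    unsigned int pagecount;
    unsigned long pageoffset;
};

/* Hypothetical helper: describe a physically contiguous buffer.
 * storage must point to room for one address; syspagesize is the system
 * default page size and is assumed to be a power of two. */
static void describe_contiguous(struct bvphysdesc *pd, unsigned long *storage,
                                unsigned long physaddr, unsigned long length,
                                unsigned long syspagesize)
{
    unsigned long mask = syspagesize - 1;

    storage[0] = physaddr & ~mask;     /* aligned on a system page boundary */

    pd->structsize = sizeof(*pd);
    pd->pagesize = length;             /* one "page" spanning the buffer */
    pd->pagearray = storage;
    pd->pagecount = 1;                 /* contiguous: a single entry */
    pd->pageoffset = physaddr & mask;  /* offset from that page boundary */
}
```

For a buffer at physical address 0x80001234 on a system with 4096-byte pages, the single array entry becomes 0x80001000 and pageoffset becomes 0x234.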

Kernel mode entry cannot be the same as the user mode. The specific method of accessing the kernel interface is O/S specific. However, the following interface is currently defined for the specified O/Ss:

Although the linked list used in the bvbuffmap structure is not complicated, there may be a requirement to use the standard Linux/Android kernel linked list in that environment. To facilitate this, the bvbuffmap.map entry is replaced by the following entry for Linux/Android kernel mode only:

This member is used to reference the containing linked list for the bvbuffmap structures associated with the buffer.

	2.0 Client	2.0 Client (w/2.1 Headers)	2.1 Client
2.0 Implementation	compatible	New function and structure definitions have no effect.	Client must deal with lack of bv_cache().
2.0 Implementation (w/2.1 Headers)	New function and structure definitions have no effect.	New function and structure definitions have no effect.	Client must deal with lack of bv_cache().
2.1 Implementation	New function and structures have no effect.	New function and structures have no effect.	compatible

	2.0 Client	2.0 Client (w/2.1 Headers)	2.1 Client
2.0 Implementation	compatible	Client must clear bvbuffdesc using sizeof(bvbuffdesc).	Implementation must handle bvbuffdesc.structsize > sizeof(bvbuffdesc).
2.0 Implementation (w/2.1 Headers)	Implementation must handle bvbuffdesc.structsize < sizeof(bvbuffdesc).	Client must clear bvbuffdesc using sizeof(bvbuffdesc).	Client must deal with implementation that uses bvbuffdesc.virtaddr or returns error if bvbuffdesc.virtaddr is 0.
2.1 Implementation	Implementation must handle bvbuffdesc.structsize < sizeof(bvbuffdesc).	Client must clear bvbuffdesc using sizeof(bvbuffdesc).	compatible

Implementation	Function	Operation
A	bv_blt()	map A BLT A unmap A
A	bv_blt()	map A BLT A unmap A
B	bv_blt()	map B BLT B unmap B

Implementation	Function	Operation
A	bv_map()	map A
A	bv_blt()	BLT A
A	bv_blt()	BLT A
A	bv_unmap()	unmap A

Implementation	Function	Operation
A	bv_map()	map A
B	bv_map()	map B
A	bv_blt()	BLT A
B	bv_blt()	BLT B
A	bv_unmap()	unmap A
B	bv_unmap()	unmap B

Mask	1	1	1	1	1	1	1	1	0	0	0	0	0	0	0	0
Source 2	1	1	1	1	0	0	0	0	1	1	1	1	0	0	0	0
Source 1	1	1	0	0	1	1	0	0	1	1	0	0	1	1	0	0
Destination	1	0	1	0	1	0	1	0	1	0	1	0	1	0	1	0
Raster Operation	15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0

ROP	Constant	Description
BLACKNESS	0x0000	Set all destination bits to black (0). Dest = 0
NOTSRCERASE	0x1111	Dest = ~Src1 & ~Dest = ~(Src1 | Dest)
NOTSRCCOPY	0x3333	Dest = ~Src1
SRCERASE	0x4444	Dest = Src1 & ~Dest
DSTINVERT	0x5555	Invert (NOT) the destination bits. Dest = ~Dest
PATINVERT	0x5A5A	XOR with Src2. Dest = Src2 ^ Dest
SRCINVERT	0x6666	XOR with Src1. Dest = Src1 ^ Dest
SRCAND	0x8888	Dest = Src1 & Dest
NOP	0xAAAA	Dest = Dest
MERGEPAINT	0xBBBB	Dest = ~Src1 | Dest
MERGECOPY	0xC0C0	Dest = Src1 & Src2
SRCCOPY	0xCCCC	Dest = Src1
SRCPAINT	0xEEEE	OR with Src1. Dest = Src1 | Dest
PATCOPY	0xF0F0	Copy source 2 to destination. Dest = Src2
PATPAINT	0xFBFB	Dest = ~Src1 | Src2 | Dest
WHITENESS	0xFFFF	Set all destination bits to white (1). Dest = 1
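The two tables above can be read together as a lookup: for each bit position, the mask, source 2, source 1, and destination bits form a 4-bit index, and the bit of the 16-bit ROP constant at that index is the result. The sketch below applies a ROP constant to 32-bit values in exactly that way. It is a reference-style illustration, not how an implementation would do it efficiently, and apply_rop() is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper: apply a 16-bit raster operation bit by bit.
 * The truth table index is mask*8 + src2*4 + src1*2 + dst, matching the
 * row order of the table above. */
static uint32_t apply_rop(uint16_t rop, uint32_t mask, uint32_t src2,
                          uint32_t src1, uint32_t dst)
{
    uint32_t out = 0;

    for (int bit = 0; bit < 32; bit++) {
        unsigned int index = (((mask >> bit) & 1u) << 3) |
                             (((src2 >> bit) & 1u) << 2) |
                             (((src1 >> bit) & 1u) << 1) |
                             ((dst >> bit) & 1u);

        /* The ROP constant holds the result bit for this combination. */
        out |= (uint32_t)((rop >> index) & 1u) << bit;
    }
    return out;
}
```

For example, apply_rop(0xCCCC, ...) returns src1 unchanged (SRCCOPY), and apply_rop(0x5A5A, ...) returns src2 ^ dst (PATINVERT), whatever the mask value.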

BVSCALE_FASTEST	The fastest method of scaling available is used. This may include nearest neighbor. The value of this enumeration is purposely 0, and is the default scale type. No implementation will return an error for this setting.
BVSCALE_FASTEST_NOT_NEAREST_NEIGHBOR	The fastest method of scaling available that is not nearest neighbor is used. This may include an alternative point sample technique.
BVSCALE_FASTEST_POINT_SAMPLE	The fastest method of scaling using a point sample technique.
BVSCALE_FASTEST_INTERPOLATED	The fastest method of scaling using an interpolation technique.
BVSCALE_FASTEST_PHOTO	The fastest method of scaling appropriate for photographs is used. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_FASTEST_DRAWING	The fastest method of scaling appropriate for drawings is used. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_GOOD	A scaling technique is chosen that may be higher quality than the BVSCALE_FASTEST choice. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_GOOD_POINT_SAMPLE	A point sample scaling technique is chosen that may be higher quality than the BVSCALE_FASTEST_POINT_SAMPLE choice. This may include nearest neighbor.
BVSCALE_GOOD_INTERPOLATED	An interpolated scaling technique is chosen that may be higher quality than the BVSCALE_FASTEST_INTERPOLATED choice.
BVSCALE_GOOD_PHOTO	A scaling technique appropriate for photographs is chosen that may be higher quality than the BVSCALE_FASTEST_PHOTO choice. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_GOOD_DRAWING	A scaling technique appropriate for drawings is chosen that may be higher quality than the BVSCALE_FASTEST_DRAWING choice. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_BETTER	A scaling technique is chosen that may be higher quality than the BVSCALE_GOOD choice. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_BETTER_POINT_SAMPLE	A point sample scaling technique is chosen that may be higher quality than the BVSCALE_GOOD_POINT_SAMPLE choice. This may include nearest neighbor.
BVSCALE_BETTER_INTERPOLATED	An interpolated scaling technique is chosen that may be higher quality than the BVSCALE_GOOD_INTERPOLATED choice.
BVSCALE_BETTER_PHOTO	A scaling technique appropriate for photographs is chosen that may be higher quality than the BVSCALE_GOOD_PHOTO choice. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_BETTER_DRAWING	A scaling technique appropriate for drawings is chosen that may be higher quality than the BVSCALE_GOOD_DRAWING choice. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_BEST	The highest quality scaling technique is chosen. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_BEST_POINT_SAMPLE	The highest quality point sample technique is chosen.
BVSCALE_BEST_INTERPOLATED	The highest quality interpolated scaling technique is chosen.
BVSCALE_BEST_PHOTO	The highest quality scaling technique appropriate for photographs is chosen. This may include nearest neighbor. No implementation will return an error for this setting.
BVSCALE_BEST_DRAWING	The highest quality scaling technique appropriate for drawings is chosen. This may include nearest neighbor. No implementation will return an error for this setting.

BVSCALE_NEAREST_NEIGHBOR	This is a point sample scaling technique where the resampled destination pixel is set to the value of the closest source pixel.
BVSCALE_BILINEAR	This is an interpolated scaling technique where the resampled destination pixel is set to a value linearly interpolated in two dimensions from the four closest source pixels.
BVSCALE_BICUBIC	This is an interpolated scaling technique where the resampled destination pixel is set to a value calculated using cubic interpolation in two dimensions.
BVSCALE_3x3_TAP
BVSCALE_5x5_TAP
BVSCALE_7x7_TAP
BVSCALE_9x9_TAP

BVDITHER_FASTEST	The fastest method of dithering available is used. This may include no dithering (truncation). The value of this enumeration is purposely 0, and is the default dither type. No implementation will return an error for this setting.
BVDITHER_FASTEST_ON	The fastest method of dithering available is used. This will not include no dithering.
BVDITHER_FASTEST_RANDOM	The fastest method of dithering using a random technique.
BVDITHER_FASTEST_ORDERED	The fastest method of dithering using an ordered technique.
BVDITHER_FASTEST_DIFFUSED	The fastest method of dithering using an error diffusion technique.
BVDITHER_FASTEST_PHOTO	The fastest method of dithering appropriate for photographs is used. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_FASTEST_DRAWING	The fastest method of dithering appropriate for drawings is used. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_GOOD	A dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST choice. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_GOOD_ON	Any dithering technique available is used. This will not include no dithering. This may be higher quality than BVDITHER_FASTEST_ON.
BVDITHER_GOOD_RANDOM	A random dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST_RANDOM choice.
BVDITHER_GOOD_ORDERED	An ordered dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST_ORDERED choice.
BVDITHER_GOOD_DIFFUSED	A diffused dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST_DIFFUSED choice.
BVDITHER_GOOD_PHOTO	A dithering technique appropriate for photographs is chosen that may be higher quality than the BVDITHER_FASTEST_PHOTO choice. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_GOOD_DRAWING	A dithering technique appropriate for drawings is chosen that may be higher quality than the BVDITHER_FASTEST_DRAWING choice. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_BETTER	A dithering technique is chosen that may be higher quality than the BVDITHER_GOOD choice. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_BETTER_ON	Any dithering technique available is used. This will not include no dithering. This may be higher quality than BVDITHER_GOOD_ON.
BVDITHER_BETTER_RANDOM	A random dithering technique is chosen that may be higher quality than the BVDITHER_GOOD_RANDOM choice.
BVDITHER_BETTER_ORDERED	An ordered dithering technique is chosen that may be higher quality than the BVDITHER_GOOD_ORDERED choice.
BVDITHER_BETTER_DIFFUSED	A diffused dithering technique is chosen that may be higher quality than the BVDITHER_GOOD_DIFFUSED choice.
BVDITHER_BETTER_PHOTO	A dithering technique appropriate for photographs is chosen that may be higher quality than the BVDITHER_GOOD_PHOTO choice. No implementation will return an error for this setting.
BVDITHER_BETTER_DRAWING	A dithering technique appropriate for drawings is chosen that may be higher quality than the BVDITHER_GOOD_DRAWING choice. No implementation will return an error for this setting.
BVDITHER_BEST	The highest quality dithering technique is chosen. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_BEST_ON	Any dithering technique available is used. This will not include no dithering. This may be higher quality than BVDITHER_BETTER_ON.
BVDITHER_BEST_RANDOM	The highest quality random dithering technique is chosen.
BVDITHER_BEST_ORDERED	The highest quality ordered dithering technique is chosen.
BVDITHER_BEST_DIFFUSED	The highest quality diffused dithering technique is chosen.
BVDITHER_BEST_PHOTO	The highest quality dithering technique appropriate for photographs is chosen. This may include no dithering. No implementation will return an error for this setting.
BVDITHER_BEST_DRAWING	The highest quality dithering technique appropriate for drawings is chosen. This may include no dithering. No implementation will return an error for this setting.

BVDITHER_NONE	No dithering is performed. Internal pixel component values are truncated to the destination component bit depth.
BVDITHER_ORDERED_2x2
BVDITHER_ORDERED_4x4
BVDITHER_ORDERED_2x2_4x4	2x2 ordered dither is used for components with the lowest bit reduction. 4x4 ordered dither is used for the components with the highest bit reduction. (E.g. RGB24 to RGB565 will use 2x2 ordered dither for the green component and 4x4 ordered dither for the red and blue components.)
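
The mixed 2x2/4x4 case can be sketched as follows. This is an illustration only: the Bayer threshold matrices, the `demo_` function names, and the rounding scheme are assumptions, and a real implementation may use different matrices or arithmetic.

```c
#include <assert.h>

/* Commonly used Bayer rank matrices (an assumption for this sketch). */
static const unsigned char demo_bayer2[2][2] = {
    { 0, 2 },
    { 3, 1 },
};
static const unsigned char demo_bayer4[4][4] = {
    {  0,  8,  2, 10 },
    { 12,  4, 14,  6 },
    {  3, 11,  1,  9 },
    { 15,  7, 13,  5 },
};

/*
 * Reduce an 8-bit component to `bits` bits (1..7) at pixel (x, y).
 * `n` selects the matrix size: 2 for 2x2, 4 for 4x4.
 */
unsigned demo_dither_component(unsigned value, int bits, int n, int x, int y)
{
    assert(value <= 255 && bits >= 1 && bits <= 7 && (n == 2 || n == 4));
    int drop = 8 - bits;                 /* number of low bits discarded */
    unsigned step = 1u << drop;          /* size of one quantization step */
    unsigned max_out = (1u << bits) - 1;
    unsigned rank = (n == 2) ? demo_bayer2[y & 1][x & 1]
                             : demo_bayer4[y & 3][x & 3];
    /* Scale the rank (0 .. n*n-1) to a threshold within one step. */
    unsigned threshold = (rank * step) / (unsigned)(n * n);
    unsigned out = (value + threshold) >> drop;
    return out > max_out ? max_out : out;  /* clamp overflow at white */
}

/* RGB24 to RGB565: 2x2 for green (2 bits lost), 4x4 for red/blue (3 lost). */
unsigned short demo_rgb24_to_rgb565(unsigned r, unsigned g, unsigned b,
                                    int x, int y)
{
    unsigned r5 = demo_dither_component(r, 5, 4, x, y);
    unsigned g6 = demo_dither_component(g, 6, 2, x, y);
    unsigned b5 = demo_dither_component(b, 5, 4, x, y);
    return (unsigned short)((r5 << 11) | (g6 << 5) | b5);
}
```

With this scheme, a red value that sits halfway between two 5-bit levels is rounded up on half of the pixels in each 4x4 cell and down on the other half, so the average over the cell approximates the original value.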

src1rect	dstrect
(0, 0) - 640 x 480	(240, 0) - 1440 x 1080

src1rect	dstrect
(0, 0) - 640 x 133.333...	(240, 0) - 1440 x 300
(0, 133.333...) - 177.777... x 213.333...	(240, 300) - 400 x 480
(462.222..., 133.333...) - 177.777... x 213.333...	(1280, 300) - 400 x 480
(0, 346.666...) - 640 x 133.333...	(240, 780) - 1440 x 300

src1rect	dstrect	cliprect
(0, 0) - 640 x 480	(240, 0) - 1440 x 1080	(240, 0) - 1440 x 300
(0, 0) - 640 x 480	(240, 0) - 1440 x 1080	(240, 300) - 400 x 480
(0, 0) - 640 x 480	(240, 0) - 1440 x 1080	(1280, 300) - 400 x 480
(0, 0) - 640 x 480	(240, 0) - 1440 x 1080	(240, 780) - 1440 x 300
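
The relationship between a cliprect and the source region it effectively selects can be checked with a short calculation. The sketch below uses hypothetical `demo_irect` and `demo_frect` types (they are not BLTsville structures) and assumes a simple linear mapping between the source and destination rectangles.

```c
/* Hypothetical rectangle types for this sketch only. */
struct demo_irect { int left, top, width, height; };
struct demo_frect { double left, top, width, height; };

/*
 * Given the full source and destination rectangles of a scaling BLT,
 * return the (possibly fractional) source region that lands inside `clip`.
 */
struct demo_frect demo_src_for_clip(struct demo_irect src,
                                    struct demo_irect dst,
                                    struct demo_irect clip)
{
    /* Source pixels per destination pixel in each direction. */
    double sx = (double)src.width / (double)dst.width;
    double sy = (double)src.height / (double)dst.height;
    struct demo_frect r;

    r.left   = src.left + (clip.left - dst.left) * sx;
    r.top    = src.top + (clip.top - dst.top) * sy;
    r.width  = clip.width * sx;
    r.height = clip.height * sy;
    return r;
}
```

For the first cliprect above, (240, 0) - 1440 x 300, this yields a source region of (0, 0) - 640 x 133.333..., which cannot be expressed with whole-pixel coordinates. Keeping the source and destination rectangles unchanged and varying only the cliprect avoids having to round such values.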

BVBATCH_OP	indicates that the operation type (BVFLAG_ROP, BVFLAG_BLEND, BVFLAG_FILTER, etc.) has changed.
BVBATCH_KEY	indicates that the bvbltparams.colorkey or the color key mode (BVFLAG_KEY_SRC/BVFLAG_KEY_DST) has changed.
BVBATCH_MISCFLAGS	indicates that bvbltparams.flags other than the operation, color key, or clip flag have changed.
BVBATCH_ALPHA	indicates that bvbltparams.globalalpha or global alpha type has changed.
BVBATCH_DITHER	indicates that bvbltparams.dithermode has changed.
BVBATCH_SCALE	indicates that bvbltparams.scalemode has changed.
BVBATCH_DST	indicates that the destination surface (bvbltparams.dstdesc, bvbltparams.dstgeom, or bvbltparams.dstrect) has changed.
BVBATCH_SRC1	indicates that the source 1 surface (bvbltparams.src1.desc, bvbltparams.src1.tileparams, or bvbltparams.src1geom) has changed.
BVBATCH_SRC2	indicates that the source 2 surface (bvbltparams.src2.desc, bvbltparams.src2.tileparams, or bvbltparams.src2geom) has changed.
BVBATCH_MASK	indicates that the mask surface (bvbltparams.mask.desc, bvbltparams.mask.tileparams, or bvbltparams.maskgeom) has changed.
BVBATCH_DSTRECT_ORIGIN	indicates that bvbltparams.dstrect.left or top has changed.
BVBATCH_DSTRECT_SIZE	indicates that the bvbltparams.dstrect.width or height has changed.
BVBATCH_SRC1RECT_ORIGIN	indicates that bvbltparams.src1rect.left or top has changed.
BVBATCH_SRC1RECT_SIZE	indicates that the bvbltparams.src1rect.width or height has changed.
BVBATCH_SRC2RECT_ORIGIN	indicates that bvbltparams.src2rect.left or top has changed.
BVBATCH_SRC2RECT_SIZE	indicates that the bvbltparams.src2rect.width or height has changed.
BVBATCH_MASKRECT_ORIGIN	indicates that bvbltparams.maskrect.left or top has changed.
BVBATCH_MASKRECT_SIZE	indicates that the bvbltparams.maskrect.width or height has changed.
BVBATCH_CLIPRECT_ORIGIN	indicates that bvbltparams.cliprect.left or top has changed.
BVBATCH_CLIPRECT_SIZE	indicates that the bvbltparams.cliprect.width or height has changed.
BVBATCH_TILE_SRC1	indicates that the bvbltparams.src1.tileparams has changed.
BVBATCH_TILE_SRC2	indicates that the bvbltparams.src2.tileparams has changed.
BVBATCH_TILE_MASK	indicates that the bvbltparams.mask.tileparams has changed.
BVBATCH_ENDNOP	is a special flag used with BVFLAG_BATCH_END, for clients that do not have information that a batch is ending until after the last BLT has been issued. When this flag is set, no BLT is done, but the batch is ended.
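
One way a client might derive the batch flags is to compare the parameters of the previous BLT in the batch with those of the next one. The sketch below illustrates the idea with a deliberately reduced, hypothetical `demo_params` structure and made-up `DEMO_BATCH_*` bit values; the real flag values and the full bvbltparams layout come from the BLTsville headers.

```c
/* Stand-in bit values; NOT the real BVBATCH_* values. */
enum {
    DEMO_BATCH_DITHER         = 1 << 0,
    DEMO_BATCH_SCALE          = 1 << 1,
    DEMO_BATCH_DSTRECT_ORIGIN = 1 << 2,
    DEMO_BATCH_DSTRECT_SIZE   = 1 << 3
};

struct demo_rect { int left, top, width, height; };

/* Reduced stand-in for the BLT parameters (hypothetical). */
struct demo_params {
    int dithermode;
    int scalemode;
    struct demo_rect dstrect;
};

/* Return a bit for each parameter group that differs between two BLTs. */
unsigned demo_batch_delta(const struct demo_params *prev,
                          const struct demo_params *cur)
{
    unsigned flags = 0;

    if (prev->dithermode != cur->dithermode)
        flags |= DEMO_BATCH_DITHER;
    if (prev->scalemode != cur->scalemode)
        flags |= DEMO_BATCH_SCALE;
    if (prev->dstrect.left != cur->dstrect.left ||
        prev->dstrect.top != cur->dstrect.top)
        flags |= DEMO_BATCH_DSTRECT_ORIGIN;
    if (prev->dstrect.width != cur->dstrect.width ||
        prev->dstrect.height != cur->dstrect.height)
        flags |= DEMO_BATCH_DSTRECT_SIZE;
    return flags;
}
```

Splitting origin and size into separate bits, as the flags above do, lets an implementation skip recomputing size-dependent state (such as scale factors) when a rectangle merely moves.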

BVCACHE_BIDIRECTIONAL	(This usually performs a cache flush operation.)
BVCACHE_CPU_TO_DEVICE	Performs the appropriate cache operation to ensure data can be transferred correctly when it was written with the CPU, but will be read by the 2-D device. (This is usually a cache clean operation.)
BVCACHE_CPU_FROM_DEVICE	Performs the appropriate cache operation to ensure data can be transferred correctly when it was written by the 2-D device, but will be read by the CPU. (This is usually a cache invalidate operation.)

BVTILE_LEFT_REPEAT	indicates that the tile is repeated to the left of the destination alignment location.
BVTILE_TOP_REPEAT	indicates that the tile is repeated above the destination alignment location.
BVTILE_RIGHT_REPEAT	indicates that the tile is repeated to the right of the destination alignment location.
BVTILE_BOTTOM_REPEAT	indicates that the tile is repeated below the destination alignment location.
BVTILE_LEFT_MIRROR	indicates that the tile is mirrored to the left of the destination alignment location.
BVTILE_TOP_MIRROR	indicates that the tile is mirrored above the destination alignment location.
BVTILE_RIGHT_MIRROR	indicates that the tile is mirrored to the right of the destination alignment location.
BVTILE_BOTTOM_MIRROR	indicates that the tile is mirrored below the destination alignment location.
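
The difference between repeating and mirroring can be illustrated in one dimension. The sketch below assumes that "mirrored" means every other copy of the tile is flipped, so that adjacent copies meet at matching edges; that interpretation, and the `demo_tile_coord` helper itself, are assumptions rather than part of the BLTsville API.

```c
#include <assert.h>

/*
 * Map a signed pixel offset from the alignment location to a coordinate
 * inside a tile that is `size` pixels wide (or tall).
 * mirror == 0: the tile repeats unchanged.
 * mirror != 0: every other copy of the tile is flipped.
 */
int demo_tile_coord(int offset, int size, int mirror)
{
    assert(size > 0);
    int period = mirror ? 2 * size : size;
    int m = offset % period;

    if (m < 0)
        m += period;            /* C's % may yield a negative remainder */
    if (mirror && m >= size)
        m = period - 1 - m;     /* second half of the period runs backward */
    return m;
}
```

With a 4-pixel tile, offsets -4 through -1 map to 0, 1, 2, 3 when repeating, and to 3, 2, 1, 0 when mirroring.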

BVERR_OP_FAILED	The operation failed for unspecified reasons. The destination buffer was not modified.
BVERR_OP_INCOMPLETE	The operation only partially completed. The destination buffer is in an undefined state.
BVERR_MEMORY_ERROR	The operation resulted in a memory error, most likely due to an attempt to access invalid memory. The destination buffer is in an undefined state.

a.	This step is optional, as indicated above. However, if the client does not explicitly call bv_map(), the implementation must still perform the mapping to associate the necessary resources with the buffer, so the mapping is deferred until bv_blt() is called. Additionally, since the client did not call bv_map(), it is unlikely that the client will call bv_unmap() to allow the implementation to free the resources associated with the buffer, so the implementation will internally unmap the resources after completing the BLT. This means that the mapping and unmapping overhead will be incurred on every call to bv_blt(). In general, the CPU implementations have (almost) no overhead associated with mapping and unmapping, so opting not to call bv_map() for CPU implementations is likely to make a negligible difference in bv_blt() performance.
b.	Calling bv_map() once for each buffer is enough to tell the implementations that the client can be trusted to call bv_unmap() when work with the buffer is complete, as indicated above. It does not matter which implementation's bv_map() is called. However, that implementation is the only one which will perform the mapping immediately. All other implementations will perform a lazy mapping only when their bv_blt() call is invoked. This allows the client to avoid the overhead of mapping and unmapping the buffers on each bv_blt() call. It also avoids the associated mapping and unmapping overhead if a given implementation is never used. As mentioned above, the CPU implementations have (almost) no overhead associated with mapping and unmapping, so they are a good choice to use for the call to bv_map().
c.	If the client wants direct control over the mapping and unmapping overhead, it can call the bv_map() function of each implementation, as indicated above. Each implementation will perform the mapping at that time, so that the overhead will not appear on subsequent calls to bv_blt().
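
The cost difference between options (a) and (c) can be modeled with a small simulation. Everything here is hypothetical: `demo_buffer`, `demo_map()`, `demo_unmap()`, and `demo_blt()` are stand-ins that only count mapping operations for a single implementation, and they do not reproduce the real bv_map(), bv_unmap(), or bv_blt() signatures.

```c
/* Hypothetical stand-in for a buffer and its mapping state. */
struct demo_buffer {
    int client_mapped;   /* client called demo_map() and will call demo_unmap() */
    int resources_held;  /* implementation currently has resources attached */
    int map_ops;         /* number of times the costly mapping work ran */
};

/* Attach resources if they are not already attached. */
static void demo_attach(struct demo_buffer *b)
{
    if (!b->resources_held) {
        b->resources_held = 1;
        b->map_ops++;
    }
}

/* Explicit map: the implementation maps now and keeps the mapping. */
void demo_map(struct demo_buffer *b)
{
    b->client_mapped = 1;
    demo_attach(b);
}

/* Explicit unmap: the implementation releases its resources. */
void demo_unmap(struct demo_buffer *b)
{
    b->client_mapped = 0;
    b->resources_held = 0;
}

/* BLT: map on demand, and unmap again if the client never mapped. */
void demo_blt(struct demo_buffer *b)
{
    demo_attach(b);                 /* no cost if already mapped */
    /* ... the BLT itself would run here ... */
    if (!b->client_mapped)
        b->resources_held = 0;      /* implicit unmap after the BLT */
}
```

In this model, three calls to demo_blt() on a buffer that was never mapped perform the mapping work three times, while the same three calls bracketed by demo_map() and demo_unmap() perform it once.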

bvbuffdesc.auxtype	bvbuffdesc.auxptr type	Notes
BVAT_PHYSDESC	bvphysdesc	Used to specify the physical pages of a physically discontiguous buffer constructed using a single page size. This may be used with physically contiguous buffers as well, but BVAT_PHYSADDR is preferred.
BVAT_PHYSADDR	physical address	Used to specify the starting physical address of a physically contiguous buffer.