Version 2.2

BLTsville is the open 2-D API designed to provide an abstract interface for both hardware and software 2-D implementations.

BLTs (BLock Transfers) involve the moving around of blocks (rectangles) of pixels.  BLTsville is the place to go for BLTs.


License

The API is designed and maintained by Texas Instruments, Inc., but anyone is free to use it with no cost or obligation.

This project is licensed under the Creative Commons Attribution-NoDerivs 3.0 Unported License (user mode), and the GNU General Public License version 2 (kernel mode).


Dependencies

This project depends on the Open Color format Definitions (OCD) project.


Source

Get the source code (headers) from GitHub at github.com/graphics/bltsville, or download the project in zip or tar format.

You can also clone the project with Git by running:

$ git clone git://github.com/graphics/bltsville

Wiki
https://github.com/graphics/bltsville/wiki

Points of Interest in BLTsville

  • Solid fills
  • Pattern fills
  • Copies
  • Color format conversion
    • Extensive color format support
      • RGB, BGR
      • RGBA, ARGB, etc.
      • YCbCr (YUV)
        • subsampling
        • packed
        • planar
      • Monochrome
      • Alpha-only
      • Look-Up Table (LUT)
    • Extensible color format
  • ROP4
    • Three inputs
  • Blends
    • Pre-defined Porter-Duff blends
    • Pre-defined DirectFB support
    • Extensible blends
  • Multiple
  • Filters
    • Extensible filters
  • Independent horizontal and vertical flipping
  • Independent scaling of all three inputs
  • Clipping
  • Independent rotation of all three inputs (multiples of 90 degrees)
  • Choice of scaling type
    • Quality based choice
    • Speed based choice
    • Image type based choice
    • Specific scale type choice
    • Extensible scale type
  • Synchronous operations
  • Asynchronous operations
    • Client notification of BLT completion
  • Batching
    • Combine multiple BLTs into group that can be handled more efficiently by implementations
      • Character BLTs
      • Multi-layer blending
      • ROP/Blend combination with specified ordering
      • etc.
    • Delta BLTs
  • Dithering
    • Quality based choice
    • Speed based choice
    • Image type based choice
    • Specific dither type choice
    • Extensible dither type
  • Any implementation support
    • CPU
    • 2-D Accelerator


How to Get to BLTsville

BLTsville's API is defined in the BLTsville header files.  A client must include bltsville.h to access the implementations.  This header includes the remaining headers (including ocd.h).

NOTE:  The bvinternal.h header is for implementations only and should not be used by clients.

BLTsville has both user mode and kernel mode interfaces.  The kernel mode interface is quite similar to (and compatible with) the user mode interface, but due to the minor differences and license issues, there are two different sets of header files.


History of BLTsville


Versions 1.x

BLTsville was based on a previous closed interface, which had a few implementations and shipped on a few devices.  That interface represented the 1.x versions.  A lot was learned from that work, and these lessons were used in the founding of BLTsville.

Version 2.0

This was the initial release of the user mode interface.  This version is not compatible with the 1.x versions.  Several minor updates were posted, but the API itself did not change, so no changes to the client or implementation were required.

Version 2.1

This is a minor update to the API, and it adds the kernel mode interface.  Some additions to the API have been made.  Details of the changes are below with their compatibility matrices.

2.0 Implementation
  • 2.0 Client: compatible
  • 2.0 Client (w/2.1 Headers): New function and structure definitions have no effect.
  • 2.1 Client: Client must deal with lack of bv_cache().
2.0 Implementation (w/2.1 Headers)
  • 2.0 Client: New function and structure definitions have no effect.
  • 2.0 Client (w/2.1 Headers): New function and structure definitions have no effect.
  • 2.1 Client: Client must deal with lack of bv_cache().
2.1 Implementation
  • 2.0 Client: New function and structure definitions have no effect.
  • 2.0 Client (w/2.1 Headers): New function and structure definitions have no effect.
  • 2.1 Client: compatible

2.0 Implementation
  • 2.0 Client: compatible
  • 2.0 Client (w/2.1 Headers): Client must clear bvbuffdesc using sizeof(bvbuffdesc).
  • 2.1 Client: Implementation must handle bvbuffdesc.structsize > sizeof(bvbuffdesc).
2.0 Implementation (w/2.1 Headers)
  • 2.0 Client: Implementation must handle bvbuffdesc.structsize < sizeof(bvbuffdesc).
  • 2.0 Client (w/2.1 Headers): Client must clear bvbuffdesc using sizeof(bvbuffdesc).
  • 2.1 Client: Client must deal with implementation that uses bvbuffdesc.virtaddr or returns error if bvbuffdesc.virtaddr is 0.
2.1 Implementation
  • 2.0 Client: Implementation must handle bvbuffdesc.structsize < sizeof(bvbuffdesc).
  • 2.0 Client (w/2.1 Headers): Client must clear bvbuffdesc using sizeof(bvbuffdesc).
  • 2.1 Client: compatible

Version 2.2

This is a minor update which includes the following:

Compatibility


BLTsville Neighborhoods

Implementations may be software (CPU) or 2-D hardware, and many may coexist.  Each implementation will have an individual entry point, so it can be directly addressed.  But there will also be a more general interface for each of these two types of implementations so that system integrators can choose the most appropriate implementation.  In other words, the system integrator will choose one software and one 2-D hardware implementation to be the "default" used when a client does not need to choose a specific implementation.

User Mode Interface

Clients use the standard names below to access the default implementations.  The client then imports the pointers to the functions.  (The specific name decoration and import method will be dictated by the host Operating System (O/S).)  Some examples:

Usually these entry points will be symbolic links (either explicit in systems like Linux which support them, or implicit using a thin wrapper) to the specific implementation.  This allows system integrators to connect the client with the most capable implementation available in the system.  For example, bltsville_hw2d might be a symbolic link to bltsville_gc2d.

In addition, there may be more implementations co-existing in a given system.  These will have additional unique names as determined by the vendors.  For example:

Initialization

In general, each O/S has the ability to manually load a library.  This in turn causes a function in the library to be called so the library can perform initialization.  Unfortunately, not all O/Ss allow this initialization function to return an error if the initialization fails.  Equally unfortunately, it may be necessary for the initialization to be performed in that function.  To accommodate this, BLTsville defers the specific initialization to the O/S environment.

Linux/Android

The client will call dlopen() to open the library.  It will then import the bv_*() functions, and call them as desired.  Initialization will occur in association with one or more of these activities.  If the initialization fails, the bv_*() functions will return the BVERR_RSRC error, indicating that a required resource was not obtained.
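
As a rough illustration of the import step on Linux, here is a minimal sketch.  The structure bv_entry_points, the function bv_load(), the int return type (a stand-in for enum bverror so the sketch compiles without bltsville.h), and the exported symbol names "bv_map", "bv_blt", "bv_unmap" and "bv_cache" are assumptions made for this example.  Real clients should include bltsville.h and follow the naming used by their platform.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Opaque stand-ins; real code gets these from bltsville.h. */
struct bvbuffdesc;
struct bvbltparams;
struct bvcopparams;

/* Hypothetical holder for the imported entry points. */
struct bv_entry_points {
    void *lib;
    int (*map)(struct bvbuffdesc *);     /* int stands in for enum bverror */
    int (*blt)(struct bvbltparams *);
    int (*unmap)(struct bvbuffdesc *);
    int (*cache)(struct bvcopparams *);  /* optional, may be NULL */
};

/* Returns 0 on success, -1 if the library or a required symbol is missing. */
static int bv_load(const char *libname, struct bv_entry_points *ep)
{
    ep->lib = dlopen(libname, RTLD_NOW | RTLD_LOCAL);
    if (!ep->lib) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* The *(void **) form is the conversion suggested by POSIX for dlsym(). */
    *(void **)(&ep->map)   = dlsym(ep->lib, "bv_map");
    *(void **)(&ep->blt)   = dlsym(ep->lib, "bv_blt");
    *(void **)(&ep->unmap) = dlsym(ep->lib, "bv_unmap");
    *(void **)(&ep->cache) = dlsym(ep->lib, "bv_cache");

    if (!ep->map || !ep->blt || !ep->unmap) {
        dlclose(ep->lib);
        ep->lib = NULL;
        return -1;
    }
    return 0;
}
```

A client might call bv_load() with a name such as "libbltsville_hw2d.so" (the file name is an assumption) and fall back to another implementation if it returns -1.  A successful load says nothing about initialization, so the first bv_*() call may still return BVERR_RSRC.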

Implementations Only

If the library has designated a function with the __attribute__ ((constructor)), that function will be called.  Linux implementations may use this function to perform initialization (including opening an interface to an associated kernel module).  However, since this function cannot return an error, and thus cannot fail, if the initialization fails, this must be recorded.  Then, when the client calls any of the bv_*() functions, these should immediately return the BVERR_RSRC error, indicating that the library was unable to initialize (obtain a necessary resource).

Linux implementations may also choose to initialize on the first call to a bv_*() function.  Failure is likewise indicated by returning the BVERR_RSRC error.

NOTE:  Be careful not to repeatedly attempt initialization when a failure is encountered.  Some initializations, and especially initialization failures, can take a long time.  This means clients trying to call bv_*() functions (presumably before falling back to alternatives) will be repeatedly penalized if the library can't initialize.  Instead, attempt initialization once, and from then on return BVERR_RSRC.
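
One way an implementation might honor this is sketched below.  All names with a toy_ or TOY_ prefix are hypothetical, the error values are stand-ins for the real codes in the BLTsville headers, and the sketch is not thread safe.  It only shows the "try once, remember the result" idea.

```c
/* Stand-ins for the real error codes defined in the BLTsville headers. */
#define TOY_BVERR_NONE 0
#define TOY_BVERR_RSRC (-1)

enum toy_init_state { TOY_INIT_PENDING, TOY_INIT_OK, TOY_INIT_FAILED };

static enum toy_init_state toy_state = TOY_INIT_PENDING;
static int toy_init_attempts;   /* counts how often the slow path really ran */

/* Hypothetical slow initialization.  It always fails here so the failure
 * path can be exercised. */
static int toy_open_hardware(void)
{
    toy_init_attempts++;
    return -1;
}

/* Attempts initialization at most once and remembers the outcome. */
static int toy_ensure_init(void)
{
    if (toy_state == TOY_INIT_PENDING)
        toy_state = (toy_open_hardware() == 0) ? TOY_INIT_OK : TOY_INIT_FAILED;
    return (toy_state == TOY_INIT_OK) ? TOY_BVERR_NONE : TOY_BVERR_RSRC;
}

/* Shape of an entry point: check initialization first, fail fast if it failed. */
static int toy_bv_blt(void)
{
    int err = toy_ensure_init();
    if (err != TOY_BVERR_NONE)
        return err;
    /* ... the actual BLT would be performed here ... */
    return TOY_BVERR_NONE;
}
```

Calling toy_bv_blt() repeatedly after a failed initialization returns the error immediately without running toy_open_hardware() again.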

Kernel Mode Interface

For most kernel space BLTsville clients, only a 2-D hardware implementation will be used.  However, both types of implementations are supported.  Clients use the standard names below to access the default implementations and obtain pointers to the functions.  (The specific method of obtaining the interface will be dictated by the host Operating System (O/S).)  Some examples:

These entry points may represent the implementations themselves, but more likely they will link the client to the implementations using more specific names.  For example, bv2d_entry() may link the client to gcbv_entry().

In addition, there may be more implementations co-existing in the kernel.  These will require additional unique names as determined by the vendors.  For example:


Things To Do In BLTsville

BLTsville's interface consists of three or four functions per implementation, which must be imported by the client at run time:

NOTE:  If the library failed to initialize, these functions will return BVERR_RSRC, indicating that a required resource was not obtained.

bv_map()

enum bverror bv_map(struct bvbuffdesc* buffdesc);

BLTsville does not allocate buffers.  Clients must describe a buffer to BLTsville using the bvbuffdesc structure so a given implementation can access the buffer.

bv_map() is used to provide the implementation an opportunity to associate hardware resources with the specified buffer.  Most hardware requires this type of mapping, and there is usually appreciable overhead associated with it.  By providing a separate call for this operation, BLTsville allows the client to move this overhead to the most appropriate time in its execution.

For a given buffer, the client can call the bv_map() function imported from each implementation to establish the mapping immediately.  But this is not required.

As a special bonus, BLTsville clients can call any implementation's bv_map().  This is sufficient to indicate that the client can be trusted to make the corresponding call to bv_unmap() upon destruction of the buffer.  Then when a client calls an implementation's bv_blt(), if the mapping needs to be done, it's done at that time.  But the mapping is maintained, so that the overhead is avoided on subsequent bv_blt() calls.  This lets implementations use lazy mapping only as necessary.  If an implementation is not called, the mapping is not done.

Normally, the lowest overhead bv_map() call will be in the CPU-based implementation.  So most clients will want to make a single, low overhead bv_map() call to the bltsville_cpu implementation to avoid the mapping/unmapping overhead on each bv_blt() call, while avoiding the mapping overhead when possible.

Calling bv_map() is actually optional prior to calling bv_blt().  However, if it is not called at least once for a given buffer, it must be assumed that bv_unmap() will not be called.  So the mapping must be done when bv_blt() is called, and unmapping done when it is complete.  This means the overhead will be incurred for every bv_blt() call which uses that buffer.

NOTE: Obviously any API cannot add capabilities beyond an implementation's capabilities.  So, for example, if an implementation requires memory to be allocated from a special pool of memory, that responsibility falls upon the client.  The bv_map() function for that implementation will need to check the characteristics of the memory and return an error if it does not meet the necessary criteria.

Function Sequences

To clarify, here are some function sequences and the operations associated with them:

Implementation  Function    Operation
A               bv_blt()    map A, BLT A, unmap A
A               bv_blt()    map A, BLT A, unmap A
B               bv_blt()    map B, BLT B, unmap B

Implementation  Function    Operation
A               bv_map()    map A
A               bv_blt()    BLT A
A               bv_blt()    BLT A
A               bv_unmap()  unmap A

Implementation  Function    Operation
A               bv_map()    map A
B               bv_map()    map B
A               bv_blt()    BLT A
B               bv_blt()    BLT B
A               bv_unmap()  unmap A
B               bv_unmap()  unmap B

Implementation  Function    Operation
A               bv_map()    map A
B               bv_blt()    map B, BLT B
B               bv_blt()    BLT B
A               bv_unmap()  unmap A, unmap B
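
The mapping rules can be mimicked with a small toy model.  This is only a sketch for reasoning about the overhead, not the real API: every toy_ name is hypothetical, implementations A and B are just indices 0 and 1, and the model assumes (following the bv_unmap() description) that a single unmap releases every mapping held for the buffer.  It reproduces the first, second, and fourth sequences.

```c
#include <stdio.h>
#include <string.h>

#define TOY_NUM_IMPLS 2   /* 0 stands for implementation A, 1 for B */

struct toy_buffer {
    int client_mapped;             /* client has called some bv_map() */
    int mapped[TOY_NUM_IMPLS];     /* which implementations hold a mapping */
};

static char toy_log[256];          /* records the operations performed */

static void toy_log_op(const char *what, int impl)
{
    char entry[16];
    snprintf(entry, sizeof(entry), "%s %c;", what, 'A' + impl);
    strncat(toy_log, entry, sizeof(toy_log) - strlen(toy_log) - 1);
}

static void toy_reset(struct toy_buffer *buf)
{
    memset(buf, 0, sizeof(*buf));
    toy_log[0] = '\0';
}

/* Explicit map: maps immediately and marks the client as trusted to unmap. */
static void toy_map(struct toy_buffer *buf, int impl)
{
    buf->client_mapped = 1;
    if (!buf->mapped[impl]) {
        buf->mapped[impl] = 1;
        toy_log_op("map", impl);
    }
}

/* BLT: maps lazily if needed, and keeps the mapping only if the client
 * has promised (via any earlier map call) to unmap later. */
static void toy_blt(struct toy_buffer *buf, int impl)
{
    int mapped_here = 0;
    if (!buf->mapped[impl]) {
        buf->mapped[impl] = 1;
        mapped_here = 1;
        toy_log_op("map", impl);
    }
    toy_log_op("BLT", impl);
    if (mapped_here && !buf->client_mapped) {
        buf->mapped[impl] = 0;
        toy_log_op("unmap", impl);
    }
}

/* Unmap: releases every mapping still held for the buffer. */
static void toy_unmap(struct toy_buffer *buf)
{
    for (int impl = 0; impl < TOY_NUM_IMPLS; impl++) {
        if (buf->mapped[impl]) {
            buf->mapped[impl] = 0;
            toy_log_op("unmap", impl);
        }
    }
    buf->client_mapped = 0;
}
```

Running the fourth sequence (toy_map on A, two toy_blt calls on B, then toy_unmap) leaves "map A;map B;BLT B;BLT B;unmap A;unmap B;" in toy_log, which shows B being mapped only once.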

NOTE:  Calling bv_map() and bv_unmap() with the same bvbuffdesc from different, unsynchronized threads, even (especially) from different implementations, will result in undefined behavior.  This is similar to calling malloc() and free() using the same buffer pointer in different, unsynchronized threads.  While this may work sometimes and for some implementations and combinations of implementations, BLTsville does not provide any synchronization mechanism to make this safe.  Clients must ensure that these calls are synchronized in cases where such behavior appears to be necessary.

bv_blt()

enum bverror bv_blt(struct bvbltparams* bltparams);

The main function of BLTsville is bv_blt().  A bvbltparams structure is passed into bv_blt() to trigger the desired 2-D operation.

bv_unmap()

enum bverror bv_unmap(struct bvbuffdesc* buffdesc);

bv_unmap() is used to free implementation resources associated with a buffer.  Normally, if bv_map() was called for a given buffer, bv_unmap() should be called as well.

For convenience, only one bv_unmap() needs to be called for each buffer, regardless of how many implementations were used, including multiple calls to bv_map().

Also for convenience, bv_unmap() may be called multiple times on the same buffer.  Note that only the first call will actually free (all) the associated resources.  See the Function Sequences under bv_map() for more details.

Implementations Only

Implementations must ensure that unmapping of buffers which are in use by asynchronous BLTs is appropriately delayed to avoid improper access.

bv_cache()

enum bverror bv_cache(struct bvcopparams *copparams);

bv_cache() provides manual CPU cache control to maintain cache coherence of surfaces between the CPU and other hardware.  The bvcopparams structure provides the information needed to properly manipulate the CPU cache.

This function is optional.  If this function fails to import, it means the implementation does not provide it, but bv_map(), bv_blt(), and bv_unmap() may still be used.

In general, this function will be provided with BLTsville implementations which utilize 2-D hardware, even though it manipulates the CPU cache.  This is because most systems require a kernel module to manipulate the cache, and this is not always practical to include with a user-mode CPU implementation.

BEWARE:  Manipulation of the CPU cache is tricky.  Moreover, different CPUs behave differently, so cache manipulation that works on one device may fail on another.  Also, mismanaged operation of the cache can have significant impact on overall system performance.  And incorrect manipulation of the cache can cause instability or crashes.  Please read and understand all of the discussions below before using this function.

  1. To avoid system instability, do not perform cache operations on buffers which would not be accessed by BLTsville.
  2. For maximum performance, combine adjacent rectangles into one bv_cache() call.  For example, when BLTing a line of characters, do not issue a bv_cache() call for each character.  Instead, make one call to bv_cache() which includes all the characters.
  3. When using a hardware BLTsville implementation to read data written into a cached surface by the CPU, use the BVCACHE_CPU_TO_DEVICE operation after the CPU has completed its operation and before the hardware BLTsville operation is initiated.
  4. When using a hardware BLTsville implementation to write data into a cached surface that will be read by the CPU, use the BVCACHE_CPU_FROM_DEVICE operation after the hardware BLTsville operation has completed (note this means after the callback if the BLT is asynchronous) and before the CPU accesses the surface.
  5. When using a hardware BLTsville implementation to write data into a cached surface that has been written by the CPU, use the BVCACHE_CPU_TO_DEVICE operation after the CPU has completed its operation and before the hardware BLTsville operation is initiated.

Example:  On one particular device, a surface was allocated using the standard user mode malloc().  An image was copied into a portion of this surface using a hardware implementation of BLTsville.  The result was then read by the CPU.

Logically, bv_cache() was used to perform a BVCACHE_CPU_FROM_DEVICE operation after the hardware-based BLTsville operation completed, but before the CPU read was performed.  However, corruption appeared both inside the image copied, as well as outside the image!

Both corruptions were caused by not realizing that there was a CPU operation (clear) performed on behalf of the malloc(), for which the proper cache manipulation was not performed.

The corruption outside the image was due to data in the cache being invalidated before it reached the memory.  As mentioned above, allocated buffers are normally cleared by the system.  In this case, since the buffer used for the surface was configured with a write-allocate cache, not all of the writes that cleared the buffer had reached memory when the BVCACHE_CPU_FROM_DEVICE operation was performed.  As a result, the uncommitted data in the cache was invalidated and lost, and the previous contents of the memory remained for the CPU to read.

The corruption inside the image was caused by data in the cache being committed to memory after the hardware BLT completed, but before the BVCACHE_CPU_FROM_DEVICE operation was executed.

Both corruptions were corrected by performing a BVCACHE_CPU_TO_DEVICE operation on the destination surface before performing the BLT (item 5 above), in addition to the BVCACHE_CPU_FROM_DEVICE operation performed after the BLT (item 4 above).
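
The rules in items 3 through 5 can be condensed into a small decision helper.  This is a sketch only: the toy_ names and the TOY_CACHE_* values are hypothetical stand-ins for the real BVCACHE_* operations, and a real client would still have to issue the corresponding bv_cache() calls on the correct surface and rectangle.

```c
/* Hypothetical stand-ins for the real BVCACHE_* operations. */
enum toy_cacheop {
    TOY_CACHE_NONE,
    TOY_CACHE_CPU_TO_DEVICE,
    TOY_CACHE_CPU_FROM_DEVICE
};

/* Cache operations to perform around one hardware BLT on a cached surface. */
struct toy_cache_plan {
    enum toy_cacheop before_blt;
    enum toy_cacheop after_blt;
};

/*
 * cpu_wrote_before: the CPU wrote the surface (including an implicit clear
 *                   done on behalf of the allocator) before the BLT.
 * hw_writes:        the hardware BLT writes the surface (it is a destination).
 * cpu_reads_after:  the CPU will read the surface after the BLT.
 */
static struct toy_cache_plan toy_plan_cache_ops(int cpu_wrote_before,
                                                int hw_writes,
                                                int cpu_reads_after)
{
    struct toy_cache_plan plan = { TOY_CACHE_NONE, TOY_CACHE_NONE };

    /* Items 3 and 5: push CPU writes out before the hardware touches memory. */
    if (cpu_wrote_before)
        plan.before_blt = TOY_CACHE_CPU_TO_DEVICE;

    /* Item 4: discard stale cache lines before the CPU reads the result. */
    if (hw_writes && cpu_reads_after)
        plan.after_blt = TOY_CACHE_CPU_FROM_DEVICE;

    return plan;
}
```

For the malloc() example above, all three inputs are true (the clear counts as a CPU write), so the plan calls for both operations.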



bvbltparams

bvbltparams is the central structure in BLTsville.  This structure holds the details of the BLT being requested by the client.

union bvop {
        unsigned short rop;
        enum bvblend blend;
        struct bvfilter *filter;
};

union bvinbuff {
        struct bvbuffdesc *desc;
        struct bvtileparams *tileparams;
};

struct bvbltparams {
        unsigned int structsize;

        char *errdesc;

        unsigned long implementation;
        unsigned long flags;
        union bvop op;

        void *colorkey;
        union bvalpha globalalpha;

        enum bvscalemode scalemode;
        enum bvdithermode dithermode;

        struct bvbuffdesc *dstdesc;
        struct bvsurfgeom *dstgeom;
        struct bvrect dstrect;

        union bvinbuff src1;
        struct bvsurfgeom *src1geom;
        struct bvrect src1rect;

        union bvinbuff src2;
        struct bvsurfgeom *src2geom;
        struct bvrect src2rect;

        union bvinbuff mask;
        struct bvsurfgeom *maskgeom;
        struct bvrect maskrect;

        struct bvrect cliprect;

        unsigned long batchflags;
        struct bvbatch *batch;

        void (*callbackfn)(struct bvcallbackerror *err,
                           unsigned long callbackdata);
        unsigned long callbackdata;

        struct bvrect src2auxdstrect;
        struct bvrect maskauxdstrect;
};

bvbltparams.structsize

unsigned int structsize; /* input */

This member is used to allow backwards and forwards compatibility between versions of BLTsville.  It should be set to the sizeof() of the structure by whichever side allocated the structure, the client or the implementation.

BLTsville is designed to be forwards and backwards compatible between client and library versions.  But this compatibility would be eliminated if clients chose to check for a specific version of the BLTsville implementations and fail if the specific version requested was not in place.  So, instead of exporting a version number, BLTsville structures use the structsize member to indicate the number of bytes in the structure.  This is used to communicate between the client and implementation which portions of the structure exist.  This effectively bypasses the concept of a version and focuses on the specifics of what changes need to be considered to maintain compatibility.

  1. When an old client calls into a new implementation, that implementation will realize if the client only provides a subset of an updated structure.  The implementation will handle this and utilize only that information which has been provided.  New features will be disabled, but functionality will be maintained.
  2. When a new client calls into an old implementation, that implementation will ignore the extra members of the structure and operate in ignorance of them.  If these members are necessary for some new functionality, this will be evident from other fields in the structure, so that the implementation can gracefully fail.

If structsize is set to a value that is too small for an implementation, it may return a BVERR_BLTPARAMS_VERS error.
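
A minimal sketch of how an implementation might use structsize to decide whether a member is present is shown below.  The structure toy_params, its member newfeature, and the macro TOY_HAS_MEMBER are hypothetical and exist only to show the technique.

```c
#include <stddef.h>

/* Hypothetical structure that has grown over time. */
struct toy_params {
    unsigned int structsize;    /* set by whoever allocated the structure */
    unsigned long flags;        /* present since the first version */
    unsigned long newfeature;   /* hypothetical member added in a later version */
};

/* True if the caller's structure is large enough to contain the member. */
#define TOY_HAS_MEMBER(p, member) \
    ((p)->structsize >= \
     offsetof(struct toy_params, member) + sizeof((p)->member))

/* Reads newfeature only when the caller actually provided it. */
static unsigned long toy_get_newfeature(const struct toy_params *p)
{
    if (TOY_HAS_MEMBER(p, newfeature))
        return p->newfeature;
    return 0;   /* old client: behave as if the feature were disabled */
}
```

An old client that was compiled before newfeature existed passes a smaller structsize, so the implementation never touches memory beyond what the client allocated.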

bvbltparams.errdesc

char* errdesc; /* output */

errdesc is optionally used by implementations to pass a 0-terminated string with additional debugging information back to clients.  errdesc is not localized or otherwise meant to provide information that is displayed to users.

bvbltparams.implementation

unsigned long implementation; /* input */

Multiple implementations of BLTsville can be combined under managers which can distribute the BLT requests to the implementations based on whatever criteria the manager chooses.  This might include availability of the operation, performance, loading, or power state.  In such a scenario, the client may need to override or augment the choice made by the manager.  This field allows that control.

Note that this feature is extremely complicated, and more detailed documentation needs to be created to allow creation of managers and smooth integration by a client.  There are serious issues that must be understood before any manager can be put into place, such as CPU cache coherence and multiple implementation operation interdependence.  For now, this field should be set to 0 by clients.

If the implementation cannot respond to the implementation flags set, it may return a BVERR_IMPLEMENTATION error.

bvbltparams.flags

unsigned long flags; /* input */

The flags member provides the baseline of information to bv_blt() about the type of BLT being requested.

To maintain compatibility, unused bits in the flags member should be set to 0.

If the flags set are not supported by the implementation, it may return BVERR_FLAGS, or a more specific error code.

bvbltparams.flags - BVFLAG_OP_*

The op field of the flags member specifies the type of BLT operation to perform.  Currently there are three types of BLT operations defined:

1. BVFLAG_ROP

This flag indicates the operation being performed is a raster operation, and the bvbltparams.op union is treated as rop.  Raster OPerations are binary operations performed on the bits of the inputs.  See bvbltparams.op.rop for details.

2. BVFLAG_BLEND

This flag indicates the operation being performed is a blend, and the bvbltparams.op union is treated as blend.  Blending involves mixing multiple layers of pixels using the specified equations.  Surrounding pixels are not involved in blend operations.  See bvbltparams.op.blend for details.

3. BVFLAG_FILTER

This flag indicates the operation being performed is a filter, and the bvbltparams.op union is treated as filter.  Filtering involves mixing multiple layers of pixels.  Surrounding pixels are involved in filter operations.  See bvbltparams.op.filter for details.

bvbltparams.flags - BVFLAG_KEY_SRC/DST

The BVFLAG_KEY_SRC and BVFLAG_KEY_DST flags enable source and destination color keying, respectively.  When either flag is set, the colorkey member of bvbltparams is used.

BVFLAG_KEY_SRC and BVFLAG_KEY_DST are mutually exclusive.

See bvbltparams.colorkey for details.

bvbltparams.flags - BVFLAG_CLIP

When BVFLAG_CLIP is set, the cliprect member of bvbltparams is used by the implementation as a limiting rectangle on data written to the destination.  See cliprect for details.

bvbltparams.flags - BVFLAG_SRCMASK

Normally, the mask is applied at the destination, after all scaling has been completed (including scaling the mask if necessary).  But some environments require that the mask be applied at the sources, before scaling occurs.  The BVFLAG_SRCMASK flag requests that the implementation use this method if supported.

bvbltparams.flags - BVFLAG_TILE_*

Normally, when a source's size does not match the destination, the source is scaled to fill the destination.  But when the corresponding BVFLAG_TILE_* flag is set, this behavior is modified.

First, the source's size specifies a tile (or pattern, or brush) to be used to fill the destination.  This tile is replicated instead of scaled.

The origin of the source's rectangle is used to locate the tile within a larger surface.

Second, a bvbuffdesc object is no longer supplied by the client in the bvbltparams structure.  In its place is a bvtileparams object.

Refer to the bvtileparams structure definition for details.

bvbltparams.flags - BVFLAG_HORZ/VERT_FLIP_*

These flags indicate that the corresponding image is flipped horizontally or vertically as it is used by the operation.

bvbltparams.flags - BVFLAG_SCALE/DITHER_RETURN

The scale and dither types can be specified with an implicit type.  The implementation will then convert that internally to an explicit scale or dither type.  These flags request that the implementation return the explicit type chosen to the client in the corresponding bvbltparams.scalemode and bvbltparams.dithermode members.

bvbltparams.flags - BVFLAG_ASYNC

This flag allows the client to inform the implementation that it can queue the requested BLT and return from bv_blt() before it has completed.  If this bit is not set, when the bv_blt() returns, the operation is complete.

Normally, a client will also utilize the bvbltparams.callbackfn and bvbltparams.callbackdata members to receive a notification when the BLT has completed.

NOTE:  Asynchronous BLTs are performed in the order in which they are submitted within an implementation.  This was done to provide a simple dependency mechanism.  However, synchronization between implementations must be handled by the client, using the callback mechanism.

NOTE:  Since asynchronous BLTs are performed in the order in which they are submitted, it follows that a synchronous BLT issued after a set of asynchronous BLTs may be used for synchronization as well.

NOTE:  Certain situations may require manual synchronization without an associated BLT.  Rather than introduce an additional BLTsville function call, this is handled via a NOP BLT.  To accomplish a NOP BLT, the client should issue a BLT using the bvbltparams.op.rop code of 0xAAAA (copy destination to destination), with the BVFLAG_ASYNC flag not set.  Alternatively, the NOP BLT may set the BVFLAG_ASYNC flag and provide a bvbltparams.callbackfn.  To facilitate implementations, a valid destination surface should be specified.

Implementations Only

In general, this BLTsville specification has avoided placing any requirement on implementations for specific operations.  However, in support of this special case, support for these NOP BLTs will need to be an implementation requirement.

bvbltparams.flags - BVFLAG_BATCH_BEGIN/CONTINUE/END

These flags are used to control batching of BLTs for two main reasons:

  1. To group small, similar BLTs to consolidate overhead.  For example, the BLTs associated with rendering each character in a word.
  2. To group related BLTs, which may allow an implementation to perform a more efficient, but equivalent set of operations.

See Batching for details.

bvbltparams.flags - BVFLAG_SRC2/MASK_AUXDSTRECT

These flags are used to indicate that the bvbltparams.src2auxdstrect and bvbltparams.maskauxdstrect are to be used.  See these entries below for details. These flags are likely to be ignored except for the special case explained below, so they should be used only when necessary.

bvbltparams.op.rop

unsigned short rop; /* input */

When BVFLAG_ROP is set in the bvbltparams.flags member, the bvbltparams.op union is treated as rop.  Raster OPerations are binary operations performed on the bits of the inputs:

BLTsville's rop element is used to specify a ROP4, but anything from ROP1 up to ROP4 can be defined using this member:

NOTE:  By far the most common ROP used will be 0xCCCC, which indicates a simple copy from source 1 to the destination.

The table below is the magic decoder ring:

Mask               1   1   1   1   1   1   1   1   0   0   0   0   0   0   0   0
Source 2           1   1   1   1   0   0   0   0   1   1   1   1   0   0   0   0
Source 1           1   1   0   0   1   1   0   0   1   1   0   0   1   1   0   0
Destination        1   0   1   0   1   0   1   0   1   0   1   0   1   0   1   0
Raster Operation  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0

For example, to specify an operation that uses the mask to choose between source 1 and destination (source 1 when mask is 1, destination when mask is 0), a client would calculate the bottom line by parsing each column:

When mask is 1 (the first eight columns), the rop matches the source 1 row.  When mask is 0 (the last eight columns), the rop matches the destination row.

Raster Operation   1   1   0   0   1   1   0   0   1   0   1   0   1   0   1   0

So the rop for this operation would be 0xCCAA.
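
The column-by-column parsing can also be done mechanically.  The sketch below builds a ROP4 code from a truth function by walking the sixteen columns of the decoder table.  The names toy_rop_fn, toy_make_rop4, and the example truth functions are hypothetical helpers, not part of the BLTsville headers.

```c
#include <stdint.h>

/* Returns the desired output bit (0 or 1) for one combination of input bits. */
typedef unsigned (*toy_rop_fn)(unsigned mask, unsigned src2,
                               unsigned src1, unsigned dst);

/* Builds the 16-bit ROP4 code: bit N of the result is the output for the
 * column labeled N in the table (mask = bit 3 of N, source 2 = bit 2,
 * source 1 = bit 1, destination = bit 0). */
static uint16_t toy_make_rop4(toy_rop_fn fn)
{
    uint16_t rop = 0;
    for (unsigned column = 0; column < 16; column++) {
        unsigned mask = (column >> 3) & 1u;
        unsigned src2 = (column >> 2) & 1u;
        unsigned src1 = (column >> 1) & 1u;
        unsigned dst  = column & 1u;
        if (fn(mask, src2, src1, dst) & 1u)
            rop |= (uint16_t)(1u << column);
    }
    return rop;
}

/* Source 1 where the mask is 1, destination where the mask is 0. */
static unsigned toy_mask_selects_src1(unsigned mask, unsigned src2,
                                      unsigned src1, unsigned dst)
{
    (void)src2;
    return mask ? src1 : dst;
}

/* Plain copy of source 1. */
static unsigned toy_copy_src1(unsigned mask, unsigned src2,
                              unsigned src1, unsigned dst)
{
    (void)mask; (void)src2; (void)dst;
    return src1;
}
```

Passing toy_mask_selects_src1 yields the code for the mask-select example, and toy_copy_src1 yields the common source 1 copy.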

Here is a list of some commonly used raster operations that have been given names:

ROP          Constant  Description
BLACKNESS    0x0000    Set all destination bits to black (0).  Dest = 0
NOTSRCERASE  0x1111    Dest = ~Src1 & ~Dest = ~(Src1 | Dest)
NOTSRCCOPY   0x3333    Dest = ~Src1
SRCERASE     0x4444    Dest = Src1 & ~Dest
DSTINVERT    0x5555    Invert (NOT) the destination bits.  Dest = ~Dest
PATINVERT    0x5A5A    XOR with Src2.  Dest = Src2 ^ Dest
SRCINVERT    0x6666    XOR with Src1.  Dest = Src1 ^ Dest
SRCAND       0x8888    Dest = Src1 & Dest
NOP          0xAAAA    Dest = Dest
MERGEPAINT   0xBBBB    Dest = ~Src1 | Dest
MERGECOPY    0xC0C0    Dest = Src1 & Src2
SRCCOPY      0xCCCC    Dest = Src1
SRCPAINT     0xEEEE    OR with Src1.  Dest = Src1 | Dest
PATCOPY      0xF0F0    Copy source 2 to destination.  Dest = Src2
PATPAINT     0xFBFB    Dest = ~Src1 | Src2 | Dest
WHITENESS    0xFFFF    Set all destination bits to white (1).  Dest = 1
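
To see what a ROP4 code does to actual data, the sketch below applies a code to 32-bit words one bit at a time, the way a software (CPU) implementation might do it in its simplest form.  toy_apply_rop4 is a hypothetical helper written for clarity, not speed.

```c
#include <stdint.h>

/* Applies a ROP4 code to every bit position of the inputs.  For each bit,
 * the four input bits select one of the 16 bits of the code
 * (mask = bit 3 of the index, source 2 = bit 2, source 1 = bit 1,
 * destination = bit 0). */
static uint32_t toy_apply_rop4(uint16_t rop, uint32_t mask, uint32_t src2,
                               uint32_t src1, uint32_t dst)
{
    uint32_t out = 0;
    for (int bit = 0; bit < 32; bit++) {
        unsigned index = (((mask >> bit) & 1u) << 3) |
                         (((src2 >> bit) & 1u) << 2) |
                         (((src1 >> bit) & 1u) << 1) |
                         ((dst >> bit) & 1u);
        out |= (uint32_t)((rop >> index) & 1u) << bit;
    }
    return out;
}
```

For example, applying 0x6666 (SRCINVERT) to a source 1 word and a destination word gives the XOR of the two, regardless of the source 2 and mask inputs.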

bvbltparams.op.blend

enum bvblend blend; /* input */

When BVFLAG_BLEND is set in the bvbltparams.flags member, the bvbltparams.op union is treated as a blend.

To specify the blend, the client fills in blend with one of the bvblend values.

bvblend is an enumeration assembled from sets of fields.  The values specified may be extended beyond those that are explicitly defined using the definitions in the bvblend.h header file.

The first 4 bits are the format.  Currently two format groups are defined, but others can be added.  The remainder of the bits are used as defined by the individual format:

1. BVBLENDDEF_FORMAT_CLASSIC

The BVBLENDDEF_FORMAT_CLASSIC is meant to handle the classic Porter-Duff equations. It can also handle the DirectFB blending.

BVBLENDDEF_FORMAT_CLASSIC is based on the following equations:

Cd = K1 * C1 + K2 * C2
Ad = K3 * A1 + K4 * A2

where:

Cd: destination color
C1: source 1 color
C2: source 2 color
Ad: destination alpha
A1: source 1 alpha
A2: source 2 alpha
K#: one of the constants defined using the bitfields below

The 28 bits for BVBLENDDEF_FORMAT_CLASSIC are divided into 5 sections.

The most significant 4 bits are modifiers, used to include additional alpha values from global or remote sources.

[27] The most significant bit indicates that a remote alpha is to be included in the blend. The format of this is defined by bvbltparams.maskgeom.format.

[26] The next bit is reserved.

[25:24] The next 2 bits are used to indicate that a global alpha is to be included, and what its format is:

00: no global included
01: global included; bvbltparams.globalalpha.size8 is used (0 -> 255)
10: this value is reserved
11: global included; bvbltparams.globalalpha.fp is used (0.0 -> 1.0)

The remaining bits are divided into 4 sections, one to define each of the constants:

[23:18] - K1
[17:12] - K2
[11:6] - K3
[5:0] - K4

The format is the same for all 4 constant fields:

[5:4] The first 2 bits of each field indicate how the other two 2-bit fields are interpreted:

00: only As: the other two fields contain only As; there should be only one valid A value between the two fields
01: minimum: the value of the constant is the minimum of the two fields
10: maximum: the value of the constant is the maximum of the two fields
11: only Cs: the other two fields contain only Cs; there should be only one valid C value between the two fields

[3:2] The middle 2 bits of each field contain the inverse field:

00: 1-C1 ("don't care" for "only As")
01: 1-A1 ("don't care" for "only Cs")
10: 1-C2 ("don't care" for "only As")
11: 1-A2 ("don't care" for "only Cs")

[1:0] The last 2 bits of each field contain the normal field:

00: C1 ("don't care" for "only As")
01: A1 ("don't care" for "only Cs")
10: C2 ("don't care" for "only As")
11: A2 ("don't care" for "only Cs")

EXCEPTIONS:

00 00 00 - The value 00 00 00, which normally would indicate "only As" with two "don't care" fields, is interpreted as a constant of 0.

11 11 11 - The value 11 11 11, which normally would indicate "only Cs" with two "don't care" fields, is interpreted as a constant of 1.

Constants

Combined, these fields define the constants from which the blend equations are built.  The full set of encodings is:

00 00 00 undefined -> zero
00 00 01 A1 (preferred)
00 00 10 undefined
00 00 11 A2 (preferred)
00 01 00 1-A1 (preferred)
00 01 01 undefined
00 01 10 1-A1 (use 00 01 00)
00 01 11 undefined
00 10 00 undefined
00 10 01 A1 (use 00 00 01)
00 10 10 undefined
00 10 11 A2 (use 00 00 11)
00 11 00 1-A2 (preferred)
00 11 01 undefined
00 11 10 1-A2 (use 00 11 00)
00 11 11 undefined
01 00 00 min(C1,1-C1)
01 00 01 min(A1,1-C1)
01 00 10 min(C2,1-C1)
01 00 11 min(A2,1-C1)
01 01 00 min(C1,1-A1)
01 01 01 min(A1,1-A1)
01 01 10 min(C2,1-A1)
01 01 11 min(A2,1-A1)
01 10 00 min(C1,1-C2)
01 10 01 min(A1,1-C2)
01 10 10 min(C2,1-C2)
01 10 11 min(A2,1-C2)
01 11 00 min(C1,1-A2)
01 11 01 min(A1,1-A2)
01 11 10 min(C2,1-A2)
01 11 11 min(A2,1-A2)
10 00 00 max(C1,1-C1)
10 00 01 max(A1,1-C1)
10 00 10 max(C2,1-C1)
10 00 11 max(A2,1-C1)
10 01 00 max(C1,1-A1)
10 01 01 max(A1,1-A1)
10 01 10 max(C2,1-A1)
10 01 11 max(A2,1-A1)
10 10 00 max(C1,1-C2)
10 10 01 max(A1,1-C2)
10 10 10 max(C2,1-C2)
10 10 11 max(A2,1-C2)
10 11 00 max(C1,1-A2)
10 11 01 max(A1,1-A2)
10 11 10 max(C2,1-A2)
10 11 11 max(A2,1-A2)
11 00 00 undefined
11 00 01 1-C1 (use 11 00 11)
11 00 10 undefined
11 00 11 1-C1 (preferred)
11 01 00 C1 (use 11 11 00)
11 01 01 undefined
11 01 10 C2 (use 11 11 10)
11 01 11 undefined
11 10 00 undefined
11 10 01 1-C2 (use 11 10 11)
11 10 10 undefined
11 10 11 1-C2 (preferred)
11 11 00 C1 (preferred)
11 11 01 undefined
11 11 10 C2 (preferred)
11 11 11 undefined -> one
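
The table above can be reproduced mechanically from the field rules.  The following is a minimal sketch, assuming normalized (0.0 to 1.0) inputs; the names blend_inputs, select_value() and eval_k() are hypothetical and do not appear in bvblend.h.  It returns -1 for the encodings the table lists as undefined.

```c
#include <assert.h>

/* Per-pixel inputs for one color channel, normalized to 0.0 .. 1.0. */
struct blend_inputs {
    float c1, a1, c2, a2;
};

/* Map a 2-bit selector (00, 01, 10, 11) to C1, A1, C2 or A2. */
static float select_value(unsigned sel, const struct blend_inputs *in)
{
    switch (sel & 3u) {
    case 0:  return in->c1;
    case 1:  return in->a1;
    case 2:  return in->c2;
    default: return in->a2;
    }
}

/*
 * Evaluate one 6-bit constant field: [5:4] mode, [3:2] inverse, [1:0] normal.
 * Returns -1.0f for encodings listed as undefined.
 */
static float eval_k(unsigned field, const struct blend_inputs *in)
{
    unsigned mode = (field >> 4) & 3u;
    unsigned inv_sel = (field >> 2) & 3u;
    unsigned norm_sel = field & 3u;
    /* "only As" accepts the odd selectors (A1, A2); "only Cs" the even ones. */
    unsigned wanted_lsb = (mode == 0u) ? 1u : 0u;
    float inv = 1.0f - select_value(inv_sel, in);
    float norm = select_value(norm_sel, in);
    int inv_valid = (inv_sel & 1u) == wanted_lsb;
    int norm_valid = (norm_sel & 1u) == wanted_lsb;

    assert(field < 64u);

    if (field == 0x00u)
        return 0.0f;                        /* exception: constant 0 */
    if (field == 0x3Fu)
        return 1.0f;                        /* exception: constant 1 */
    if (mode == 1u)
        return inv < norm ? inv : norm;     /* minimum of the two fields */
    if (mode == 2u)
        return inv > norm ? inv : norm;     /* maximum of the two fields */
    if (inv_valid == norm_valid)
        return -1.0f;                       /* none or both valid: undefined */
    return inv_valid ? inv : norm;
}
```

For example, the field 01 11 01 (0x1D) evaluates to min(A1, 1-A2), and 00 01 01 (0x05) is reported as undefined because both sub-fields hold a valid A value.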

DirectFB Example


Putting these together into the proper constants, the blending equations can be built for different APIs.  Here is how DirectFB would be mapped:

For DirectFB, the SetSrcBlendFunction() and SetDstBlendFunction() can specify 121 combinations of blends (11 x 11). It's impractical to specify these combinations individually. Instead, the settings indicated by each call should be bitwise OR'd to make the proper single value used in BLTsville.

  32-bit Binary Value
SetSrcBlendFunction() [VendorID]  [--K1--]  [--K2--]  [--K3--]  [--K4--]
DSBF_ZERO 0000 0000 00 00 00 xx xx xx 00 00 00 xx xx xx
DSBF_ONE 0000 0000 11 11 11 xx xx xx 11 11 11 xx xx xx
DSBF_SRCCOLOR 0000 0000 11 11 00 xx xx xx 00 00 01 xx xx xx
DSBF_INVSRCCOLOR 0000 0000 11 00 11 xx xx xx 00 01 00 xx xx xx
DSBF_SRCALPHA 0000 0000 00 00 01 xx xx xx 00 00 01 xx xx xx
DSBF_INVSRCALPHA 0000 0000 00 01 00 xx xx xx 00 01 00 xx xx xx
DSBF_DESTCOLOR 0000 0000 11 11 10 xx xx xx 00 00 11 xx xx xx
DSBF_INVDESTCOLOR 0000 0000 11 10 11 xx xx xx 00 11 00 xx xx xx
DSBF_DESTALPHA 0000 0000 00 00 11 xx xx xx 00 00 11 xx xx xx
DSBF_INVDESTALPHA 0000 0000 00 11 00 xx xx xx 00 11 00 xx xx xx
DSBF_SRCALPHASAT 0000 0000 01 11 01 xx xx xx 11 11 11 xx xx xx

  32-bit Binary Value
SetDstBlendFunction() [VendorID]  [--K1--]  [--K2--]  [--K3--]  [--K4--]
DSBF_ZERO 0000 0000 xx xx xx 00 00 00 xx xx xx 00 00 00
DSBF_ONE 0000 0000 xx xx xx 11 11 11 xx xx xx 11 11 11
DSBF_SRCCOLOR 0000 0000 xx xx xx 11 11 00 xx xx xx 00 00 01
etc.          
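
The bitwise OR described above can be sketched as follows.  The helper names and the K_* enumerators are hypothetical; the shifts and field encodings are taken from the bit layout and constants table in this section, and the destination encoding for DSBF_INVSRCALPHA is assumed to follow the same pattern as the source table, shifted into the K2 and K4 columns.

```c
#include <stdint.h>

/* Bit positions of the four constant fields within the blend value. */
enum { K1_SHIFT = 18, K2_SHIFT = 12, K3_SHIFT = 6, K4_SHIFT = 0 };

/* 6-bit field encodings copied from the constants table. */
enum {
    K_ZERO         = 0x00,  /* 00 00 00 */
    K_A1           = 0x01,  /* 00 00 01 */
    K_ONE_MINUS_A1 = 0x04,  /* 00 01 00 */
    K_ONE          = 0x3F   /* 11 11 11 */
};

/* Source function: the color factor goes to K1, the alpha factor to K3. */
static uint32_t src_blend_bits(unsigned k_color, unsigned k_alpha)
{
    return ((uint32_t)k_color << K1_SHIFT) | ((uint32_t)k_alpha << K3_SHIFT);
}

/* Destination function: the color factor goes to K2, the alpha factor to K4. */
static uint32_t dst_blend_bits(unsigned k_color, unsigned k_alpha)
{
    return ((uint32_t)k_color << K2_SHIFT) | ((uint32_t)k_alpha << K4_SHIFT);
}
```

Under these assumptions, DSBF_SRCALPHA for the source combined with DSBF_INVSRCALPHA for the destination is src_blend_bits(K_A1, K_A1) | dst_blend_bits(K_ONE_MINUS_A1, K_ONE_MINUS_A1).  The upper 8 bits stay 0, as in the tables above.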

Porter-Duff

For Porter-Duff blends, the equations can be more specifically defined. For convenience, these are enumerated in the bvblend.h header. These enumerations utilize only the local alpha in the equations as indicated. To use global or remote alpha, these enumerations need to be modified. For example, to include the global alpha in the Porter-Duff BVBLEND_SRC1OVER blend, the blend could be defined like this:

params.op.blend = BVBLEND_SRC1OVER +
BVBLENDDEF_GLOBAL_UCHAR;

To include the remote alpha, the blend could be defined like this:

params.op.blend = BVBLEND_SRC1OVER +
BVBLENDDEF_REMOTE;

And to include both:

params.op.blend = BVBLEND_SRC1OVER +
BVBLENDDEF_GLOBAL_UCHAR +
BVBLENDDEF_REMOTE;

Note that if the source color formats include local alphas, the local alphas, global alpha, and remote alpha will be used together.

Note also that the equations assume the surfaces are premultiplied. So if the surface formats indicate that they are not premultiplied, the alpha multiplication of each color is done prior to using the surface values in the equations.

For example, BVBLEND_SRC1OVER specifies the equations:
Cd = C1 + (1 - A1)C2
Ad = A1 + (1 - A1)A2

If the format of surface 1 is non-premultiplied, the equations are modified to include the multiplication explicitly:

Cd = A1C1 + (1 - A1)C2
Ad = A1 + (1 - A1)A2

Likewise, if the format of surface 2 is non-premultiplied, the equations are modified for this:

Cd = C1 + (1 - A1)A2C2
Ad = A1 + (1 - A1)A2

When including global or remote alphas, these values are used to modify the source 1 values before they are used in the blend equation (Ag is the global alpha and Ar is the remote alpha):

C1 = AgC1
A1 = AgA1

-or-

C1 = ArC1
A1 = ArA1

-or-

C1 = ArAgC1
A1 = ArAgA1

2. BVBLENDDEF_FORMAT_ESSENTIAL

The essential blending equations are based on the blending equations in common image manipulation programs.
BVBLEND_LIGHTEN      max(src1, src2)
BVBLEND_DARKEN       min(src1, src2)
BVBLEND_MULTIPLY     (src1 * src2) / 255
BVBLEND_AVERAGE      (src1 + src2) / 2
BVBLEND_ADD          src1 + src2 (saturated)
BVBLEND_SUBTRACT     src1 + src2 - 255 (saturated)
BVBLEND_DIFFERENCE   abs(src1 - src2)
BVBLEND_NEGATION     255 - abs(255 - src1 - src2)
BVBLEND_SCREEN       255 - (((255 - src1) * (255 - src2)) / 256)
BVBLEND_EXCLUSION    src1 + src2 - ((2 * src1 * src2) / 255)
BVBLEND_OVERLAY      (src2 < 128) ? (2 * src1 * src2 / 255) : (255 - 2 * (255 - src1) * (255 - src2) / 255)
BVBLEND_SOFT_LIGHT   (src2 < 128) ? (2 * ((src1 >> 1) + 64)) * ((float)src2 / 255) : (255 - (2 * (255 - ((src1 >> 1) + 64)) * (float)(255 - src2) / 255))
BVBLEND_HARD_LIGHT   (src1 < 128) ? (2 * src2 * src1 / 255) : (255 - 2 * (255 - src2) * (255 - src1) / 255)
BVBLEND_COLOR_DODGE  (src2 == 255) ? src2 : min(255, ((src1 << 8) / (255 - src2)))
BVBLEND_COLOR_BURN   (src2 == 0) ? src2 : max(0, (255 - ((255 - src1) << 8) / src2))
BVBLEND_LINEAR_DODGE same as BVBLEND_ADD
BVBLEND_LINEAR_BURN  same as BVBLEND_SUBTRACT
BVBLEND_LINEAR_LIGHT (src2 < 128) ? LINEAR_BURN(src1,(2 * src2)) : LINEAR_DODGE(src1,(2 * (src2 - 128)))
BVBLEND_VIVID_LIGHT  (src2 < 128) ? COLOR_BURN(src1,(2 * src2)) : COLOR_DODGE(src1,(2 * (src2 - 128)))
BVBLEND_PIN_LIGHT    (src2 < 128) ? DARKEN(src1,(2 * src2)) : LIGHTEN(src1,(2 * (src2 - 128)))
BVBLEND_HARD_MIX     (VIVID_LIGHT(src1, src2) < 128) ? 0 : 255
BVBLEND_REFLECT      (src2 == 255) ? src2 : min(255, (src1 * src1 / (255 - src2)))
BVBLEND_GLOW         (src1 == 255) ? src1 : min(255, (src2 * src2 / (255 - src1)))
BVBLEND_PHOENIX      min(src1, src2) - max(src1, src2) + 255
BVBLEND_ALPHA        alf * src1 + (1 - alf) * src2
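
A few of these per-component expressions written out as C, as a minimal sketch for 8-bit components.  The function names are hypothetical and this is not how any particular implementation computes them; the arithmetic simply follows the table entries, using int intermediates so the products do not overflow.

```c
#include <stdint.h>

/* Saturate an int to the 0 .. 255 range of an 8-bit component. */
static uint8_t clamp255(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

static uint8_t blend_multiply(uint8_t src1, uint8_t src2)
{
    return (uint8_t)((src1 * src2) / 255);
}

static uint8_t blend_add(uint8_t src1, uint8_t src2)
{
    return clamp255(src1 + src2);
}

static uint8_t blend_subtract(uint8_t src1, uint8_t src2)
{
    return clamp255(src1 + src2 - 255);
}

static uint8_t blend_screen(uint8_t src1, uint8_t src2)
{
    return (uint8_t)(255 - (((255 - src1) * (255 - src2)) / 256));
}

static uint8_t blend_overlay(uint8_t src1, uint8_t src2)
{
    return (src2 < 128)
        ? (uint8_t)(2 * src1 * src2 / 255)
        : (uint8_t)(255 - 2 * (255 - src1) * (255 - src2) / 255);
}

static uint8_t blend_color_dodge(uint8_t src1, uint8_t src2)
{
    int v;

    if (src2 == 255)
        return src2;
    v = (src1 << 8) / (255 - src2);
    return (uint8_t)(v > 255 ? 255 : v);
}
```

Note that the divide by 256 in BVBLEND_SCREEN means two black inputs produce 1 rather than 0 in this sketch.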

bvbltparams.op.filter

struct bvfilter *filter; /* input */

When BVFLAG_FILTER is set in the bvbltparams.flags member, the bvbltparams.op union is treated as a filter.

To specify the filter, the client fills in filter with one of the bvfilter values.

These values will be extended as general filter types are requested.

bvbltparams.colorkey

void *colorkey; /* input */

When either BVFLAG_KEY_SRC or BVFLAG_KEY_DST is set in the bvbltparams.flags member, colorkey points to a single pixel used as the color key.

The format of this pixel matches the surface being keyed:  src1geom.format is the format of the color key if BVFLAG_KEY_SRC is set, and dstgeom.format is the format of the color key if BVFLAG_KEY_DST is set.

Subsampled formats do not currently support color keying.

bvbltparams.globalalpha

union bvalpha globalalpha; /* input */

When BVFLAG_BLEND is set in the bvbltparams.flags, and when the blend chosen requires it, globalalpha is used to provide an alpha blending value for the entire operation.  The type is also dependent on the blend chosen.

For the BVBLENDDEF_FORMAT_CLASSIC blend types, if the BVBLENDDEF_GLOBAL_MASK field is not 0, this field is used.  Currently BVBLENDDEF_FORMAT_CLASSIC provides for an 8-bit (unsigned character / byte) format designated by BVBLENDDEF_GLOBAL_UCHAR as well as a 32-bit floating point format designated by BVBLENDDEF_GLOBAL_FLOAT.

bvbltparams.scalemode

enum bvscalemode scalemode; /* input/output */

This member allows the client to specify the type of scaling to be used.  The enumeration begins with 8 bits indicating the vendor.  The remaining bits are defined by the vendor.  BVSCALEDEF_VENDOR_ALL and BVSCALEDEF_VENDOR_GENERAL are shared by all implementations.

BVSCALEDEF_VENDOR_ALL can be used to specify an implicit scale type.  This type is converted to an explicit type by the implementation:

BVSCALE_FASTEST The fastest method of scaling available is used.  This may include nearest neighbor.  The value of this enumeration is purposely 0, and is the default scale type.  No implementation will return an error for this setting.
BVSCALE_FASTEST_NOT_NEAREST_NEIGHBOR The fastest method of scaling available that is not nearest neighbor is used.  This may include an alternative point sample technique.
BVSCALE_FASTEST_POINT_SAMPLE The fastest method of scaling using a point sample technique.
BVSCALE_FASTEST_INTERPOLATED The fastest method of scaling using an interpolation technique.
BVSCALE_FASTEST_PHOTO The fastest method of scaling appropriate for photographs is used.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_FASTEST_DRAWING The fastest method of scaling appropriate for drawings is used.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_GOOD A scaling technique is chosen that may be higher quality than the BVSCALE_FASTEST choice.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_GOOD_POINT_SAMPLE A point sample scaling technique is chosen that may be higher quality than the BVSCALE_FASTEST_POINT_SAMPLE choice.  This may include nearest neighbor.
BVSCALE_GOOD_INTERPOLATED An interpolated scaling technique is chosen that may be higher quality than the BVSCALE_FASTEST_INTERPOLATED choice.
BVSCALE_GOOD_PHOTO A scaling technique appropriate for photographs is chosen that may be higher quality than the BVSCALE_FASTEST_PHOTO choice.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_GOOD_DRAWING A scaling technique appropriate for drawings is chosen that may be higher quality than the BVSCALE_FASTEST_DRAWING choice.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_BETTER A scaling technique is chosen that may be higher quality than the BVSCALE_GOOD choice.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_BETTER_POINT_SAMPLE A point sample scaling technique is chosen that may be higher quality than the BVSCALE_GOOD_POINT_SAMPLE choice.  This may include nearest neighbor.
BVSCALE_BETTER_INTERPOLATED An interpolated scaling technique is chosen that may be higher quality than the BVSCALE_GOOD_INTERPOLATED choice.
BVSCALE_BETTER_PHOTO A scaling technique appropriate for photographs is chosen that may be higher quality than the BVSCALE_GOOD_PHOTO choice.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_BETTER_DRAWING A scaling technique appropriate for drawings is chosen that may be higher quality than the BVSCALE_GOOD_DRAWING choice.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_BEST The highest quality scaling technique is chosen.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_BEST_POINT_SAMPLE The highest quality point sample technique is chosen.
BVSCALE_BEST_INTERPOLATED The highest quality interpolated scaling technique is chosen.
BVSCALE_BEST_PHOTO The highest quality scaling technique appropriate for photographs is chosen.  This may include nearest neighbor.  No implementation will return an error for this setting.
BVSCALE_BEST_DRAWING The highest quality scaling technique appropriate for drawings is chosen.  This may include nearest neighbor.  No implementation will return an error for this setting.

BVSCALEDEF_VENDOR_GENERAL can be used to specify one of the shared explicit scale types.  At this point, only a limited number of explicit scale types are defined:

BVSCALE_NEAREST_NEIGHBOR This is a point sample scaling technique where the resampled destination pixel is set to the value of the closest source pixel.
BVSCALE_BILINEAR This is an interpolated scaling technique where the resampled destination pixel is set to a value linearly interpolated in two dimensions from the four closest source pixels.
BVSCALE_BICUBIC This is an interpolated scaling technique where the resampled destination pixel is set to a value calculated using cubic interpolation in two dimensions.
BVSCALE_3x3_TAP  
BVSCALE_5x5_TAP  
BVSCALE_7x7_TAP  
BVSCALE_9x9_TAP  
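
The difference between BVSCALE_NEAREST_NEIGHBOR and BVSCALE_BILINEAR can be illustrated with a minimal sketch on an 8-bit single-component image.  The function names are hypothetical, coordinates are assumed to be non-negative pixel positions, and clamping at the image edge is just one possible boundary technique; implementations are free to resample differently.

```c
#include <assert.h>
#include <stdint.h>

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Fetch a pixel, clamping coordinates to the image (edge "clamp" technique). */
static uint8_t texel(const uint8_t *img, int w, int h, int x, int y)
{
    return img[clampi(y, 0, h - 1) * w + clampi(x, 0, w - 1)];
}

/* Point sample: take the value of the closest source pixel. */
static uint8_t sample_nearest(const uint8_t *img, int w, int h,
                              float x, float y)
{
    assert(x >= 0.0f && y >= 0.0f);
    return texel(img, w, h, (int)(x + 0.5f), (int)(y + 0.5f));
}

/* Interpolate linearly in two dimensions from the four closest pixels. */
static uint8_t sample_bilinear(const uint8_t *img, int w, int h,
                               float x, float y)
{
    int x0, y0;
    float fx, fy, top, bottom;

    assert(x >= 0.0f && y >= 0.0f);
    x0 = (int)x;                /* truncation equals floor for x >= 0 */
    y0 = (int)y;
    fx = x - (float)x0;
    fy = y - (float)y0;

    top = (1.0f - fx) * (float)texel(img, w, h, x0, y0) +
          fx * (float)texel(img, w, h, x0 + 1, y0);
    bottom = (1.0f - fx) * (float)texel(img, w, h, x0, y0 + 1) +
             fx * (float)texel(img, w, h, x0 + 1, y0 + 1);

    return (uint8_t)((1.0f - fy) * top + fy * bottom + 0.5f);
}
```

A scaler would call one of these for each destination pixel, after mapping the destination coordinate back to a fractional source coordinate.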

If the client wants to know the explicit type chosen by a given implementation, it can set BVFLAG_SCALE_RETURN in the bvbltparams.flags member, and the explicit scale type is returned in the scalemode member.

NOTE:  Extending the BVSCALEDEF_VENDOR_GENERAL scale types or obtaining a vendor ID can be accomplished by submitting a patch.

bvbltparams.dithermode

enum bvdithermode dithermode; /* input/output */

This member allows the client to specify the type of dithering to be used, when the output format has fewer bits of depth than the internal calculation.  The enumeration begins with 8 bits indicating the vendor.  The remaining bits are defined by the vendor.  BVDITHERDEF_VENDOR_ALL and BVDITHERDEF_VENDOR_GENERAL are shared by all implementations.

BVDITHERDEF_VENDOR_ALL can be used to specify an implicit dither type.  This type is converted to an explicit type by the implementation:

BVDITHER_FASTEST The fastest method of dithering available is used.  This may include no dithering (truncation).  The value of this enumeration is purposely 0, and is the default dither type.  No implementation will return an error for this setting.
BVDITHER_FASTEST_ON The fastest method of dithering available is used.  This will not include no dithering.
BVDITHER_FASTEST_RANDOM The fastest method of dithering using a random technique.
BVDITHER_FASTEST_ORDERED The fastest method of dithering using an ordered technique.
BVDITHER_FASTEST_DIFFUSED The fastest method of dithering using an error diffusion technique.
BVDITHER_FASTEST_PHOTO The fastest method of dithering appropriate for photographs is used.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_FASTEST_DRAWING The fastest method of dithering appropriate for drawings is used.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_GOOD A dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST choice.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_GOOD_ON Any dithering technique available is used.  This will not include no dithering.  This may be higher quality than BVDITHER_FASTEST_ON.
BVDITHER_GOOD_RANDOM A random dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST_RANDOM choice.
BVDITHER_GOOD_ORDERED An ordered dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST_ORDERED choice.
BVDITHER_GOOD_DIFFUSED A diffused dithering technique is chosen that may be higher quality than the BVDITHER_FASTEST_DIFFUSED choice.
BVDITHER_GOOD_PHOTO A dithering technique appropriate for photographs is chosen that may be higher quality than the BVDITHER_FASTEST_PHOTO choice.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_GOOD_DRAWING A dithering technique appropriate for drawings is chosen that may be higher quality than the BVDITHER_FASTEST_DRAWING choice.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_BETTER A dithering technique is chosen that may be higher quality than the BVDITHER_GOOD choice.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_BETTER_ON Any dithering technique available is used.  This will not include no dithering.  This may be higher quality than BVDITHER_GOOD_ON.
BVDITHER_BETTER_RANDOM A random dithering technique is chosen that may be higher quality than the BVDITHER_GOOD_RANDOM choice.
BVDITHER_BETTER_ORDERED An ordered dithering technique is chosen that may be higher quality than the BVDITHER_GOOD_ORDERED choice.
BVDITHER_BETTER_DIFFUSED A diffused dithering technique is chosen that may be higher quality than the BVDITHER_GOOD_DIFFUSED choice.
BVDITHER_BETTER_PHOTO A dithering technique appropriate for photographs is chosen that may be higher quality than the BVDITHER_GOOD_PHOTO choice.  No implementation will return an error for this setting.
BVDITHER_BETTER_DRAWING A dithering technique appropriate for drawings is chosen that may be higher quality than the BVDITHER_GOOD_DRAWING choice.  No implementation will return an error for this setting.
BVDITHER_BEST The highest quality dithering technique is chosen.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_BEST_ON Any dithering technique available is used.  This will not include no dithering.  This may be higher quality than BVDITHER_BETTER_ON.
BVDITHER_BEST_RANDOM The highest quality random dithering technique is chosen.
BVDITHER_BEST_ORDERED The highest quality ordered dithering technique is chosen.
BVDITHER_BEST_DIFFUSED The highest quality diffused dithering technique is chosen.
BVDITHER_BEST_PHOTO The highest quality dithering technique appropriate for photographs is chosen.  This may include no dithering.  No implementation will return an error for this setting.
BVDITHER_BEST_DRAWING The highest quality dithering technique appropriate for drawings is chosen.  This may include no dithering.  No implementation will return an error for this setting.

BVDITHERDEF_VENDOR_GENERAL can be used to specify one of the shared explicit dithering types.  At this point, only a limited number of explicit dither types are defined:

BVDITHER_NONE No dithering is performed.  Internal pixel component values are truncated to the destination component bit depth.
BVDITHER_ORDERED_2x2  
BVDITHER_ORDERED_4x4  
BVDITHER_ORDERED_2x2_4x4 2x2 ordered dither is used for components with the lowest bit reduction.  4x4 ordered dither is used for the components with the highest bit reduction.  (E.g. RGB24 to RGB565 will use 2x2 ordered dither for the green component and 4x4 ordered dither for the red and blue components.)
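
As an illustration of the difference between BVDITHER_NONE and an ordered dither, here is a minimal sketch that reduces an 8-bit component to fewer bits.  The function names and the particular 2x2 threshold matrix are assumptions; BLTsville does not specify the matrix an implementation uses.

```c
#include <assert.h>
#include <stdint.h>

/* 2x2 threshold matrix (an assumption); entries are quarters of one step. */
static const uint8_t bayer2x2[2][2] = { { 0, 2 }, { 3, 1 } };

/* No dithering: truncate to the destination component bit depth. */
static uint8_t truncate_component(uint8_t value, unsigned out_bits)
{
    assert(out_bits >= 1u && out_bits <= 8u);
    return (uint8_t)(value >> (8u - out_bits));
}

/*
 * 2x2 ordered dither: add a position-dependent threshold before truncating,
 * so that the average over a 2x2 block approximates the original value.
 */
static uint8_t dither_ordered_2x2(uint8_t value, unsigned out_bits,
                                  unsigned x, unsigned y)
{
    unsigned drop, step, threshold, biased;

    assert(out_bits >= 1u && out_bits <= 8u);
    drop = 8u - out_bits;               /* number of bits discarded */
    step = 1u << drop;                  /* input range of one output level */
    threshold = (bayer2x2[y & 1u][x & 1u] * step) / 4u;
    biased = value + threshold;
    if (biased > 255u)
        biased = 255u;                  /* saturate before truncating */
    return (uint8_t)(biased >> drop);
}
```

Reducing the value 4 to 5 bits by truncation always yields 0, while the ordered dither yields 1 in two of every four pixel positions, preserving the average of half a step.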

If the client wants to know the explicit type chosen by a given implementation, it can set BVFLAG_DITHER_RETURN in the bvbltparams.flags member, and the explicit dither type is returned in the dithermode member.

NOTE:  Extending the BVDITHERDEF_VENDOR_GENERAL dither types or obtaining a vendor ID can be accomplished by submitting a patch.

bvbltparams.dstdesc

struct bvbuffdesc *dstdesc;

dstdesc is used to specify the destination buffer.  If the buffer has not been mapped with a call to bv_map(), bv_blt() will map the buffer as necessary to perform the BLT and then unmap afterwards.  See bvbuffdesc for details.

bvbltparams.dstgeom

struct bvsurfgeom *dstgeom;

dstgeom is used to specify the geometry of the surface contained in the destination buffer.  See bvsurfgeom for details.

bvbltparams.dstrect

struct bvrect dstrect;

dstrect is used to specify the destination rectangle to receive the BLT.  This rectangle is clipped by bvbltparams.cliprect when BVFLAG_CLIP is set in the bvbltparams.flags member.

bvbltparams.src1/src2/mask.desc

struct bvbuffdesc *src1.desc;
struct bvbuffdesc *src2.desc;
struct bvbuffdesc *mask.desc;

These members are used to identify the buffer for the source1, source2, and mask surfaces when the associated BVFLAG_TILE_* flag is not set.  The buffer is the memory in which the surface lies.  See the bvbltparams.src1/src2/maskgeom for the format and layout/geometry of the surface.

NOTE WELL:  Clients should never change the value of a bvbuffdesc structure while a buffer is mapped.

bvbltparams.src1/src2/mask.tileparams

struct bvtileparams *src1.tileparams;
struct bvtileparams *src2.tileparams;
struct bvtileparams *mask.tileparams;

These members are used to identify the buffer for the source1, source2, and mask surfaces when the associated BVFLAG_TILE_* flag is set.  The buffer is the memory in which the surface lies.  This differs from the src1/src2/mask.desc identity by providing more information needed for tiling and by not requiring mapping (for hardware implementations that support tiling, the tile data is usually moved into an on-chip cache).

bvbltparams.src1/src2/maskgeom

struct bvsurfgeom src1geom;
struct bvsurfgeom src2geom;
struct bvsurfgeom maskgeom;

These members describe the format and layout/geometry of their respective surfaces.  Separating bvsurfgeom from the bvbuffdesc allows easy use of buffers for multiple geometries without remapping.  See bvsurfgeom and bvbuffdesc for details.

bvbltparams.src1/src2/maskrect

struct bvrect src1rect;
struct bvrect src2rect;
struct bvrect maskrect;

These members specify the rectangle from which data is read for the BLT.  When BVFLAG_CLIP is set in the bvbltparams.flags member, these rectangles are clipped by a scaled version of the bvbltparams.cliprect (the scaling is based on the relationship between each source rectangle and the bvbltparams.dstrect).

Example:

src1rect = (0, 0) - (400 x 200)
dstrect = (0, 0) - (800 x 600)
cliprect = (10, 30) - (300 x 300)

The scaling ratio of the dstrect to the src1rect is (800/400, 600/200) or (2, 3).  Using this, the effective source 1 clipping rectangle becomes (10/2, 30/3) - (300/2 x 300/3) or (5, 10) - (150 x 100).

This approach allows fractional clipping at the source using a method which is simpler to implement than fractional coordinates.
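
The calculation in the example can be sketched as a small helper.  The struct rectf type and the source_clip() function are hypothetical, and the offset relative to the destination origin is a generalization of the example, where both origins are (0, 0).

```c
#include <assert.h>

/* Rectangle with fractional coordinates: origin plus size. */
struct rectf {
    double left, top, width, height;
};

/*
 * Translate a destination clipping rectangle back to source coordinates,
 * using the scale factors implied by the source and destination rectangles.
 */
static struct rectf source_clip(struct rectf src, struct rectf dst,
                                struct rectf clip)
{
    struct rectf r;
    double sx, sy;

    assert(src.width > 0.0 && src.height > 0.0);
    assert(dst.width > 0.0 && dst.height > 0.0);

    sx = dst.width / src.width;     /* horizontal scaling ratio */
    sy = dst.height / src.height;   /* vertical scaling ratio */

    r.left = src.left + (clip.left - dst.left) / sx;
    r.top = src.top + (clip.top - dst.top) / sy;
    r.width = clip.width / sx;
    r.height = clip.height / sy;
    return r;
}
```

With the rectangles from the example, this returns (5, 10) - (150 x 100).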

NOTE:  In BLTsville, reading outside the source rectangle is forbidden.  So scaling algorithms which require pixels around a particular source pixel must utilize boundary techniques (mirror, repeat, clamp, etc.) at the edges of the source rectangle.  However, if the clipping rectangle, when translated back to the source rectangle, leaves space between it and the source rectangle, pixels outside the clipped region may be accessed by the implementation.

bvbltparams.cliprect

struct bvrect cliprect;

cliprect is used to specify a rectangle that limits what region of the destination is written.  This is most useful for scaling operations, where the necessary scaling factor will not allow translation of the destination rectangle back to the source on an integer pixel boundary.

NOTE:  If cliprect exceeds the destination surface, the behavior is undefined.

For example, if the goal is to show a 640 x 480 video on a 1920 x 1080 screen, the video would be stretched to 1440 x 1080 to maintain the proper aspect ratio.  So the relevant rectangles would be:

src1rect              dstrect
(0, 0) - 640 x 480    (240, 0) - 1440 x 1080

However, to handle a 640 x 480 pop-up window that appears centered on the screen, in front of the video, the single BLT may be broken into four smaller BLTs pieced around the popup.  These rectangles would need to be:

src1rect                                              dstrect
(0, 0) - 640 x 133.333...                             (240, 0) - 1440 x 300
(0, 133.333...) - 177.777... x 213.333...             (240, 300) - 400 x 480
(462.222..., 133.333...) - 177.777... x 213.333...    (1280, 300) - 400 x 480
(0, 346.666...) - 640 x 133.333...                    (240, 780) - 1440 x 300

Since this is a scaling factor of 2.25x, translating the required destination rectangles back to the source results in non-integer coordinates and dimensions, as illustrated above.  And adjusting the source rectangles to the nearest integer values will result in visible discontinuities at the boundaries between the rectangles.

Instead, using the cliprect, this situation can be handled more easily:

src1rect              dstrect                   cliprect
(0, 0) - 640 x 480    (240, 0) - 1440 x 1080    (240, 0) - 1440 x 300
(0, 0) - 640 x 480    (240, 0) - 1440 x 1080    (240, 300) - 400 x 480
(0, 0) - 640 x 480    (240, 0) - 1440 x 1080    (1280, 300) - 400 x 480
(0, 0) - 640 x 480    (240, 0) - 1440 x 1080    (240, 780) - 1440 x 300
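
The four clipping rectangles in the table can be derived from the destination rectangle and the pop-up rectangle.  The following is a minimal sketch; struct rect, make_rect() and clip_around() are hypothetical names, and the pop-up is assumed to lie entirely inside the destination rectangle.

```c
#include <assert.h>

struct rect {
    int left, top, width, height;
};

static struct rect make_rect(int left, int top, int width, int height)
{
    struct rect r;

    r.left = left;
    r.top = top;
    r.width = width;
    r.height = height;
    return r;
}

/*
 * Split `outer` into the strips surrounding `hole`, in the order
 * top, left, right, bottom.  Empty strips are skipped.
 * Returns the number of rectangles written to `out`.
 */
static int clip_around(struct rect outer, struct rect hole, struct rect out[4])
{
    int outer_right = outer.left + outer.width;
    int outer_bottom = outer.top + outer.height;
    int hole_right = hole.left + hole.width;
    int hole_bottom = hole.top + hole.height;
    int n = 0;

    assert(hole.left >= outer.left && hole_right <= outer_right);
    assert(hole.top >= outer.top && hole_bottom <= outer_bottom);

    if (hole.top > outer.top)
        out[n++] = make_rect(outer.left, outer.top,
                             outer.width, hole.top - outer.top);
    if (hole.left > outer.left)
        out[n++] = make_rect(outer.left, hole.top,
                             hole.left - outer.left, hole.height);
    if (hole_right < outer_right)
        out[n++] = make_rect(hole_right, hole.top,
                             outer_right - hole_right, hole.height);
    if (hole_bottom < outer_bottom)
        out[n++] = make_rect(outer.left, hole_bottom,
                             outer.width, outer_bottom - hole_bottom);
    return n;
}
```

A client would then issue one BLT per returned rectangle, keeping src1rect and dstrect unchanged and setting only cliprect.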

bvbltparams.batchflags

unsigned long batchflags;

batchflags are used by the client as a hint to indicate to the implementation which parameters are changing between successive BLTs of a batch.  The flags may be used when the bvbltparams.flags has BVFLAG_BATCH_CONTINUE or BVFLAG_BATCH_END set.

BVBATCH_OP indicates that the operation type (BVFLAG_ROP, BVFLAG_BLEND, BVFLAG_FILTER, etc.) has changed.
BVBATCH_KEY indicates that the bvbltparams.colorkey or the color key mode (BVFLAG_KEY_SRC/BVFLAG_KEY_DST) has changed.
BVBATCH_MISCFLAGS indicates that bvbltparams.flags other than the operation, color key, or clip flag have changed.
BVBATCH_ALPHA indicates that bvbltparams.globalalpha or global alpha type has changed.
BVBATCH_DITHER indicates that bvbltparams.dithermode has changed.
BVBATCH_SCALE indicates that bvbltparams.scalemode has changed.
BVBATCH_DST indicates that the destination surface (bvbltparams.dstdesc, bvbltparams.dstgeom, or bvbltparams.dstrect) has changed.
BVBATCH_SRC1 indicates that the source 1 surface (bvbltparams.src1.desc or bvbltparams.src1.tileparams, or bvbltparams.src1geom) has changed.
BVBATCH_SRC2 indicates that the source 2 surface (bvbltparams.src2.desc or bvbltparams.src2.tileparams, or bvbltparams.src2geom) has changed.
BVBATCH_MASK indicates that the mask surface (bvbltparams.mask.desc or bvbltparams.mask.tileparams, or bvbltparams.maskgeom) has changed.
BVBATCH_DSTRECT_ORIGIN indicates that bvbltparams.dstrect.left or top has changed.
BVBATCH_DSTRECT_SIZE indicates that the bvbltparams.dstrect.width or height has changed.
BVBATCH_SRC1RECT_ORIGIN indicates that bvbltparams.src1rect.left or top has changed.
BVBATCH_SRC1RECT_SIZE indicates that the bvbltparams.src1rect.width or height has changed.
BVBATCH_SRC2RECT_ORIGIN indicates that bvbltparams.src2rect.left or top has changed.
BVBATCH_SRC2RECT_SIZE indicates that the bvbltparams.src2rect.width or height has changed.
BVBATCH_MASKRECT_ORIGIN indicates that bvbltparams.maskrect.left or top has changed.
BVBATCH_MASKRECT_SIZE indicates that the bvbltparams.maskrect.width or height has changed.
BVBATCH_CLIPRECT_ORIGIN indicates that bvbltparams.cliprect.left or top has changed.
BVBATCH_CLIPRECT_SIZE indicates that the bvbltparams.cliprect.width or height has changed.
BVBATCH_TILE_SRC1 indicates that the bvbltparams.src1.tileparams has changed.
BVBATCH_TILE_SRC2 indicates that the bvbltparams.src2.tileparams has changed.
BVBATCH_TILE_MASK indicates that the bvbltparams.mask.tileparams has changed.
BVBATCH_ENDNOP is a special flag used with BVFLAG_BATCH_END, for clients that do not have information that a batch is ending until after the last BLT has been issued.  When this flag is set, no BLT is done, but the batch is ended.

NOTE:  These flags are hints, and may be used or not by a BLTsville implementation.  So if bvbltparams members are changed between BLTs in a batch, but the bvbltparams.batchflags member is not correctly updated, the resulting behavior on different implementations will not be consistent.

bvbltparams.batch

struct bvbatch *batch;

This member is used as a batch handle, so that multiple batches can be under construction at the same time.

bvbltparams.callbackfn

void (*callbackfn)(struct bvcallbackerror *err, unsigned long callbackdata);

This member is a pointer to a client-supplied function which is called by the implementation when BVFLAG_ASYNC is set and the BLT is complete.  If this member is NULL, no callback is performed.  When there is no error, the err parameter will be set to 0.

NOTE:  This function can be called before the bv_blt() call has returned.

bvbltparams.callbackdata

unsigned long callbackdata;

This member is used as the parameter passed back by the bvbltparams.callbackfn.  This can be anything from an identifying index to a pointer used by the client.

bvbltparams.src2/maskauxdstrect

struct bvrect src2auxdstrect;
struct bvrect maskauxdstrect;

These two members are used only when the associated BVFLAG_SRC2/MASK_AUXDSTRECT flags are set.  They are only necessary (and should only be used) in the case where scaling of the inputs differs and the entire source images are not being used.  bvbltparams.dstrect is always used to specify the destination of source 1 image.  When the associated flags are set, these two members are used to specify the destination of the source 2 and mask images, instead of bvbltparams.dstrect.

These flags must be used with the BVFLAG_CLIP flag.  And if the resulting clipped destination does not include all enabled destination rectangles, the results are undefined.

Example:  We have two images that we want to merge and view on an 854x480 LCD panel.  One image is a small background image with 16:9 (64x36) aspect ratio that we want to stretch to fill the screen.  The other is a standard definition 720x480 (4:3 aspect ratio) image with transparency we want to blend on top of our background.

[Image: background image, 64x36 (shown actual size)]
[Image: foreground image, 720x480 (shown 1/2x; not adjusted for aspect ratio)]

We want to blend the second image onto the center of the first, scaling both, so that it looks like this:

[Image: desired result (shown 1/2x)]

The screen is effectively a 16:9 aspect ratio (we can ignore the fraction of a pixel here), which matches our background image.  So the background image just needs to be scaled from 64x36 to 854x480.

However, since the second image has a 4:3 aspect ratio, it will not cover the entire background image if we want to maintain its aspect ratio.  Our second image is not as wide as our 16:9 image, which means its height will match the screen height, but the width will be smaller.  Since the screen is 480 lines (pixels) high, to maintain our 4:3 aspect ratio, our second image will need to be 640 pixels wide (4 * 480 / 3).  So it will need to be scaled from 720x480 to 640x480.

As we mentioned, we would like to center the 640 pixel image on our 854 pixel wide screen.  That means the left edge of the image will be at pixel 107 ( (854 - 640) / 2 ).  So the leftmost 107 columns of pixels will just be a copy of the left portion of the background image.  Likewise, the rightmost 107 columns will be a copy of the right portion of the background image.  Only the middle section should be blended.
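
The arithmetic in the last two paragraphs can be sketched as a small helper.  The struct fit type and the fit_centered() function are hypothetical and use integer math only.

```c
#include <assert.h>

/* Size and position of an image fitted into a screen. */
struct fit {
    int width, height, left, top;
};

/*
 * Fit an image with the given display aspect ratio into the screen,
 * preserving the aspect ratio, and center it.
 */
static struct fit fit_centered(int screen_w, int screen_h,
                               int aspect_w, int aspect_h)
{
    struct fit f;

    assert(screen_w > 0 && screen_h > 0 && aspect_w > 0 && aspect_h > 0);

    /* First try to use the full screen height. */
    f.height = screen_h;
    f.width = screen_h * aspect_w / aspect_h;

    /* If that is too wide, use the full screen width instead. */
    if (f.width > screen_w) {
        f.width = screen_w;
        f.height = screen_w * aspect_h / aspect_w;
    }

    f.left = (screen_w - f.width) / 2;
    f.top = (screen_h - f.height) / 2;
    return f;
}
```

For the 854x480 screen and a 4:3 image, this gives a 640x480 rectangle whose left edge is at pixel 107.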
(shown 1/2x)
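The arithmetic above can be sketched in C.  This is only an illustration: the struct hspan type and the center_at_full_height() helper are hypothetical names, not part of BLTsville.  Integer division is used throughout, matching the integer rectangles BLTsville works with.

```c
/* Illustrative helper type; not part of the BLTsville API. */
struct hspan {
    int left;            /* left edge on the screen, in pixels */
    unsigned int width;  /* scaled width, in pixels */
};

/*
 * Scale an image with display aspect ratio aspect_w:aspect_h to the full
 * screen height, then center it horizontally on the screen.
 * Assumes the scaled width is not wider than the screen.
 */
static struct hspan center_at_full_height(unsigned int screen_w,
                                          unsigned int screen_h,
                                          unsigned int aspect_w,
                                          unsigned int aspect_h)
{
    struct hspan s;

    s.width = aspect_w * screen_h / aspect_h;   /* 4 * 480 / 3 = 640 */
    s.left = (int)((screen_w - s.width) / 2);   /* (854 - 640) / 2 = 107 */
    return s;
}
```

For the 854x480 screen and the 4:3 foreground image, this gives a 640 pixel wide span starting at pixel 107, which leaves 107 columns of background on each side.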
The side two BLTs are quite easy with BLTsville, by using the clipping rectangle:

bvbltparams.flags = BVFLAG_ROP | BVFLAG_CLIP;
bvbltparams.op.rop = 0xCCCC;

bvbltparams.src1.desc = bkgnddesc;
bvbltparams.src1geom = bkgndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 64;
bvbltparams.src1rect.height = 36;

bvbltparams.dstdesc = screendesc;
bvbltparams.dstgeom = screengeom;
bvbltparams.dstrect.left = 0;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 854;
bvbltparams.dstrect.height = 480;

bvbltparams.cliprect.left = 0;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 107;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);

bvbltparams.cliprect.left = 107 + 640;
bv_blt(&bvbltparams);

However, if we try the same approach with the middle BLT, we run into problems:

bvbltparams.flags = BVFLAG_BLEND | BVFLAG_CLIP;
bvbltparams.op.blend = BVBLEND_SRC1OVER;

bvbltparams.src1.desc = foregnddesc;
bvbltparams.src1geom = foregndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 720;
bvbltparams.src1rect.height = 480;

bvbltparams.src2.desc = bkgnddesc;
bvbltparams.src2geom = bkgndgeom;
bvbltparams.src2rect.left = 0;
bvbltparams.src2rect.top = 0;
bvbltparams.src2rect.width = 64;
bvbltparams.src2rect.height = 36;

bvbltparams.cliprect.left = 107;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 640;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);
(shown 1/2x)
The result is that the foreground image is stretched horizontally.  That's because the scaling factor is derived from the source (1) rectangle and the destination rectangle, which is the full width of the screen.  Since we were also scaling the background, we set the destination rectangle to cover the screen, as we did in the previous two BLTs.

The edges of our foreground image are also cropped, since we were only modifying the middle of the screen.

What if we change the destination rectangle?

bvbltparams.dstrect.left = 107;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 640;
bvbltparams.dstrect.height = 480;

bv_blt(&bvbltparams);
(shown 1/2x)
Here we get the proper scaling of the foreground image, but the background image is scaled improperly.

What if we adjust the source rectangles?  For our purposes, we want all of the foreground image, but we only need the middle of the background image.  So we can manually specify the middle of the background image by modifying the source 2 rectangle:

bvbltparams.src2rect.left = 107 * 64 / 854;
bvbltparams.src2rect.width = 640 * 64 / 854;

Nice, but what are those values?

107 * 64 / 854 = 8.0187...
640 * 64 / 854 = 47.9625...

In BLTsville, all rectangle parameters are expressed in integers (this also allows BLTsville to be used in the kernels where floating point variables are not allowed).  The clipping rectangle then handles introducing the necessary source pixel subdivision (by translating the clipping rectangle back to the source rectangle in the implementation).  So what happens if we actually do use these values as integers?

bvbltparams.src2rect.left = 8;
bvbltparams.src2rect.top = 0;
bvbltparams.src2rect.width = 47;
bvbltparams.src2rect.height = 36;

bv_blt(&bvbltparams);

And this is what we get:
(shown 1/2x)
Closer, but not quite.  Rounding the values above to integers still results in visible errors at the boundaries between the middle and the side BLTs (the one on the right is a bit more visible at this reduced size, but if you view the full image, you'll see the left one as well), because the left edge and scaling (and right edge as a result) don't match the alignment and scaling done for the BLTs on the side. 
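The mismatch can be seen numerically.  The following is a rough sketch, not BLTsville code; the helper names are made up for illustration.  It compares the scale factor used by the side BLTs (64 source pixels stretched to 854) with the one implied by the truncated source 2 rectangle (47 source pixels stretched to 640).

```c
/* Horizontal scale factor: destination pixels per source pixel. */
static double hscale(unsigned int dst_width, unsigned int src_width)
{
    return (double)dst_width / (double)src_width;
}

/* Destination x coordinate at which source column src_x lands. */
static double src_to_dst_x(unsigned int src_x,
                           unsigned int dst_width,
                           unsigned int src_width)
{
    return src_x * hscale(dst_width, src_width);
}

/* Integer back-projection of a destination x coordinate (truncates). */
static unsigned int dst_to_src_trunc(unsigned int dst_x,
                                     unsigned int src_width,
                                     unsigned int dst_width)
{
    return dst_x * src_width / dst_width;
}
```

With these helpers, the side BLTs scale by 854 / 64 = 13.34375, while the middle BLT scales by 640 / 47, roughly 13.617.  Source column 8 also lands at destination x = 106.75 under the full-screen mapping rather than at 107, so neither the alignment nor the scaling of the middle section matches its neighbors.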

NOTE:  This artifact is not always obvious in still images.  The images here were chosen to make the artifacts obvious in this documentation.  But even if the static images appear correct, movement of the images (e.g. moving the foreground image across the background image) or changes in the blending (e.g. fading the foreground image out and finally removing it), will show these less obvious discrepancies.

This is actually what the clipping rectangle is for.  It's meant to allow us to always specify the source and destination rectangles the same, but move the clipping window around on the destination to get just the pixels we want.  That way the scaling and alignment are always the same.  Unfortunately, for this special case, we really need a way to specify different scaling factors for the different inputs.  The src2auxdstrect (and maskauxdstrect, when needed) have been added to provide this capability.

Here is how this set of BLTs can be done:

bvbltparams.flags = BVFLAG_ROP | BVFLAG_CLIP;
bvbltparams.op.rop = 0xCCCC;

bvbltparams.src1.desc = bkgnddesc;
bvbltparams.src1geom = bkgndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 64;
bvbltparams.src1rect.height = 36;

bvbltparams.dstdesc = screendesc;
bvbltparams.dstgeom = screengeom;
bvbltparams.dstrect.left = 0;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 854;
bvbltparams.dstrect.height = 480;

bvbltparams.cliprect.left = 0;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 107;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);

bvbltparams.cliprect.left = 107 + 640;
bv_blt(&bvbltparams);

bvbltparams.flags = BVFLAG_BLEND | BVFLAG_CLIP | BVFLAG_SRC2_AUXDSTRECT;
bvbltparams.op.blend = BVBLEND_SRC1OVER;

bvbltparams.src1.desc = foregnddesc;
bvbltparams.src1geom = foregndgeom;
bvbltparams.src1rect.left = 0;
bvbltparams.src1rect.top = 0;
bvbltparams.src1rect.width = 720;
bvbltparams.src1rect.height = 480;

bvbltparams.dstrect.left = 107;
bvbltparams.dstrect.top = 0;
bvbltparams.dstrect.width = 640;
bvbltparams.dstrect.height = 480;

bvbltparams.src2.desc = bkgnddesc;
bvbltparams.src2geom = bkgndgeom;
bvbltparams.src2rect.left = 0;
bvbltparams.src2rect.top = 0;
bvbltparams.src2rect.width = 64;
bvbltparams.src2rect.height = 36;

bvbltparams.src2auxdstrect.left = 0;
bvbltparams.src2auxdstrect.top = 0;
bvbltparams.src2auxdstrect.width = 854;
bvbltparams.src2auxdstrect.height = 480;

bvbltparams.cliprect.left = 107;
bvbltparams.cliprect.top = 0;
bvbltparams.cliprect.width = 640;
bvbltparams.cliprect.height = 480;
bv_blt(&bvbltparams);

Using this approach, we get the desired output:
(shown 1/2x)
It may also be clear that in that last BLT, the clip rectangle isn't really necessary.  This is good, because it frees up the clipping rectangle to be used to further subdivide the image if necessary (e.g. if partially occluded).


bvrect

struct bvrect {
    int left;
    int top;
    unsigned int width;
    unsigned int height;
};

bvrect.left

int left;

This member indicates the left edge of the rectangle, measured in pixels from the left edge of the surface.  Note that this value can be negative, indicating that the rectangle begins before the left edge of the surface.  However, this is only allowed when a rectangle is clipped to the surface.  If, after clipping, the left edge of the rectangle is still negative, this is an error.

bvrect.top

int top;

This member indicates the top edge of the rectangle, measured in lines of bvsurfgeom.virtstride bytes from the top edge of the surface.  Note that this value can be negative, indicating that the rectangle begins before the top edge of the surface.  However, this is only allowed when a rectangle is clipped to the surface.  If, after clipping, the top edge of the rectangle is still negative, this is an error.

bvrect.width

unsigned int width;

This member indicates the width of the rectangle, measured in pixels.  Note that this value cannot be negative.  (Horizontal flipping is indicated using the BVFLAG_HORZ_FLIP_* flags.)  The value of this member may exceed the width of the associated surface.  However, this is only allowed when a rectangle is clipped to the surface.  If, after clipping, the right edge of the rectangle still exceeds the width of the surface, this is an error.

bvrect.height

unsigned int height;

This member indicates the height of the rectangle, measured in lines of bvsurfgeom.virtstride bytes.  Note that this value cannot be negative.  (Vertical flipping is indicated using the BVFLAG_VERT_FLIP_* flags.)  The value of this member may exceed the height of the associated surface.  However, this is only allowed when a rectangle is clipped to the surface.  If, after clipping, the bottom edge of the rectangle still exceeds the height of the surface, this is an error.
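The four rules above can be combined into a single check.  The sketch below is an assumption about how a client or implementation might validate a rectangle once clipping has been applied; rect_fits_surface() is a hypothetical helper, and struct bvrect is copied from the definition above only so the example compiles on its own (real code would include the BLTsville headers instead).

```c
/* Copied from the definition above so the sketch stands alone. */
struct bvrect {
    int left;
    int top;
    unsigned int width;
    unsigned int height;
};

/*
 * Returns 1 if the (already clipped) rectangle lies entirely inside a
 * surface of surf_width x surf_height pixels, 0 otherwise.
 * The subtractions are ordered to avoid unsigned overflow.
 */
static int rect_fits_surface(const struct bvrect *r,
                             unsigned int surf_width,
                             unsigned int surf_height)
{
    if (r->left < 0 || r->top < 0)
        return 0;   /* still begins before the left or top edge */
    if ((unsigned int)r->left > surf_width ||
        r->width > surf_width - (unsigned int)r->left)
        return 0;   /* right edge exceeds the surface width */
    if ((unsigned int)r->top > surf_height ||
        r->height > surf_height - (unsigned int)r->top)
        return 0;   /* bottom edge exceeds the surface height */
    return 1;
}
```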


bvcopparams

bvcopparams is used to define the cache operation to be performed by bv_cache().

struct bvcopparams {
        unsigned int structsize;
        struct bvbuffdesc *desc;
        struct bvsurfgeom *geom;
        struct bvrect     *rect;
        enum bvcacheop  cacheop;
};

bvcopparams.structsize

unsigned int structsize; /* input */

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvcopparams.desc

struct bvbuffdesc *desc;

This member points to the bvbuffdesc of the surface for which the cache is being manipulated.  This buffer should have been mapped with a call to bv_map().

NOTE:  Implementations may choose to dynamically map the surface as with bv_blt(), however in many systems, this will not function properly due to dynamic paging which can occur when a surface is not locked.

bvcopparams.geom

struct bvsurfgeom *geom;

This member points to the bvsurfgeom of the surface for which the cache is being manipulated.

bvcopparams.rect

struct bvrect *rect;

This member points to the bvrect describing the rectangle of the surface which is being manipulated.

bvcopparams.cacheop

enum bvcacheop cacheop;

This member specifies the cache operation to be performed.  It is an enumeration from the following list:

BVCACHE_BIDIRECTIONAL (This usually performs a cache flush operation.)
BVCACHE_CPU_TO_DEVICE Performs the appropriate cache operation to ensure data can be transferred correctly when it was written with the CPU, but will be read by the 2-D device.  (This is usually a cache clean operation.)
BVCACHE_CPU_FROM_DEVICE Performs the appropriate cache operation to ensure data can be transferred correctly when it was written by the 2-D device, but will be read by the CPU.  (This is usually a cache invalidate operation.)


bvbuffdesc

This structure is used in conjunction with a bvsurfgeom structure to specify the characteristics of a graphic surface.  This structure specifies the memory buffer itself.

struct bvbuffdesc {
        unsigned int structsize;
        void *virtaddr;
        unsigned long length;
        struct bvbuffmap *map;
        enum bvauxtype auxtype;
        void *auxptr;
};

bvbuffdesc.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvbuffdesc.virtaddr

void *virtaddr;

This member is used to indicate the CPU virtual address of the start of the buffer.  This value must be provided unless the auxtype/auxptr members below are used.  In that case, this member is optional, and the auxptr usually has higher priority than this member.

Implementations Only

Note that this is always the beginning of the buffer.  This means that if the bvsurfgeom.virtstride is negative, or the bvsurfgeom.orientation does not normalize to 0º  (i.e. orientation % 360 != 0), implementations may need to use a modified version of virtaddr internally to operate correctly.

bvbuffdesc.length

unsigned long length;

This member specifies the length of the buffer in bytes.

NOTE:  When used with a bvsurfgeom structure, length should be greater than or equal to bvsurfgeom.height * bvsurfgeom.virtstride.
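As a small illustration of this rule, the hypothetical helper below computes the smallest acceptable length for a given geometry.  Taking the magnitude of virtstride is an assumption made here so that a negative stride (which bvsurfgeom allows) still yields a sensible byte count; the document itself only states the height * virtstride relationship.

```c
#include <stdlib.h>   /* labs */

/*
 * Minimum bvbuffdesc.length, in bytes, for a surface with the given
 * height (in lines) and virtstride (in bytes, possibly negative).
 */
static unsigned long min_buffer_length(unsigned int height, long virtstride)
{
    return (unsigned long)height * (unsigned long)labs(virtstride);
}
```

For a 320 x 240 surface at 32 bpp with a virtstride of 1280 bytes, this gives 307200 bytes.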

bvbuffdesc.map

struct bvbuffmap *map;

This member is used by the implementations and should NEVER be manipulated by the client.  When the bvbuffdesc structure is created, this member should be set to 0, indicating that no implementations have mapped the buffer.  After a buffer has been mapped using a call to bv_map(), this member should be left as-is by clients.  (The implementation will set this back to 0 before returning from bv_unmap().)

Implementations Only

This member points to a linked list of bvbuffmap structures associated with the buffer.  Each bvbuffmap is added to the list as the buffer is mapped by a given implementation.  This may be done with an explicit call to bv_map(), or implicitly with a call to bv_blt(), after a call to bv_map() from a different implementation.

Implementations should not assume that the first entry in the list is their bvbuffmap.  Instead, implementations should compare the bv_unmap() pointer in the structure to their own function address.

bvbuffdesc.auxtype

enum bvauxtype auxtype;

This member is used to identify the type of additional information about the buffer provided by auxptr.  Currently no values are defined for the user mode interface, so it should be initialized to 0 or BVAT_NONE.  See the Kernel Mode Interface for details on the values defined for the kernel mode interface.

bvbuffdesc.auxptr

void *auxptr;

This member is used to point to additional information about the buffer.  The type of this pointer is determined by the auxtype value.  Currently there are no types defined for the user mode interface, so this member is ignored.  See the Kernel Mode Interface for details on the types defined for the kernel mode interface.



Implementations Only

bvbuffmap

This structure is used from the bvbuffdesc.map member to allow implementations to associate their own data with a buffer.

struct bvbuffmap {
        unsigned int structsize;
        BVFN_UNMAP bv_unmap;
        unsigned long handle;
        struct bvbuffmap *nextmap;
};

bvbuffmap.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvbuffmap.bv_unmap

BVFN_UNMAP bv_unmap;

This member holds the pointer to the bv_unmap() function of the implementation associated with the bvbuffmap structure.  It serves to allow implementations to identify their bvbuffmap structure in the linked list, as well as to allow implementations to call each other's bv_unmap() calls from their own.

bvbuffmap.handle

unsigned long handle;

This member is used to hold an implementation-specific piece of data.

bvbuffmap.nextmap

struct bvbuffmap *nextmap;

This member holds a pointer to the next bvbuffmap structure in the linked list.  If this member is 0, there are no more entries in the list.
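The lookup an implementation performs on this list might look like the sketch below.  It is illustrative only: find_own_map() and the two stub unmap functions are hypothetical, and the BVFN_UNMAP typedef shown is a stand-in so the example compiles without the BLTsville headers (the real typedef comes from those headers and may differ).

```c
#include <stddef.h>

struct bvbuffdesc;   /* opaque for the purposes of this sketch */

/* Stand-in signature; the real typedef is in the BLTsville headers. */
typedef int (*BVFN_UNMAP)(struct bvbuffdesc *buffdesc);

/* Copied from the definition above so the sketch stands alone. */
struct bvbuffmap {
    unsigned int structsize;
    BVFN_UNMAP bv_unmap;
    unsigned long handle;
    struct bvbuffmap *nextmap;
};

/* Stubs standing in for the bv_unmap() of two implementations. */
static int impl_a_unmap(struct bvbuffdesc *buffdesc) { (void)buffdesc; return 0; }
static int impl_b_unmap(struct bvbuffdesc *buffdesc) { (void)buffdesc; return 0; }

/*
 * Walk the list and return the entry whose bv_unmap pointer matches the
 * caller's own function, or NULL if this implementation has not mapped
 * the buffer.  The first entry is never assumed to be the caller's.
 */
static struct bvbuffmap *find_own_map(struct bvbuffmap *head,
                                      BVFN_UNMAP my_unmap)
{
    struct bvbuffmap *m;

    for (m = head; m != NULL; m = m->nextmap)
        if (m->bv_unmap == my_unmap)
            return m;
    return NULL;
}
```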

NOTE:  The Linux/Android Kernel Mode Interface differs slightly from this structure.  Refer to the Kernel Mode Interface section for details.



bvsurfgeom

This structure is used in conjunction with a bvbuffdesc structure to specify the characteristics of a graphic surface.  This structure specifies the surface geometric characteristics.

NOTE:  This structure was separated from bvbuffdesc to afford much flexibility to the client.  Using the same bvbuffdesc structure with different bvsurfgeom structures or using the same bvsurfgeom structure with different bvbuffdesc structures may be of benefit.  See the examples at the bottom of this section.

struct bvsurfgeom {
        unsigned int structsize;
        enum ocdformat format;
        unsigned int width;
        unsigned int height;
        int orientation;
        long virtstride;
        enum ocdformat paletteformat;
        void *palette;
};

bvsurfgeom.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvsurfgeom.format

enum ocdformat format;

This member specifies the format of the surface using the Open Color format Definitions (OCD).

bvsurfgeom.width

unsigned int width;

This member specifies the width of the surface in pixels.  This size does not have to be equivalent to the virtstride size.

Implementations Only

Implementations should never assume that width is equivalent to virtstride.

bvsurfgeom.height

unsigned int height;

This member specifies the height of the surface in lines of virtstride width.

bvsurfgeom.orientation

int orientation;

This member specifies the orientation or angle of the surface in degrees.  Since BLTsville is designed only to specify orthogonal rectangles, this value must be a multiple of 90º.  This value may be negative.  (Extending BLTsville to handle non-orthogonal rectangles may be considered if there is sufficient interest.)

Implementations Only

Implementations should normalize orientation angles.  For example, a client that sets the orientation to -450º should behave as if the value of 270º were specified.
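One way to perform this normalization is sketched below.  The helper name normalize_orientation() is hypothetical.  Because the C remainder operator keeps the sign of the dividend, negative results are shifted back into the 0 to 359 range.

```c
/*
 * Reduce an orientation in degrees to the equivalent angle in
 * [0, 360).  Checking that the result is a multiple of 90 is left
 * to the caller.
 */
static int normalize_orientation(int degrees)
{
    int d = degrees % 360;   /* may be negative for negative input */

    if (d < 0)
        d += 360;
    return d;
}
```

With this helper, an orientation of -450 normalizes to 270, matching the example above.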

bvsurfgeom.virtstride

long virtstride;

This member specifies the horizontal stride of the surface in bytes for an unrotated surface.  The stride represents the number of bytes needed to move from one pixel to the pixel immediately below it.  This value may be negative.

NOTE:  This means the orientation does not affect the virtstride.  However, rotating a surface usually results in a different configuration (i.e. width), which will affect the virtstride.  For example, a 320 x 240 x 32 bpp 0º surface might have a virtstride of 1280 bytes (320 pixels/line * 32 bits/pixel / 8 bits/byte).  When the orientation is set to 180º, the virtstride would be the same.  But when the orientation is set to 90º (or 270º), the virtstride would most likely need to be set to 960 bytes (240 pixels/line * 32 bits/pixel / 8 bits/byte).
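The stride arithmetic in this note can be written as a short helper.  This is a sketch that assumes tightly packed lines with no padding between them; real surfaces may use a larger virtstride, and packed_virtstride() is a hypothetical name.

```c
/*
 * Smallest virtstride, in bytes, for a line of width_pixels pixels at
 * bits_per_pixel bits each, rounded up to a whole number of bytes.
 */
static long packed_virtstride(unsigned int width_pixels,
                              unsigned int bits_per_pixel)
{
    return (long)(((unsigned long)width_pixels * bits_per_pixel + 7) / 8);
}
```

For the 32 bpp surface above, a 320 pixel wide line needs 1280 bytes and a 240 pixel wide line needs 960 bytes.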

Implementations Only

Implementations that do not support a negative virtstride must compensate using whatever mechanism is appropriate for the implementation.  For example, using a vertical flipping/mirroring setting.

NOTE:  The virtstride name must be maintained for backwards compatibility.  However, no situation should arise where the client would need to provide two different strides for the virtual and physical views of a surface (there are situations where a physical stride will need to be available within the implementation, but the client will not be the one to supply it), so physstride will most likely never be needed.  However, when a client provides a physical description of the buffer (see the Kernel Mode Interface section below), the virtstride entry should be used to provide the physical stride.

bvsurfgeom.paletteformat

enum ocdformat paletteformat;

This member specifies the format of the palette supplied via the palette member for palettized formats using the Open Color format Definitions (OCD).

bvsurfgeom.palette

void *palette;

This member points to a palette used for palettized formats.  The format of the palette is specified by the paletteformat member.  Palettes are packed based on their container size:

Palette Format:  OCDFMT_xRGB12
Palette Layout (byte address):  n/a
Palette Layout (little endian):
  *(((unsigned short *)palette) + 0)              0xFrgb
  *(((unsigned short *)palette) + 1)              0xFrgb
  ...                                             ...
  *(((unsigned short *)palette) + n - 1)          0xFrgb

Palette Format:  OCDFMT_RGB24
Palette Layout (byte address):
  *(((unsigned char *)palette) + 0)               red0
  *(((unsigned char *)palette) + 1)               green0
  *(((unsigned char *)palette) + 2)               blue0
  *(((unsigned char *)palette) + 3)               red1
  *(((unsigned char *)palette) + 4)               green1
  *(((unsigned char *)palette) + 5)               blue1
  ...
  *(((unsigned char *)palette) + (3 * n) - 3)     redNm1
  *(((unsigned char *)palette) + (3 * n) - 2)     greenNm1
  *(((unsigned char *)palette) + (3 * n) - 1)     blueNm1
Palette Layout (little endian):  n/a

Palette Format:  OCDFMT_RGBx24
Palette Layout (byte address):
  *(((unsigned char *)palette) + 0)               red0
  *(((unsigned char *)palette) + 1)               green0
  *(((unsigned char *)palette) + 2)               blue0
  *(((unsigned char *)palette) + 3)               0xFF
  *(((unsigned char *)palette) + 4)               red1
  *(((unsigned char *)palette) + 5)               green1
  *(((unsigned char *)palette) + 6)               blue1
  *(((unsigned char *)palette) + 7)               0xFF
  ...
  *(((unsigned char *)palette) + (4 * n) - 4)     redNm1
  *(((unsigned char *)palette) + (4 * n) - 3)     greenNm1
  *(((unsigned char *)palette) + (4 * n) - 2)     blueNm1
  *(((unsigned char *)palette) + (4 * n) - 1)     0xFF
Palette Layout (little endian):
  *(((unsigned long *)palette) + 0)               0xFFbbggrr
  *(((unsigned long *)palette) + 1)               0xFFbbggrr
  ...
  *(((unsigned long *)palette) + n - 1)           0xFFbbggrr

NOTE:  Use of subsampled formats for paletteformat is currently undefined.
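As an illustration of the little endian layouts listed for the palette formats, the sketch below packs single palette entries.  The function names are hypothetical.  Fixed-width types from stdint.h are used instead of unsigned short and unsigned long, on the assumption that the table intends 16-bit and 32-bit containers respectively.

```c
#include <stdint.h>

/* One OCDFMT_RGBx24 entry viewed as a little endian 32-bit word: 0xFFbbggrr. */
static uint32_t pack_rgbx24_entry(uint8_t r, uint8_t g, uint8_t b)
{
    return 0xFF000000u | ((uint32_t)b << 16) | ((uint32_t)g << 8) | r;
}

/*
 * One OCDFMT_xRGB12 entry viewed as a 16-bit word: 0xFrgb.
 * Only the low four bits of each component are used.
 */
static uint16_t pack_xrgb12_entry(uint8_t r4, uint8_t g4, uint8_t b4)
{
    return (uint16_t)(0xF000u | ((r4 & 0xFu) << 8) |
                      ((g4 & 0xFu) << 4) | (b4 & 0xFu));
}
```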

Examples

Mixing and matching bvbuffdesc and bvsurfgeom structures provides maximum flexibility for a client.

Example:  Using two different bvsurfgeom structures with the same bvbuffdesc structure allows in-place format conversion:

...
// Convert premultiplied image to non-premultiplied in place
struct bvbltparams parms;
...
struct bvbuffdesc buff;
...
struct bvsurfgeom srcgeom, dstgeom;
...
srcgeom.format = OCDFMT_RGBA24;
dstgeom.format = OCDFMT_nRGBA24;
...
parms.src1.desc = &buff;
parms.src1geom = &srcgeom;
parms.dstdesc = &buff;
parms.dstgeom = &dstgeom;
...
bv_blt(&parms);
...


Example:  Using three different bvbuffdesc structures with the same bvsurfgeom structure reduces code and copy errors:

...
// Blend two images of the same size
struct bvbltparams parms;
...
struct bvbuffdesc src1buff, src2buff, dstbuff;
...
struct bvsurfgeom geom;
...
parms.src1.desc = &src1buff;
parms.src1geom = &geom;
parms.src2.desc = &src2buff;
parms.src2geom = &geom;
parms.dstdesc = &dstbuff;
parms.dstgeom = &geom;
...
bv_blt(&parms);
...



bvtileparams

This structure is used to define the parameters necessary to use a small image as a tile or block that will be repeated when used as a source.  This structure is used in conjunction with the associated bvsurfgeom and the associated bvrect to determine the operation that is performed.

struct bvtileparams {
        unsigned int structsize;
        unsigned long flags;
        void *virtaddr;
        int dstleft;
        int dsttop;
        unsigned int srcwidth;
        unsigned int srcheight;
};

bvtileparams.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvtileparams.flags

unsigned long flags;

This member specifies some additional information for the tiling operation.  It can be composed as the binary OR of one selection for each edge (left, top, right, and bottom) from the following flags:

BVTILE_LEFT_REPEAT indicates that the tile is repeated to the left of the destination alignment location.
BVTILE_TOP_REPEAT indicates that the tile is repeated above the destination alignment location.
BVTILE_RIGHT_REPEAT indicates that the tile is repeated to the right of the destination alignment location.
BVTILE_BOTTOM_REPEAT indicates that the tile is repeated below the destination alignment location.
BVTILE_LEFT_MIRROR indicates that the tile is mirrored to the left of the destination alignment location.
BVTILE_TOP_MIRROR indicates that the tile is mirrored above the destination alignment location.
BVTILE_RIGHT_MIRROR indicates that the tile is mirrored to the right of the destination alignment location.
BVTILE_BOTTOM_MIRROR indicates that the tile is mirrored below the destination alignment location.
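To make the difference between repeating and mirroring concrete, the sketch below maps a destination column back to a tile column for an unscaled tile.  It is an interpretation, not taken from the BLTsville sources: it assumes that "mirror" means every other copy of the tile is reflected, so adjacent copies meet at matching edges.  Both function names are hypothetical, and tile_width must be greater than 0.

```c
/* Tile column used at dst_x when the tile simply repeats. */
static unsigned int tile_repeat_column(int dst_x, int dstleft,
                                       unsigned int tile_width)
{
    long w = (long)tile_width;
    long m = ((long)dst_x - dstleft) % w;

    if (m < 0)
        m += w;   /* positions left of the alignment location */
    return (unsigned int)m;
}

/* Tile column used at dst_x when alternate copies are reflected (assumed). */
static unsigned int tile_mirror_column(int dst_x, int dstleft,
                                       unsigned int tile_width)
{
    long w = (long)tile_width;
    long period = 2 * w;   /* one normal copy plus one reflected copy */
    long m = ((long)dst_x - dstleft) % period;

    if (m < 0)
        m += period;
    return (unsigned int)(m < w ? m : period - 1 - m);
}
```

The same mapping applies vertically using dsttop and the tile height.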

bvtileparams.virtaddr

void *virtaddr;

This member is used to indicate the CPU virtual address of the start of the buffer.

Implementations Only

Note that this is always the beginning of the buffer.  This means that if the bvsurfgeom.virtstride is negative, or the bvsurfgeom.orientation does not normalize to 0º  (i.e. orientation % 360 != 0), implementations may need to use a modified version of virtaddr internally to operate correctly.

bvtileparams.dstleft

int dstleft;

This member is used to designate the left edge of the location of the tile in the destination for alignment purposes (alignment location).  Note that the bvrect of the destination specifies the region which is filled by the tile.

bvtileparams.dsttop

int dsttop;

This member is used to designate the top edge of the location of the tile in the destination for alignment purposes (alignment location).  Note that the bvrect of the destination specifies the region which is filled by the tile.

bvtileparams.srcwidth

unsigned int srcwidth;

This member is used to designate the width of the source for purposes of scaling.  The relationship between this field and the bvrect.width of the associated source surface determines the horizontal scaling factor.

bvtileparams.srcheight

unsigned int srcheight;

This member is used to designate the height of the source for purposes of scaling.  The relationship between this field and the bvrect.height of the associated source surface determines the vertical scaling factor.


bvcallbackerror

This structure is used to provide error information to the client of a BLT that failed within an asynchronous operation.  The errors will be limited to those that occur within the implementation.

NOTE:  Parameter errors should never be returned in this structure.  These should have been returned to the client before the BLT was ever initiated.

struct bvcallbackerror {
        unsigned int structsize;
        enum bverror error;
        char *errdesc;
};

bvcallbackerror.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvcallbackerror.error

enum bverror error;

This member is used to indicate the error encountered.  In general, these will be errors like these:

BVERR_OP_FAILED The operation failed for unspecified reasons.  The destination buffer was not modified.
BVERR_OP_INCOMPLETE The operation only partially completed.  The destination buffer is in an undefined state.
BVERR_MEMORY_ERROR The operation resulted in a memory error, most likely due to an attempt to access invalid memory.  The destination buffer is in an undefined state.

bvcallbackerror.errdesc

char *errdesc;

errdesc is optionally used by implementations to pass a 0-terminated string with additional debugging information back to clients for debugging purposes.  errdesc is not localized or otherwise meant to provide information that is displayed to users.


Batching

Batching is the single most powerful feature in BLTsville.  It is used for two major purposes:

  1. To group similar BLTs which use most of the same parameters so that they can be handled more efficiently by the implementation.
  2. To group BLTs that should go together so that implementations can use special features that go beyond what seems to be expressed by the BLTsville API.

NOTE:  It is important to realize that BLTs batched together may be done in any order, and in fact may not even be done in the way specified.  This includes the BLTs being done as they are submitted, or no operations performed until the batch submission is completed with BVFLAG_BATCH_END.  This means the client must not rely on intermediate results within a batch.

NOTE:  Because BLTs can be performed in a variety of ways, callbacks for individual BLTs would have no consistent meaning.  So, when batching is mixed with BVFLAG_ASYNC, only the callback for the last BLT occurs.

NOTE:  Since implementations can perform batched BLTs in a variety of ways, even synchronous batched BLTs can be effectively asynchronous.  Therefore, only the last BLT determines the synchronicity of the entire batch.  i.e. the BVFLAG_ASYNC flag is only heeded when combined with BVFLAG_BATCH_END.

NOTE: Failure during the performance of a batch (different from an error on submission--indicated by the contents of the bvcallbackerror structure) will result in an unknown state for all destination buffers.  Do not assume that a given implementation's state in this case represents the state which will be encountered for a different implementation.

NOTE: Because of the indeterminate nature of the execution of a batch of BLTs, a "batch abort" would not result in a known state either.  As stated above, a given implementation may have already performed earlier BLTs in a batch as the batch is submitted.  So errors encountered during the submission of a batch must be handled by the client, and then the batch must be terminated normally using BVFLAG_BATCH_END.

Batches For Grouping Similar BLTs

Often, groups of similar BLTs are performed, with changes to only a few parameters.  Some implementations have the ability to re-use previous settings, coupled with these changes, to perform new BLTs.

One good example of this is in rendering text, similar to what you are reading now.  In most systems, a glyph cache is maintained to hold the characters of a given font, rasterized with the specific characteristics desired (e.g. bold, italics, etc.).  Each font in the glyph cache is normally created from a vector-based font using a font rasterization engine, such as FreeType.  This technology allows fonts to be described in terms of curves and lines instead of pixels, which means they can be created as needed, in any size desired.

  !"#$%&'()*+,-./0123456789:;<=>?
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
`abcdefghijklmnopqrstuvwxyz{|}~

Then, when a character needs to be rendered, it is copied from the pre-rendered glyph cache.  This is much more efficient than performing the font rasterization from the vector description each time a character is used.

With some hardware implementations, the setup to trigger the copy of these characters from the glyph cache to the target surface can be quite significant, when compared to the number of pixels actually affected.  For example, each character might consist of something on the order of  10 x 14, or about 140 pixels.  Programming a typical hardware BLTer may require tens of commands for each character.

But note that each of these BLTs differs by only a few parameters.  Specifically, once the source and destination surfaces have been specified, and the operation described, only the source and destination rectangles change between BLTs. To alleviate much of this overhead, most implementations will allow the configuration of a previous BLT to be used again, with only those parameters which change provided for the subsequent BLTs.

BLTsville provides access to this capability via the batch mechanism.

For rendering a word using a monospaced font like this, the client might construct the batch like this:

struct bvbuffdesc screendesc = {sizeof(struct bvbuffdesc), 0};
struct bvsurfgeom screengeom = {sizeof(struct bvsurfgeom), 0};
struct bvbuffdesc glyphcachedesc = {sizeof(struct bvbuffdesc), 0};
struct bvsurfgeom glyphcachegeom = {sizeof(struct bvsurfgeom), 0};
struct bvtileparams solidcolortileparams = {sizeof(struct bvtileparams), 0};
struct bvsurfgeom solidcolorgeom = {sizeof(struct bvsurfgeom), 0};

struct bvbltparams bltparams = {sizeof(struct bvbltparams), 0};

int charsperline = 32;
int fontwidth = 10;
int fontheight = 14;
int i = 0;

screendesc.virtaddr = screenaddr;
screendesc.length = screenstride * screenheight;
screengeom.format = OCDFMT_RGB24;
screengeom.width = screenwidth;
screengeom.height = screenheight;
screengeom.virtstride = screenstride;

glyphcachedesc.virtaddr = glyphcacheaddr;
glyphcachedesc.length = glyphcachestride * glyphcacheheight;
glyphcachegeom.format = OCDFMT_ALPHA8;
glyphcachegeom.width = glyphcachewidth;
glyphcachegeom.height = glyphcacheheight;
glyphcachegeom.virtstride = glyphcachestride;

solidcolortileparams.virtaddr = &solidcolor;
solidcolortileparams.srcwidth = 1;
solidcolortileparams.srcheight = 1;
solidcolorgeom.format = OCDFMT_RGB24;

bltparams.flags = BVFLAG_BLEND | BVFLAG_SRC1_TILED | BVFLAG_BATCH_BEGIN;
bltparams.op.blend = BVBLEND_SRC1OVER + BVBLENDDEF_REMOTE;
bltparams.dstdesc = &screendesc;
bltparams.dstgeom = &screengeom;
bltparams.src1.tileparams = &solidcolortileparams;
bltparams.src1geom = &solidcolorgeom;
bltparams.src2.desc = &screendesc;
bltparams.src2geom = &screengeom;
bltparams.mask.desc = &glyphcachedesc;
bltparams.maskgeom = &glyphcachegeom;

bltparams.dstrect.left = bltparams.src2rect.left = screenrect.left;
bltparams.dstrect.top = bltparams.src2rect.top = screenrect.top;

bltparams.maskrect.width = bltparams.dstrect.width = bltparams.src2rect.width = fontwidth;
bltparams.maskrect.height = bltparams.dstrect.height = bltparams.src2rect.height = fontheight;

bltparams.maskrect.left = ((text[i] - ' ') % charsperline) * fontwidth;
bltparams.maskrect.top = ((text[i] - ' ') / charsperline) * fontheight;

bv_blt(&bltparams);

i++;
if(i < textlen)
{
  bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_CONTINUE;
  bltparams.batchflags = BVBATCH_DSTRECT_ORIGIN | BVBATCH_SRC2RECT_ORIGIN | BVBATCH_MASKRECT_ORIGIN;

  do
  {
    bltparams.dstrect.left += fontwidth;
    bltparams.src2rect.left = bltparams.dstrect.left;

    bltparams.maskrect.left = ((text[i] - ' ') % charsperline) * fontwidth;
    bltparams.maskrect.top = ((text[i] - ' ') / charsperline) * fontheight;

    bv_blt(&bltparams);

    i++;
  }while(i < textlen);
}

bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_END;
bltparams.batchflags = BVBATCH_ENDNOP;

bv_blt(&bltparams);

NOTE:  bvbltparams.batchflags is just a hint.  Not all implementations support deltas in batching, so clients must not change the values of members of bvbltparams (or structures it references) between BLTs unless the change is intended: any of these values may be used, whether or not the corresponding batch flag is set.
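
The glyph cache lookup in the example above can be isolated into a small helper.  The following is a minimal sketch, assuming (as the example does) that the cache holds fixed-size cells in row-major order, starting with the space character.  The names struct glyphcell and glyph_cell() are hypothetical and are not part of BLTsville.

```c
/* Hypothetical helper type; not part of the BLTsville API. */
struct glyphcell
{
  int left; /* x offset of the glyph cell within the cache, in pixels */
  int top;  /* y offset of the glyph cell within the cache, in pixels */
};

/*
 * Locate the cell for character c in a glyph cache laid out as a grid of
 * fontwidth x fontheight cells, charsperline cells per row, where the
 * first cell holds ' ' (an assumption taken from the example above).
 */
struct glyphcell glyph_cell(char c, int charsperline, int fontwidth, int fontheight)
{
  struct glyphcell cell;
  int index = c - ' ';

  cell.left = (index % charsperline) * fontwidth;
  cell.top = (index / charsperline) * fontheight;
  return cell;
}
```

With the values from the example (32 characters per line, 10 x 14 cells), 'A' is cell 33, so glyph_cell('A', 32, 10, 14) yields left = 10 and top = 14.  These are the values the example assigns to bltparams.maskrect.left and bltparams.maskrect.top.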

Batches For Special Feature BLTs

Enabling special features of some implementations is a special challenge.  But BLTsville is up to the task.

For example, perhaps an implementation is capable of blending four layers at the same time.  But BLTsville only allows blending to be specified using two layers at a time.  How can this be accomplished?

The most prevalent blending reference used is the Porter-Duff whitepaper, which specifies blending of two sources (A and B).  So any N-source blend (N > 2) would require the blends to be specified as a grouping of N - 1 two-source blends in order to utilize the Porter-Duff equations.  That's how such a blend is specified in BLTsville:

bltparams.dstrect.width = bltparams.src1rect.width = bltparams.src2rect.width = dstgeom.width;
bltparams.dstrect.height = bltparams.src1rect.height = bltparams.src2rect.height = dstgeom.height;

bltparams.flags = BVFLAG_BLEND | BVFLAG_BATCH_BEGIN;
bltparams.op.blend = BVBLEND_SRCOVER;
bltparams.dstdesc = &dstdesc;
bltparams.dstgeom = &dstgeom;
bltparams.src1.desc = &src1desc;
bltparams.src1geom = &src1geom;
bltparams.src2.desc = &src2desc;
bltparams.src2geom = &src2geom;

bv_blt(&bltparams);

bltparams.src1.desc = &src3desc;
bltparams.src1geom = &src3geom;
/* the second source is now the intermediate result in the destination */
bltparams.src2.desc = &dstdesc;
bltparams.src2geom = &dstgeom;

bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_CONTINUE;
bltparams.batchflags = BVBATCH_SRC1 | BVBATCH_SRC2;

bv_blt(&bltparams);

bltparams.src1.desc = &src4desc;
bltparams.src1geom = &src4geom;

bltparams.flags = (bltparams.flags & ~BVFLAG_BATCH_MASK) | BVFLAG_BATCH_END;
bltparams.batchflags = BVBATCH_SRC1;

bv_blt(&bltparams);

The driver for an implementation that can perform these operations as one BLT would be tasked with recognizing that the batch contained BLTs which can be combined.
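
The reduction of an N-source blend to N - 1 two-source blends can be illustrated in plain C, independent of BLTsville.  The following is a minimal sketch of the Porter-Duff "over" operator, assuming premultiplied RGBA components stored as floats.  The names struct pixel, blend_over(), and blend_layers() are hypothetical, and the sketch is not a description of how any BLTsville implementation performs BVBLEND_SRCOVER.

```c
/* Hypothetical pixel type: premultiplied RGBA, each component in 0.0 - 1.0. */
struct pixel
{
  float r, g, b, a;
};

/* Porter-Duff "A over B" for premultiplied colors: O = A + B * (1 - Aa). */
struct pixel blend_over(struct pixel a, struct pixel b)
{
  struct pixel o;
  float k = 1.0f - a.a;

  o.r = a.r + b.r * k;
  o.g = a.g + b.g * k;
  o.b = a.b + b.b * k;
  o.a = a.a + b.a * k;
  return o;
}

/*
 * Blend n layers (n >= 1, layers[0] is topmost) as n - 1 two-source blends.
 * The two bottom layers are blended first, then each remaining layer is
 * blended over the running result, mirroring the order of the batch above.
 */
struct pixel blend_layers(const struct pixel *layers, int n)
{
  struct pixel o = layers[n - 1];
  int i;

  for(i = n - 2; i >= 0; i--)
    o = blend_over(layers[i], o);
  return o;
}
```

The running result o plays the role of the destination buffer in the batch: each step reads the previous result and writes the new one in its place.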

The fantastic thing about this approach is that an implementation without the ability to blend N sources in one pass would perform the blends separately, but the result would be identical.  Moreover, implementations with the ability to combine different numbers of operations would likewise produce the same results, even if they used a different number of internal steps.  Here's an example:

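Number of layers to blend: 2
  BLTsville operations:
    A over B => O
  Implementation capable of blending one source with a destination (2 inputs):
    B => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    A over B => O
  Implementation capable of blending four sources to a destination (4 inputs):
    A over B => O
  Implementation capable of blending four sources with a destination (5 inputs):
    A over B => O

Number of layers to blend: 3
  BLTsville operations:
    B over C => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    C => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    B over C => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    A over B over C => O
  Implementation capable of blending four sources with a destination (5 inputs):
    A over B over C => O

Number of layers to blend: 4
  BLTsville operations:
    C over D => O; B over O => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    D => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    C over D => O; B over O => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    A over B over C over D => O
  Implementation capable of blending four sources with a destination (5 inputs):
    A over B over C over D => O

Number of layers to blend: 5
  BLTsville operations:
    D over E => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    E => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    D over E => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    D over E => O; A over B over C over O => O
  Implementation capable of blending four sources with a destination (5 inputs):
    E => O; A over B over C over D over O => O

Number of layers to blend: 6
  BLTsville operations:
    E over F => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    F => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    E over F => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    D over E over F => O; A over B over C over O => O
  Implementation capable of blending four sources with a destination (5 inputs):
    E over F => O; A over B over C over D over O => O

Number of layers to blend: 7
  BLTsville operations:
    F over G => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    G => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    F over G => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    D over E over F over G => O; A over B over C over O => O
  Implementation capable of blending four sources with a destination (5 inputs):
    E over F over G => O; A over B over C over D over O => O

Number of layers to blend: 8
  BLTsville operations:
    G over H => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    H => O; G over O => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    G over H => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    G over H => O; D over E over F over O => O; A over B over C over O => O
  Implementation capable of blending four sources with a destination (5 inputs):
    E over F over G over H => O; A over B over C over D over O => O

Number of layers to blend: 9
  BLTsville operations:
    H over I => O; G over O => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending one source with a destination (2 inputs):
    I => O; H over O => O; G over O => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending two sources to a destination (2 inputs):
    H over I => O; G over O => O; F over O => O; E over O => O; D over O => O; C over O => O; B over O => O; A over O => O
  Implementation capable of blending four sources to a destination (4 inputs):
    G over H over I => O; D over E over F over O => O; A over B over C over O => O
  Implementation capable of blending four sources with a destination (5 inputs):
    I => O; E over F over G over H over O => O; A over B over C over D over O => O
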
Comparison of batched BLTsville calls with internal operations, based on implementation capabilities.

NOTE: As mentioned above, a batch of BLTs may be serviced in any number of ways.  In this example, the destination buffer may be used for intermediate results, so it is important that this buffer not be used during the batch, e.g. as a displayed buffer.
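
The number of internal operations in the comparison above follows a simple pattern: the first pass consumes as many layers as the implementation can read at once, and each later pass consumes as many additional layers as it can combine with the intermediate result already in the destination.  The following is a minimal sketch of that count.  blend_passes() is a hypothetical helper, and the capacities are read off the comparison rather than taken from any real implementation.

```c
/*
 * Hypothetical helper: number of internal passes needed to blend `layers`
 * layers when the first pass can consume `first` layers and every later
 * pass can consume `later` additional layers on top of the intermediate
 * result held in the destination.  All arguments must be at least 1.
 */
int blend_passes(int layers, int first, int later)
{
  if(layers <= first)
    return 1;

  /* one initial pass, plus the remaining layers rounded up to whole passes */
  return 1 + (layers - first + later - 1) / later;
}
```

Reading the four implementation columns from left to right, the (first, later) capacities are (1, 1), (2, 1), (4, 3), and (4, 4).  For nine layers this gives 9, 8, 3, and 3 passes respectively, matching the last row of the comparison.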


Where to Start

(Note that error checking is omitted in all the examples below for clarity.)

1.  Clients begin by opening one or more BLTsville implementations dynamically.  The specific method of doing this is dependent on the operating system.  For example, on Linux this might look like the following:

struct bltsvillelib
{
  char* name;
  void* handle;
  BVFN_MAP bv_map;
  BVFN_BLT bv_blt;
  BVFN_UNMAP bv_unmap;
};

struct bltsvillelib bvlib[] =
{
  { "libbltsville_cpu.so", 0 },
  { "libbltsville_2d.so", 0 }
};
const int NUMLIBS = sizeof(bvlib) / sizeof(struct bltsvillelib);

for(int i = 0; i < NUMLIBS; i++)
{
  bvlib[i].handle = dlopen(bvlib[i].name, RTLD_LOCAL | RTLD_LAZY);
  bvlib[i].bv_map = (BVFN_MAP)dlsym(bvlib[i].handle, "bv_map");
  bvlib[i].bv_blt = (BVFN_BLT)dlsym(bvlib[i].handle, "bv_blt");
  bvlib[i].bv_unmap = (BVFN_UNMAP)dlsym(bvlib[i].handle, "bv_unmap");
}

2.  Clients then need to create a bvbuffdesc object for each buffer to be accessed in BLTsville:

struct bvbuffdesc buff =
  {sizeof(struct bvbuffdesc), 0};

buff.virtaddr = buffptr;
buff.length = bufflength;

 or 

struct bvbuffdesc buff;

memset(&buff, 0, sizeof(buff));
buff.structsize = sizeof(buff);
buff.virtaddr = buffptr;
buff.length = bufflength;

Note that the client must ensure that the map element and any additional members in bvbuffdesc are initialized to 0.

3.  Next the buffer can be mapped to give the hardware implementations a chance to associate any necessary resources with the buffer:

/* do nothing */

 or 

bvlib[0].bv_map(&buff);

 or 

for(int i = 0; i < NUMLIBS; i++)
{
  if(bvlib[i].bv_map)
    bvlib[i].bv_map(&buff);
}


a. This step is actually optional, as indicated above.  However, if the client does not explicitly call bv_map(), the mapping must be done by the implementation to associate the necessary resources with the buffer.  So this mapping must be done later, when bv_blt() is called.  Additionally, since the client did not call bv_map(), it is unlikely that the client will call bv_unmap() to allow the implementation to free the resources associated with the buffer.  So the implementation will internally unmap the resources after completing the BLT.  This means that the mapping and unmapping overhead will be encountered on every call to bv_blt().

In general, the CPU implementations have (almost) no overhead associated with mapping and unmapping.  So opting not to make the bv_map() call for CPU implementations is likely to make a negligible difference in bv_blt() performance.
b. Calling bv_map() once for each buffer is enough to tell the implementations that the client can be trusted to call bv_unmap() when work with the buffer is complete, as indicated above.  It does not matter which implementation's bv_map() is called.  However, that implementation is the only one which will perform the mapping immediately.  All other implementations will perform a lazy mapping only when their bv_blt() call is invoked.

This allows the client to avoid the overhead of mapping and unmapping the buffers on each bv_blt() call.  It also avoids the associated mapping and unmapping overhead if a given implementation is never used.

As mentioned above, the CPU implementations have (almost) no overhead associated with mapping and unmapping, so they are a good choice to use for the call to bv_map().
c. If the client wants direct control over the mapping and unmapping overhead, it can call the bv_map() function of each implementation, as indicated above.  Each implementation will perform the mapping at that time, so that the overhead will not appear on subsequent calls to bv_blt().

4.  Next the client must create bvsurfgeom objects for each way in which a buffer will be accessed.  Often, there is only one way in which a buffer is accessed, so there will be the same number of buffers, bvbuffdesc, and bvsurfgeom objects.  If that's the case, it may be convenient for the client to combine them into a parent structure.  It may even be possible to share a single bvsurfgeom structure among buffers.  At other times, it will be necessary to treat a buffer in different ways for different BLTs.  Having these two structures separated allows all of these combinations.

struct bvsurfgeom geom =
  {sizeof(struct bvsurfgeom), 0};

geom.format = OCDFMT_RGB24;
geom.width = width;
geom.height = height;
geom.virtstride = stride;

 or 

struct bvsurfgeom geom;
memset(&geom, 0, sizeof(geom));
geom.structsize = sizeof(geom);
geom.format = OCDFMT_RGB24;
geom.width = width;
geom.height = height;
geom.virtstride = stride;

Note that the client must ensure that any additional members in bvsurfgeom are initialized to 0 for future compatibility.

5.  Now the client is ready to fill in a bvbltparams structure to specify the type of BLT requested.  Here is an example of a simple copy from the lower right corner of a surface to the upper left:

struct bvbltparams bltparams = {sizeof(struct bvbltparams), 0};

bltparams.flags = BVFLAG_ROP;
bltparams.op.rop = 0xCCCC; /* SRCCOPY */
bltparams.dstdesc = &buff;
bltparams.dstgeom = &geom;
bltparams.dstrect.left = 0;
bltparams.dstrect.top = 0;
bltparams.dstrect.width = width / 2;
bltparams.dstrect.height = height / 2;
bltparams.src1.desc = &buff;
bltparams.src1geom = &geom;
bltparams.src1rect.left = width / 2;
bltparams.src1rect.top = height / 2;
bltparams.src1rect.width = width / 2;
bltparams.src1rect.height = height / 2;

6.  And next the client can trigger the BLT by calling bv_blt():

bv_blt(&bltparams);

If the implementation cannot complete the requested BLT, bv_blt() returns a bverror indicating the issue.

7.  Finally, the client should clean up:

bv_unmap(&buff);


Kernel Mode Interface

The kernel mode interface differs only slightly from the user mode interface.  Currently there are two differences in the general kernel interface, and one in the Linux/Android interface:

bvbuffdesc.auxtype/auxptr

bvbuffdesc.auxtype is an enum, indicating the type of the bvbuffdesc.auxptr.  The enumeration values and the associated types are:

bvbuffdesc.auxtype: BVAT_PHYSDESC
  bvbuffdesc.auxptr type: bvphysdesc
  Notes: Used to specify the physical pages of a physically discontiguous buffer constructed using a single page size.  This may be used with physically contiguous buffers as well, but BVAT_PHYSADDR is preferred.

bvbuffdesc.auxtype: BVAT_PHYSADDR
  bvbuffdesc.auxptr type: physical address
  Notes: Used to specify the starting physical address of a physically contiguous buffer.

The methods of describing the buffer using physical addresses are not exposed in user mode for security reasons.


bvphysdesc

struct bvphysdesc {
        unsigned int structsize;
        unsigned long pagesize;
        unsigned long *pagearray;
        unsigned int pagecount;
        unsigned long pageoffset;
};

bvphysdesc.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bvphysdesc.pagesize

unsigned long pagesize;

This member indicates the size of the physical pages containing the buffer.  BVAT_PHYSDESC/bvphysdesc does not support buffers which reside in pages that are not all the same size.  bvphysdesc.pagesize is used to indicate the length of the pages in the bvphysdesc.pagearray as well as the expected alignment of those pages.  If this value is 0, the default page size of the system is assumed.

NOTE:  When used with physically contiguous buffers, this member should be set to the length of the buffer, which is the same as the value in bvbuffdesc.length.

bvphysdesc.pagearray

unsigned long *pagearray;

This member is an array of unsigned longs holding the physical addresses of the pages holding the buffer.  The array contains pagecount entries.  The specific format of the physical addresses is O/S dependent.  However, BVAT_PHYSDESC/bvphysdesc only supports 32-bit physical addresses.

Addresses in this array must be aligned on bvphysdesc.pagesize boundaries.  Use the bvphysdesc.pageoffset member to indicate the offset from the start of the first page to the beginning of the buffer.

NOTE:  When used with physically contiguous buffers, the first (only) address in this array should be aligned on the system default page boundary, and the bvphysdesc.pageoffset member should be used to indicate the offset from that address to the beginning of the buffer.

bvphysdesc.pagecount

unsigned int pagecount;

This member indicates the number of pages in the array pointed to by bvphysdesc.pagearray.

NOTE:  When used with physically contiguous buffers, this member should be set to 1.

bvphysdesc.pageoffset

unsigned long pageoffset;

This member indicates the number of bytes from the start of the first page (*pagearray) to the start of the buffer.  The value must be less than bvphysdesc.pagesize.

Implementations Only

Implementations should not ignore this member.
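
The constraints on bvphysdesc described above (page addresses aligned on pagesize boundaries, pageoffset less than pagesize) can be checked before a descriptor is handed to an implementation.  The following is a minimal sketch covering only the paged (physically discontiguous) case with an explicit, non-zero page size.  physdesc_is_valid() and physdesc_pages_needed() are hypothetical helpers, not part of BLTsville, and the structure is repeated here only so that the sketch compiles on its own.

```c
#include <stddef.h>

/* Reproduced from the definition above so that the sketch is self-contained. */
struct bvphysdesc {
        unsigned int structsize;
        unsigned long pagesize;
        unsigned long *pagearray;
        unsigned int pagecount;
        unsigned long pageoffset;
};

/*
 * Hypothetical helper: returns 1 if desc satisfies the constraints for a
 * paged buffer, 0 otherwise.  A pagesize of 0 (system default page size)
 * and the special rules for physically contiguous buffers are NOT handled
 * by this sketch.
 */
int physdesc_is_valid(const struct bvphysdesc *desc)
{
        unsigned int i;

        if (desc == NULL || desc->pagesize == 0)
                return 0;
        if (desc->pagearray == NULL || desc->pagecount == 0)
                return 0;
        /* the buffer must start inside the first page */
        if (desc->pageoffset >= desc->pagesize)
                return 0;
        /* every page address must be aligned on a pagesize boundary */
        for (i = 0; i < desc->pagecount; i++) {
                if (desc->pagearray[i] % desc->pagesize != 0)
                        return 0;
        }
        return 1;
}

/*
 * Hypothetical helper: number of pages needed to hold `length` bytes that
 * begin `pageoffset` bytes into the first page.  pagesize must be non-zero.
 */
unsigned long physdesc_pages_needed(unsigned long pageoffset,
                                    unsigned long length,
                                    unsigned long pagesize)
{
        return (pageoffset + length + pagesize - 1) / pagesize;
}
```

For example, a 4096-byte buffer that starts 100 bytes into a 4096-byte page spills into a second page, so physdesc_pages_needed(100, 4096, 4096) is 2.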


bventry

Kernel mode entry cannot be the same as user mode entry.  The specific method of accessing the kernel interface is O/S specific.  However, the following interface is currently defined for the specified O/Ss:

Linux/Android

bventry

This structure is used to obtain the pointers to the implementation's BLTsville calls.  The client can call the default bv2d_entry() function to obtain the pointers to the implementation chosen by the system integrators, or it can call a specific function to get the pointers for a specific implementation (e.g. gcbv_entry()).

struct bventry {
        unsigned int structsize;
        BVFN_MAP bv_map;
        BVFN_UNMAP bv_unmap;
        BVFN_BLT bv_blt;
        BVFN_CACHE bv_cache;
};

bventry.structsize

unsigned int structsize;

This member is used for compatibility between BLTsville versions.  (See bvbltparams.structsize for an explanation.)

bventry.bv_map/bv_unmap/bv_blt/bv_cache

BVFN_MAP bv_map;
BVFN_UNMAP bv_unmap;
BVFN_BLT bv_blt;
BVFN_CACHE bv_cache;

These members hold pointers to the functions for the specific implementation queried with a call to *_entry().

NOTE:  bv_cache() is optional, so this pointer may be set to 0.




Linux/Android Deviation

Although the linked list used in the bvbuffmap structure is not complicated, there may be a requirement to use the standard Linux/Android kernel linked list in that environment.  To facilitate this, the bvbuffmap.map entry is replaced by the following entry for Linux/Android kernel mode only:

bvbuffmap.node

struct list_head node;

This member is used to reference the containing linked list for the bvbuffmap structures associated with the buffer.