DRI3000 — Even Better Direct Rendering

This all started with the presentation that Eric Anholt and I did at the 2012 X developers conference, and subsequently wrote about in my DRI-Next posting. That discussion sketched out the goals of changing the existing DRI2-based direct rendering infrastructure.

Last month, I gave a more detailed presentation at Linux.conf.au 2013 (the best free software conference in the world). That presentation was recorded, so you can watch it online. Or, you can read Nathan Willis' summary at lwn.net. That presentation contained a lot more details about the specific techniques that will be used to implement the new system, in particular it included some initial indications of what kind of performance benefits the overall system might be able to produce.

I sat down today and wrote down an initial protocol definition for two new extensions (because two extensions are always better than one). Together, these are designed to provide complete support for direct rendering APIs like OpenGL and offer a better alternative to DRI2.

The DRI3 extension

Dave Airlie and Eric Anholt refused to let me call either actual extension DRI3000, so the new direct rendering extension is called DRI3. It uses POSIX file descriptor passing to share kernel objects between the X server and the application. DRI3 is a very small extension in three requests:

  1. Open. Returns a file descriptor for a direct rendering device along with the name of the driver for a particular API (OpenGL, Video, etc).

  2. PixmapFromBuffer. Takes a kernel buffer object (Linux uses DMA-BUF) and creates a pixmap that references it. Any place a Pixmap can be used in the X protocol, you can now talk about a DMA-BUF object. This allows an application to do direct rendering, and then pass a reference to those results directly to the X server.

  3. BufferFromPixmap. This takes an existing pixmap and returns a file descriptor for the underlying kernel buffer object. This is needed for the GL Texture from Pixmap extension.

For OpenGL, the plan is to create all of the buffer objects on the client side, then pass the back buffer to the X server for display on the screen. By creating pixmaps, we avoid needing new object types in the X server and can use existing X apis that take pixmaps for these objects.

The Swap extension

Once you've got direct rendered content in a Pixmap, you'll want to display it on the screen. You could simply use CopyArea from the pixmap to a window, but that isn't synchronzied to the vertical retrace signal. And, the semantics of the CopyArea operation precludes us from swapping the underlying buffers around, making it more expensive than strictly necessary.

The Swap extension fills those needs. Because the DRI3 extension provides an X pixmap reference to the direct rendered content, the Swap extension doesn't need any new object types for its operation. Instead, it talks strictly about core X objects, using X pixmaps as the source of the new data and X drawables as the destination.

The core of the Swap extension is one request — SwapRegion. This request moves pixels from a pixmap to a drawable. It uses an X fixes Region object to specify the area of the destination being painted, and an offset within the source pixmap to align the two areas.

A bunch of data are included in the reply from the SwapRegion request. First, you get a 64-bit sequence number identifying the swap itself. Then, you get a suggested geometry for the next source pixmap. Using the suggested geometry may result in performance improvements from the techniques described in the LCA talk above.

The last bit of data included in the SwapRegion reply is a list of pixmaps which were used as source operands to earlier SwapRegion requests to the same drawable. Each pixmap is listed along with the 64-bit sequence number associated with an earlier SwapRegion operation which resulted in the contents which the pixmap now contains. Ok, so that sounds really confusing. Some examples are probably necessary.

  • If the SwapRegion operation was implemented by copying data out of the source pixmap into the destination drawable, then the idle swap count will be equal to the swap count from this SwapRegion operation.

  • If the SwapRegion operation was implemented by swapping the destination contents with the source contents, then the idle swap count will be equal to the previous swap count on the destination drawable.

I'm hoping you'll be able to tell that in both cases, the idle swap count tries to name the swap sequence at which time the destination drawable contained the contents currently in the pixmap.

Note that even if the SwapRegion is implemented as a Copy operation, the provided source pixmap may not be included in the idle list as the copy may be delayed to meet the synchronization requirements specfied by the client.

Finally, if you want to throttle rendering based upon when frames appear on the screen, Swap offers an event that can be delivered to the drawable after the operation actually takes place.

Because the Swap extension needs to supply all of the OpenGL SwapBuffers semantics (including a multiplicity of OpenGL extensions related to that), I've stolen a handful of DRI2 requests to provide the necessary bits for that:

  1. SwapGetMSC
  2. SwapWaitMSC
  3. SwapWaitSBC

These work just like the DRI2 requests of the same names.

Current State of the Extensions

Both of these extensions have an initial protocol specification written down and stored in git:

  1. DRI3 protocol

  2. Swap protocol