DRM leasing part three (Vulkan)

With the kernel APIs off for review, and the X RandR bits looking like they're in reasonable shape, I finally found some time to sit down and figure out how I wanted to integrate this into Vulkan.

Avoiding two DRM file descriptors

Given that a DRM lease is represented by a DRM master file descriptor, we want to use that for all of the operations in the driver, including rendering and mode setting. Using the vulkan driver render node and the lease master node together would require passing buffer objects between the kernel contexts using even more file descriptors.

The Mesa Vulkan drivers open the device nodes while enumerating devices, not when they are created. This seems a bit early to me, but it makes sure that the devices being enumerated are actually available for use, and not just present in the system. To replace the render node fd with the lease master fd means hooking into the system early enough that the enumeration code can see the lease fd. And that means creating an instance extension as the instance gets created before devices are enumerated.

The VK_KEITHP_kms_display instance extension

This simple instance extension provides the necessary hooks to get the lease information from the application down into the driver before the DRM node is opened. In the first implementation, I added a function that could be called before the devices were enumerated to save the information in the Vulkan loader. That worked, but required quite a bit of study of the Vulkan loader and its XML description of the full Vulkan API.

Mark Young suggested that a simpler plan would be to chain the information into the VkInstanceCreateInfo pNext field; with no new APIs added to Vulkan, there shouldn't be any need to change the Vulkan loader -- the device driver would advertise the new instance extension and the application could find it.

That would have worked great, except the Vulkan loader 'helpfully' elides all instance extensions it doesn't know about before returning the list to the application. I'd say this was a bug and should be fixed, but for now, I've gone ahead and added the few necessary definitions to the loader to make it work.

In the application, it's a simple matter of searching for this extension, constructing the VkKmsDisplayInfoKEITHP structure, chaining that into the VkInstanceCreateInfo pNext list and passing that in to the vkCreateInstance call.

typedef struct VkKmsDisplayInfoKEITHP {
    VkStructureType         sType;  /* VK_STRUCTURE_TYPE_KMS_DISPLAY_INFO_KEITHP */
    const void*             pNext;
    int                     fd;
    uint32_t                crtc_id;
    uint32_t                *connector_ids;
    int                     connector_count;
    drmModeModeInfoPtr      mode;
} VkKmsDisplayInfoKEITHP;

As you can see, this includes the master file descriptor along with all of the information necessary to set the desired video mode using the specified resources.

The driver just walks the pNext list from the VkInstanceCreateInfo structure looking for any provided VkKmsDisplayInfoKEITHP structure and pulls the data out.

To avoid questions about file descriptor lifetimes, the driver dup's the provided fd. The application is expected to close their copy at a suitable time.

The VK_KHR_display extension

Vulkan already has an API for directly accessing the raw device, including code for exposing video modes and everything. As tempting as it may be to just go do something simpler, there's a lot to be said for using existing APIs.

This extension doesn't provide any direct APIs for acquiring display resources, relying on the VK_EXT_acquire_xlib_display extension for that part. And that takes a VkPhysicalDisplay parameter, which is only available after the device is opened, which is why I created the VK_KEITHP_kms_display extension instead of just using the VK_EXT_acquire_xlib_display extension -- we can't increase the capabilities of the render node opened by the driver, and we don't want to keep two file descriptors around.

With the information provided by the VK_KEITHP_kms_display extension, we can implement all of the VK_KHR_display extension APIs, including enumerating planes and modes and creating the necessary display surface. Of course, there's only one plane and one mode, so some of the implementation is pretty simplistic.

The big piece of work was to create the swap chain structure and associated frame buffers.

A working example

I've taken the 'cube' example from the Vulkan loader and hacked it up to use XCB to construct a DRM lease, the VK_KEITHP_kms_display extension to pass that lease into the Vulkan driver. The existing support for the VK_KHR_display extension "just worked", which was pretty satisfying.

It's a bit of a mess

I'm not satisfied with the mesa code at this point; there's a bunch of code in the radeon driver which should be in the vulkan WSI bits, and the vulkan WSI bits should probably not have the KMS interfaces wired in. I'll ask around and see what other Mesa developers think I should do before restructuring it further; I'll probably have to rewrite it all at least one more time before it's ready to upstream.

Seeing the code

I'll be cleaning the code up a bit before sending it out for review, but it's already visible in my own repositories:

Posted Tue May 30 01:41:22 2017 Tags: tags/fdo

DRM leasing part deux (kernel side)

I've stabilized the kernel lease implementation so that leases now work reliably, and don't require a reboot before running the demo app a second time. Here's whats changed:

  1. Reference counting is hard. I'm creating a new drm_master, which means creating a new file. There were a bunch of reference counting errors in my first pass; I've cleaned those up while also reducing the changes needed to the rest of the DRM code.

  2. Added a 'mask_lease' value to the leases -- this controls whether resources in the lease are hidden from the lessor, allowing the lessor to continue to do operations on the leased resources if it likes.

  3. Hacked the mutex locking assertions to not crash in normal circumstances. I'm now doing:

    BUG_ON(__mutex_owner(&master->dev->mode_config.idr_mutex) != current);

    to make sure the mutex is held by the current thread, instead of just making sure some thread holds the mutex. I have this memory of a better way to do this, but now I can't dig it up. Suggestions welcome, of course.

I'm reasonably pleased with the current state of the code, although I want to squash the patches together so that only the final state of the design is represented, rather than the series of hacks present now.

Comments on #dri-devel

I spent a bit of time on the #dri-devel IRC channel answering questions about the DRM-lease design and the changes above reflect that.

One concern was about mode setting from two masters at the same time. Mode setting depends on a number of shared 'hidden' resources; things like memory fifos and the like. If either lessee or lessor wants to change the displayed modes, they may fail due to conflicts over these resources and not be able to recover easily.

A related concern was that the current TEST/render/commit mechanism used by compositors may no longer be reliable as another master could change hidden resource usage between the TEST and commit operations. Daniel Vetter suggested allowing the lessor to 'exclude' lessee mode sets during this operation.

One solution would be to have the lessor set the mode on the leased resources before ceding control of the objects, and once set, the lessee shouldn't perform additional mode setting operations. This would limit the flexibility in the lessee quite a bit as it wouldn't be able to pop up overlay planes or change resolution.


Let's review the questions from my last post, DRM-lease:

  • What should happen when a Lessor is closed? Should all access to controlled resources be revoked from all descendant Lessees?

    Answer: The lessor is referenced by the lessee and isn't destroyed until it has been closed and all lessees are destroyed.

  • How about when a Lessee is closed? Should the Lessor be notified in some way?

    Answer: I think so? Need to figure out a mechanism here.

  • CRTCs and Encoders have properties. Should these properties be automatically included in the lease?

    Answer -- no, userspace is responsible for constructing the entire lease.

Remaining Kernel Work

The code is running, and appears stable. However, it's not quite done yet. Here's a list of remaining items that I know about:

  1. Changing leases should update sub-leases. When you reduce the resources in one lease, the kernel should walk any sub-leases and clear out resources which the lessor no longer has access to.

  2. Sending events when leases are created/destroyed. When a lease is created, if the mask_lease value is set, then the lessor should get regular events describing the effective change. Similarly, both lessor and lessee should get events when a lease is changed.

  3. Refactoring the patch series to squash intermediate versions of the new code.

Remaining Other Work

Outside of the kernel, I'll be adding X support for this operation. Here's my current thinking:

  • Extend RandR to add a lease-creation request that returns a file descriptor. This will take a mode and the server will set that mode before returning.

  • Provide some EDID-based matching for HMD displays in the X server. The goal is to 'hide' these from the regular desktop so that the HMD screen doesn't flicker with desktop content before the lease is created. I think what I want is to let an X client provide some matching criteria and for it to get an event when a output is connected with matching EDID information.

With that, I think this phase of the project will be wrapped up and it will be time to move on to actually hooking up a real HMD and hacking the VR code to use this new stuff.

Seeing the code

For those interested in seeing the state of the code so far, there's kernel, drm and kmscube repositories here:

Posted Fri Mar 31 17:25:22 2017 Tags: tags/fdo

DRM display resource leasing (kernel side)

So, you've got a fine head-mounted display and want to explore the delights of virtual reality. Right now, on Linux, that means getting the window system to cooperate because the window system is the DRM master and holds sole access to all display resources. So, you plug in your device, play with RandR to get it displaying bits from the window system and then carefully configure your VR application to use the whole monitor area and hope that the desktop will actually grant you the boon of page flipping so that you will get reasonable performance and maybe not even experience tearing. Results so far have been mixed, and depend on a lot of pieces working in ways that aren't exactly how they were designed to work.

We could just hack up the window system(s) and try to let applications reserve the HMD monitors and somehow removing them from the normal display area so that other applications don't randomly pop up in the middle of the screen. That would probably work, and would take advantage of much of the existing window system infrastructure for setting video modes and performing page flips. However, we've got a pretty spiffy standard API in the kernel for both of those, and getting the window system entirely out of the way seems like something worth trying.

I spent a few hours in Hobart chatting with Dave Airlie during LCA and discussed how this might actually work.


  1. Use KMS interfaces directly from the VR application to drive presentation to the HMD.

  2. Make sure the window system clients never see the HMD as a connected monitor.

  3. Maybe let logind (or other service) manage the KMS resources and hand them out to the window system and VR applications.


  1. Don't make KMS resources appear and disappear. It turns out applications get confused when the set of available CRTCs, connectors and encoders changes at runtime.

An Outline for Multiple DRM masters

By the end of our meeting in Hobart, Dave had sketched out a fairly simple set of ideas with me. We'd add support in the kernel to create additional DRM masters. Then, we'd make it possible to 'hide' enough state about the various DRM resources so that each DRM master would automagically use disjoint subsets of resources. In particular, we would.

  1. Pretend that connectors were always disconnected

  2. Mask off crtc and encoder bits so that some of them just didn't seem very useful.

  3. Block access to resources controlled by other DRM masters, just in case someone tried to do the wrong thing.

Refinement with Eric over Swedish Pancakes

A couple of weeks ago, Eric Anholt and I had breakfast at the original pancake house and chatted a bit about this stuff. He suggested that the right interface for controlling these new DRM masters was through the existing DRM master interface, and that we could add new ioctls that the current DRM master could invoke to create and manage them.

Leasing as a Model

I spent some time just thinking about how this might work and came up with a pretty simple metaphor for these new DRM masters. The original DRM master on each VT "owns" the output resources and has final say over their use. However, a DRM master can create another DRM master and "lease" resources it has control over to the new DRM master. Once leased, resources cannot be controlled by the owner unless the owner cancels the lease, or the new DRM master is closed. Here's some terminology:

DRM Master
Any DRM file which can perform mode setting.
The original DRM Master, created by opening /dev/dri/card*
A DRM master which has leased out resources to one or more other DRM masters.
A DRM master which controls resources leased from another DRM master. Each Lessee leases resources from a single Lessor.
Lessee ID
An integer which uniquely identifies a lessee within the tree of DRM masters descending from a single Owner.
The contract between the Lessor and Lessee which identifies which resources which may be controlled by the Lessee. All of the resources must be owned by or leased to the Lessor.

With Eric's input, the interface to create a lease was pretty simple to write down:

int drmModeCreateLease(int fd,
               const uint32_t *objects,
               int num_objects,
               int flags,
               uint32_t *lessee_id);

Given an FD to a DRM master, and a list of objects to lease, a new DRM master FD is returned that holds a lease to those objects. 'flags' can be any combination of O_CLOEXEC and O_NONBLOCK for the newly minted file descriptor.

Of course, the owner might want to take some resources back, or even grant new resources to the lessee. So, I added an interface that rewrites the terms of the lease with a new set of objects:

int drmModeChangeLease(int fd,
               uint32_t lessee_id,
               const uint32_t *objects,
               int num_objects);

Note that nothing here makes any promises about the state of the objects across changes in the lease status; the lessor and lessee are expected to perform whatever modesetting is required for the objects to be useful to them.

Window System Integration

There are two ways to integrate DRM leases into the window system environment:

  1. Have logind "lease" most resources to the window system. When a HMD is connected, it would lease out suitable resources to the VR environment.

  2. Have the window system "own" all of the resources and then add window system interfaces to create new DRM masters leased from its DRM master.

I'll probably go ahead and do 2. in X and see what that looks like.

One trick with any of this will be to hide HMDs from any RandR clients listening in on the window system. You probably don't want the window system to tell the desktop that a new monitor has been connected, have it start reconfiguring things, and then have your VR application create a new DRM master, making the HMD appear to have disconnected to the window system and have that go reconfigure things all over again.

I'm not sure how this might work, but perhaps having the VR application register something like a passive grab on hot plug events might make sense? Essentially, you want it to hear about monitor connect events, go look to see if the new monitor is one it wants, and if not, release that to other X clients for their use. This can be done in stages, with the ability to create a new DRM master over X done first, and then cleaning up the hotplug stuff later on.

Current Status

I hacked up the kernel to support the drmModeCreateLease API, and then hacked up kmscube to run two threads with different sets of KMS resources. That ran for nearly a minute before crashing and requiring a reboot. I think there may be some locking issues with page flips from two threads to the same device.

I think I also made the wrong decision about how to handle lessors closing down. I tried to let the lessors get deleted and then 'orphan' the lessees. I've rewritten that so that lessees hold a reference on their lessor, keeping the lessor in place until the lessee shuts down. I've also written the kernel parts of the drmModeChangeLease support.


  • What should happen when a Lessor is closed? Should all access to controlled resources be revoked from all descendant Lessees?

    Proposed answer -- lessees hold a reference to their lessor so that the entire tree remains in place. A Lessor can clean up before exiting by revoking lessee access if it chooses.

  • How about when a Lessee is closed? Should the Lessor be notified in some way?

  • CRTCs and Encoders have properties. Should these properties be automatically included in the lease?

    Proposed answer -- no, userspace is responsible for constructing the entire lease.

Posted Tue Mar 28 15:22:22 2017 Tags: tags/fdo

Consulting for Valve in my spare time

Valve Software has asked me to help work on a couple of Linux graphics issues, so I'll be doing a bit of consulting for them in my spare time. It should be an interesting diversion from my day job working for Hewlett Packard Enterprise on Memory Driven Computing and other fun things.

First thing on my plate is helping support head-mounted displays better by getting the window system out of the way. I spent some time talking with Dave Airlie and Eric Anholt about how this might work and have started on the kernel side of that. A brief synopsis is that we'll split off some of the output resources from the window system and hand them to the HMD compositor to perform mode setting and page flips.

After that, I'll be working out how to improve frame timing reporting back to games from a composited desktop under X. Right now, a game running on X with a compositing manager can't tell when each frame was shown, nor accurately predict when a new frame will be shown. This makes smooth animation rather difficult.

Posted Tue Mar 14 12:10:17 2017 Tags: tags/fdo

Wrapping libudev using LD_PRELOAD

Peter Hutterer and I were chasing down an X server bug which was exposed when running the libinput test suite against the X server with a separate thread for input. This was crashing deep inside libudev, which led us to suspect that libudev was getting run from multiple threads at the same time.

I figured I'd be able to tell by wrapping all of the libudev calls from the server and checking to make sure we weren't ever calling it from both threads at the same time. My first attempt was a simple set of cpp macros, but that failed when I discovered that libwacom was calling libgudev, which was calling libudev.

Instead of recompiling the world with my magic macros, I created a new library which exposes all of the (public) symbols in libudev. Each of these functions does a bit of checking and then simply calls down to the 'real' function.

Finding the real symbols

Here's the snippet which finds the real symbols:

static void *udev_symbol(const char *symbol)
    static void *libudev;
    static pthread_mutex_t  find_lock = PTHREAD_MUTEX_INITIALIZER;

    void *sym;
    if (!libudev) {
        libudev = dlopen("libudev.so.1.6.4", RTLD_LOCAL | RTLD_NOW);
    sym = dlsym(libudev, symbol);
    return sym;

Yeah, the libudev version is hard-coded into the source; I didn't want to accidentally load the wrong one. This could probably be improved...

Checking for re-entrancy

As mentioned above, we suspected that the bug was caused when libudev got called from two threads at the same time. So, our checks are pretty simple; we just count the number of calls into any udev function (to handle udev calling itself). If there are other calls in process, we make sure the thread ID for those is the same as the current thread.

static void udev_enter(const char *func) {
    assert (udev_running == 0 || udev_thread == pthread_self());
    udev_thread = pthread_self();
    udev_func[udev_running] = func;

static void udev_exit(void) {
    if (udev_running == 0)
    udev_thread = 0;
    udev_func[udev_running] = 0;

Wrapping functions

Now, the ugly part -- libudev exposes 93 different functions, with a wide variety of parameters and return types. I constructed a hacky macro, calls for which could be constructed pretty easily from the prototypes found in libudev.h, and which would construct our stub function:

#define make_func(type, name, formals, actuals)         \
    type name formals {                     \
    type ret;                       \
    static void *f;                     \
    if (!f)                         \
        f = udev_symbol(__func__);              \
    udev_enter(__func__);                   \
    ret = ((typeof (&name)) f) actuals;         \
    udev_exit();                        \
    return ret;                     \

There are 93 invocations of this macro (or a variant for void functions) which look much like:

make_func(struct udev *,
      (struct udev *udev),

Using udevwrap

To use udevwrap, simply stick the filename of the .so in LD_PRELOAD and run your program normally:

# LD_PRELOAD=/usr/local/lib/libudevwrap.so Xorg 

Source code

I stuck udevwrap in my git repository:


You can clone it using

$ git git://keithp.com/git/udevwrap
Posted Mon Aug 15 23:32:30 2016 Tags: tags/fdo

X.org Election Time — Vote Now

It's more important than usual to actually get your vote in — we're asking the membership to vote on changes the the X.org bylaws that are necessary for X.org to become a SPI affiliate project, instead of continuing on as a separate organization. While I'm in favor of this transition as I think it will provide much needed legal and financial help, the real reason we need everyone to vote is that we need ⅔ of the membership to cast ballots for the vote to be valid. Last time, we didn't reach that value, so even though we had a majority voting in favor of the change, it didn't take effect. If you aren't in favor of this change, I'd still encourage you to vote as I'd like to get a valid result, no matter the outcome.

Of course, we're also electing four members to the board. I'm happy to note that there are five candidates running for the four available seats, which shows that there are enough people willing to help serve the X.org community in this fashion.

Posted Tue Apr 12 23:51:37 2016 Tags: tags/fdo

Multi-Stream Transport 4k Monitors and X

I'm sure you've seen a 4k monitor on a friends desk running Mac OS X or Windows and are all ready to go get one so that you can use it under Linux.

Once you've managed to acquire one, I'm afraid you'll discover that when you plug it in, you're limited to 30Hz refresh rates at the full size, unless you're running a kernel that is version 3.17 or later. And then...

Good Grief! What Is My Computer Doing!

Ok, so now you're running version 3.17 and when X starts up, it's like you're using a gigantic version of Google Cardboard. Two copies of a very tall, but very narrow screen greets you.

Welcome to MST island.

In order to drive these giant new panels at full speed, there isn't enough bandwidth in the display hardware to individually paint each pixel once during each frame. So, like all good hardware engineers, they invented a clever hack.

This clever hack paints the screen in parallel. I'm assuming that they've got two bits of display hardware, each one hooked up to half of the monitor. Now, each paints only half of the pixels, avoiding costly redesign of expensive silicon, at least that's my surmise.

In the olden days, if you did this, you'd end up running two monitor cables to your computer, and potentially even having two video cards. Today, thanks to the magic of Display Port Multi-Stream Transport, we don't need all of that; instead, MST allows us to pack multiple cables-worth of data into a single cable.

I doubt the inventors of MST intended it to be used to split a single LCD panel into multiple "monitors", but hardware engineers are clever folk and are more than capable of abusing standards like this when it serves to save a buck.

Turning Two Back Into One

We've got lots of APIs that expose monitor information in the system, and across which we might be able to wave our magic abstraction wand to fix this:

  1. The KMS API. This is the kernel interface which is used by all graphics stuff, including user-space applications and the frame buffer console. Solve the problem here and it works everywhere automatically.

  2. The libdrm API. This is just the KMS ioctls wrapped in a simple C library. Fixing things here wouldn't make fbcons work, but would at least get all of the window systems working.

  3. Every 2D X driver. (Yeah, we're trying to replace all of these with the one true X driver). Fixing the problem here would mean that all X desktops would work. However, that's a lot of code to hack, so we'll skip this.

  4. The X server RandR code. More plausible than fixing every driver, this also makes X desktops work.

  5. The RandR library. If not in the X server itself, how about over in user space in the RandR protocol library? Well, the problem here is that we've now got two of them (Xlib and xcb), and the xcb one is auto-generated from the protocol descriptions. Not plausible.

  6. The Xinerama code in the X server. Xinerama is how we did multi-monitor stuff before RandR existed. These days, RandR provides Xinerama emulation, but we've been telling people to switch to RandR directly.

  7. Some new API. Awesome. Ok, so if we haven't fixed this in any existing API we control (kernel/libdrm/X.org), then we effectively dump the problem into the laps of the desktop and application developers. Given how long it's taken them to adopt current RandR stuff, providing yet another complication in their lives won't make them very happy.

All Our APIs Suck

Dave Airlie merged MST support into the kernel for version 3.17 in the simplest possible fashion -- pushing the problem out to user space. I was initially vaguely tempted to go poke at it and try to fix things there, but he eventually convinced me that it just wasn't feasible.

It turns out that all of our fancy new modesetting APIs describe the hardware in more detail than any application actually cares about. In particular, we expose a huge array of hardware objects:

  • Subconnectors
  • Connectors
  • Outputs
  • Video modes
  • Crtcs
  • Encoders

Each of these objects exposes intimate details about the underlying hardware -- which of them can work together, and which cannot; what kinds of limits are there on data rates and formats; and pixel-level timing details about blanking periods and refresh rates.

To make things work, some piece of code needs to actually hook things up, and explain to the user why the configuration they want just isn't possible.

The sticking point we reached was that when an MST monitor gets plugged in, it needs two CRTCs to drive it. If one of those is already in use by some other output, there's just no way you can steal it for MST mode.

Another problem -- we expose EDID data and actual video mode timings. Our MST monitor has two EDID blocks, one for each half. They happen to describe how they're related, and how you should configure them, but if we want to hide that from the application, we'll have to pull those EDID blocks apart and construct a new one. The same goes for video modes; we'll have to construct ones for MST mode.

Every single one of our APIs exposes enough of this information to be dangerous.

Every one, except Xinerama. All it talks about is a list of rectangles, each of which represents a logical view into the desktop. Did I mention we've been encouraging people to stop using this? And that some of them listened to us? Foolishly?

Dave's Tiling Property

Dave hacked up the X server to parse the EDID strings and communicate the layout information to clients through an output property. Then he hacked up the gnome code to parse that property and build a RandR configuration that would work.

Then, he changed to RandR Xinerama code to also parse the TILE properties and to fix up the data seen by application from that.

This works well enough to get a desktop running correctly, assuming that desktop uses Xinerama to fetch this data. Alas, gtk has been "fixed" to use RandR if you have RandR version 1.3 or later. No biscuit for us today.

Adding RandR Monitors

RandR doesn't have enough data types yet, so I decided that what we wanted to do was create another one; maybe that would solve this problem.

Ok, so what clients mostly want to know is which bits of the screen are going to be stuck together and should be treated as a single unit. With current RandR, that's some of the information included in a CRTC. You pull the pixel size out of the associated mode, physical size out of the associated outputs and the position from the CRTC itself.

Most of that information is available through Xinerama too; it's just missing physical sizes and any kind of labeling to help the user understand which monitor you're talking about.

The other problem with Xinerama is that it cannot be configured by clients; the existing RandR implementation constructs the Xinerama data directly from the RandR CRTC settings. Dave's Tiling property changes edit that data to reflect the union of associated monitors as a single Xinerama rectangle.

Allowing the Xinerama data to be configured by clients would fix our 4k MST monitor problem as well as solving the longstanding video wall, WiDi and VNC troubles. All of those want to create logical monitor areas within the screen under client control

What I've done is create a new RandR datatype, the "Monitor", which is a rectangular area of the screen which defines a rectangular region of the screen. Each monitor has the following data:

  • Name. This provides some way to identify the Monitor to the user. I'm using X atoms for this as it made a bunch of things easier.

  • Primary boolean. This indicates whether the monitor is to be considered the "primary" monitor, suitable for placing toolbars and menus.

  • Pixel geometry (x, y, width, height). These locate the region within the screen and define the pixel size.

  • Physical geometry (width-in-millimeters, height-in-millimeters). These let the user know how big the pixels will appear in this region.

  • List of outputs. (I think this is the clever bit)

There are three requests to define, delete and list monitors. And that's it.

Now, we want the list of monitors to completely describe the environment, and yet we don't want existing tools to break completely. So, we need some way to automatically construct monitors from the existing RandR state while still letting the user override portions of it as needed to explain virtual or tiled outputs.

So, what I did was to let the client specify a list of outputs for each monitor. All of the CRTCs which aren't associated with an output in any client-defined monitor are then added to the list of monitors reported back to clients. That means that clients need only define monitors for things they understand, and they can leave the other bits alone and the server will do something sensible.

The second tricky bit is that if you specify an empty rectangle at 0,0 for the pixel geometry, then the server will automatically compute the geometry using the list of outputs provided. That means that if any of those outputs get disabled or reconfigured, the Monitor associated with them will appear to change as well.

Current Status

Gtk+ has been switched to use RandR for RandR versions 1.3 or later. Locally, I hacked libXrandr to override the RandR version through an environment variable, set that to 1.2 and Gtk+ happily reverts back to Xinerama and things work fine. I suspect the plan here will be to have it use the new Monitors when present as those provide the same info that it was pulling out of RandR's CRTCs.

KDE appears to still use Xinerama data for this, so it "just works".

Where's the code

As usual, all of the code for this is in a collection of git repositories in my home directory on fd.o:

git://people.freedesktop.org/~keithp/randrproto master
git://people.freedesktop.org/~keithp/libXrandr master
git://people.freedesktop.org/~keithp/xrandr master
git://people.freedesktop.org/~keithp/xserver randr-monitors

RandR protocol changes

Here's the new sections added to randrproto.txt


1.5. Introduction to version 1.5 of the extension

Version 1.5 adds monitors

 • A 'Monitor' is a rectangular subset of the screen which represents
   a coherent collection of pixels presented to the user.

 • Each Monitor is be associated with a list of outputs (which may be

 • When clients define monitors, the associated outputs are removed from
   existing Monitors. If removing the output causes the list for that
   monitor to become empty, that monitor will be deleted.

 • For active CRTCs that have no output associated with any
   client-defined Monitor, one server-defined monitor will
   automatically be defined of the first Output associated with them.

 • When defining a monitor, setting the geometry to all zeros will
   cause that monitor to dynamically track the bounding box of the
   active outputs associated with them

This new object separates the physical configuration of the hardware
from the logical subsets  the screen that applications should
consider as single viewable areas.

1.5.1. Relationship between Monitors and Xinerama

Xinerama's information now comes from the Monitors instead of directly
from the CRTCs. The Monitor marked as Primary will be listed first.


5.6. Protocol Types added in version 1.5 of the extension

          primary: BOOL
          automatic: BOOL
          x: INT16
          y: INT16
          width: CARD16
          height: CARD16
          width-in-millimeters: CARD32
          height-in-millimeters: CARD32
          outputs: LISTofOUTPUT }


7.5. Extension Requests added in version 1.5 of the extension.

    window : WINDOW
    timestamp: TIMESTAMP
    monitors: LISTofMONITORINFO
    Errors: Window

    Returns the list of Monitors for the screen containing

    'timestamp' indicates the server time when the list of
    monitors last changed.

    window : WINDOW
    Errors: Window, Output, Atom, Value

    Create a new monitor. Any existing Monitor of the same name is deleted.

    'name' must be a valid atom or an Atom error results.

    'name' must not match the name of any Output on the screen, or
    a Value error results.

    If 'info.outputs' is non-empty, and if x, y, width, height are all
    zero, then the Monitor geometry will be dynamically defined to
    be the bounding box of the geometry of the active CRTCs
    associated with them.

    If 'name' matches an existing Monitor on the screen, the
    existing one will be deleted as if RRDeleteMonitor were called.

    For each output in 'info.outputs, each one is removed from all
    pre-existing Monitors. If removing the output causes the list of
    outputs for that Monitor to become empty, then that Monitor will
    be deleted as if RRDeleteMonitor were called.

    Only one monitor per screen may be primary. If 'info.primary'
    is true, then the primary value will be set to false on all
    other monitors on the screen.

    RRSetMonitor generates a ConfigureNotify event on the root
    window of the screen.

    window : WINDOW
    name: ATOM
    Errors: Window, Atom, Value

    Deletes the named Monitor.

    'name' must be a valid atom or an Atom error results.

    'name' must match the name of a Monitor on the screen, or a
    Value error results.

    RRDeleteMonitor generates a ConfigureNotify event on the root
    window of the screen.

Posted Wed Dec 17 01:36:43 2014 Tags: tags/fdo

Present and Compositors

The current Present extension is pretty unfriendly to compositing managers, causing an extra frame of latency between the applications operation and the scanout buffer. Here's how I'm fixing that.

An extra frame of lag

When an application uses PresentPixmap, that operation is generally delayed until the next vblank interval. When using X without composting, this ensures that the operation will get started in the vblank interval, and, if the rendering operation is quick enough, you'll get the frame presented without any tearing.

When using a compositing manager, the operation is still delayed until the vblank interval. That means that the CopyArea and subsequent Damage event generation don't occur until the display has already started the next frame. The compositing manager receives the damage event and constructs a new frame, but it also wants to avoid tearing, so that frame won't get displayed immediately, instead it'll get delayed until the next frame, introducing the lag.

Copy now, complete later

While away from the keyboard this morning, I had a sudden idea -- what if we performed the CopyArea and generated Damage right when the PresentPixmap request was executed but delayed the PresentComplete event until vblank happened.

With the contents updated and damage delivered, the compositing manager can immediately start constructing a new scene for the upcoming frame. When that is complete, it can also use PresentPixmap (either directly or through OpenGL) to queue the screen update.

If it's fast enough, that will all happen before vblank and the application contents will actually appear at the desired time.

Now, at the appointed vblank time, the PresentComplete event will get delivered to the client, telling it that the operation has finished and that its contents are now on the screen. If the compositing manager was quick, this event won't even be a lie.

We'll be lying less often

Right now, the CopyArea, Damage and PresentComplete operations all happen after the vblank has passed. As the compositing manager delays the screen update until the next vblank, then every single PresentComplete event will have the wrong UST/MSC values in it.

With the CopyArea happening immediately, we've a pretty good chance that the compositing manager will get the application contents up on the screen at the target time. When this happens, the PresentComplete event will have the correct values in it.

How can we do better?

The only way to do better is to have the PresentComplete event generated when the compositing manager displays the frame. I've talked about how that should work, but it's a bit twisty, and will require changes in the compositing manager to report the association between their PresentPixmap request and the applications' PresentPixmap requests.

Where's the code

I've got a set of three patches, two of which restructure the existing code without changing any behavior and a final patch which adds this improvement. Comments and review are encouraged, as always!

git://people.freedesktop.org/~keithp/xserver.git present-compositor
Posted Sat Dec 13 00:28:06 2014 Tags: tags/fdo

Glamor Cleanup

Before I start really digging in to reworking the Render support in Glamor, I wanted to take a stab at cleaning up some cruft which has accumulated in Glamor over the years. Here's what I've done so far.

Get rid of the Intel fallback paths

I think it's my fault, and I'm sorry.

The original Intel Glamor code has Glamor implement accelerated operations using GL, and when those fail, the Intel driver would fall back to its existing code, either UXA acceleration or software. Note that it wasn't Glamor doing these fallbacks, instead the Intel driver had a complete wrapper around every rendering API, calling special Glamor entry points which would return FALSE if GL couldn't accelerate the specified operation.

The thinking was that when GL couldn't do something, it would be far faster to take advantage of the existing UXA paths than to have Glamor fall back to pulling the bits out of GL, drawing to temporary images with software, and pushing the bits back to GL.

And, that may well be true, but what we've managed to prove is that there really aren't any interesting rendering paths which GL can't do directly. For core X, the only fallbacks we have today are for operations using a weird planemask, and some CopyPlane operations. For Render, essentially everything can be accelerated with the GPU.

At this point, the old Intel Glamor implementation is a lot of ugly code in Glamor without any use. I posted patches to the Intel driver several months ago which fix the Glamor bits there, but they haven't seen any review yet and so they haven't been merged, although I've been running them since 1.16 was released...

Getting rid of this support let me eliminate all of the _nf functions exported from Glamor, along with the GLAMOR_USE_SCREEN and GLAMOR_USE_PICTURE_SCREEN parameters, along with the GLAMOR_SEPARATE_TEXTURE pixmap type.

Force all pixmaps to have exact allocations

Glamor has a cache of recently used textures that it uses to avoid allocating and de-allocating GL textures rapidly. For pixmaps small enough to fit in a single texture, Glamor would use a cache texture that was larger than the pixmap.

I disabled this when I rewrote the Glamor rendering code for core X; that code used texture repeat modes for tiles and stipples; if the texture wasn't the same size as the pixmap, then texturing would fail.

On the Render side, Glamor would actually reallocate pixmaps used as repeating texture sources. I could have fixed up the core rendering code to use this, but I decided instead to just simplify things and eliminate the ability to use larger textures for pixmaps everywhere.

Remove redundant pixmap and screen private pointers

Every Glamor pixmap private structure had a pointer back to the pixmap it was allocated for, along with a pointer to the the Glamor screen private structure for the related screen. There's no particularly good reason for this, other than making it possible to pass just the Glamor pixmap private around a lot of places. So, I removed those pointers and fixed up the functions to take the necessary extra or replaced parameters.

Similarly, every Glamor fbo had a pointer back to the Glamor screen private too; I removed that and now pass the Glamor screen private parameter as needed.

Reducing pixmap private complexity

Glamor had three separate kinds of pixmap private structures, one for 'normal' pixmaps (those allocated by them selves in a single FBO), one for 'large' pixmaps, where the pixmap was tiled across many FBOs, and a third for 'atlas' pixmaps, which presumably would be a single FBO holding multiple pixmaps.

The 'atlas' form was never actually implemented, so it was pretty easy to get rid of that.

For large vs normal pixmaps, the solution was to move the extra data needed by large pixmaps into the same structure as that used by normal pixmaps and simply initialize those elements correctly in all cases. Now, most code can ignore the difference and simply walk the array of FBOs as necessary.

The other thing I did was to shrink the number of possible pixmap types from 8 down to three. Glamor now exposes just these possible pixmap types:

  • GLAMOR_MEMORY. This is a software-only pixmap, stored in regular memory and only drawn with software. This is used for 1bpp pixmaps, shared memory pixmaps and glyph pixmaps. Most of the time, these pixmaps won't even get a Glamor pixmap private structure allocated, but if you use one of these with the existing Render acceleration code, that will end up wanting a private pointer. I'm hoping to fix the code so we can just use a NULL private to indicate this kind of pixmap.

  • GLAMOR_TEXTURE. This is a full Glamor pixmap, capable of being used via either GL or software fallbacks.

  • GLAMOR_DRM_ONLY. This is a pixmap based on an FBO which was passed from the driver, and for which Glamor couldn't get the underlying DRM object. I think this is an error, but I don't quite understand what's going on here yet...

Future Work

  • Deal with X vs GL color formats
  • Finish my new CompositeGlyphs code
  • Create pure shader-based gradients
  • Rewrite Composite to use the GPU for more computation
  • Take another stab at doing GPU-accelerated trapezoids
Posted Thu Oct 30 00:51:39 2014 Tags: tags/fdo

Chromium (the browser) and DRI3

I got a note on IRC a week ago that Chromium was crashing with DRI3.

The Google team working on Chromium eventually sent me a link to the bug report. That's secret Google stuff, so you won't be able to follow the link, even though it's a bug in a free software application when running on free software drivers.

There's a bug report in the freedesktop bugzilla which looks the same to me.

In both cases, the recommended “fix” was to switch from DRI3 back to DRI2. That's not exactly a great plan, given that DRI3 offers better security between GPU-using applications, which seems like a pretty nice thing to have when you're running random GL applications from the web.

Chromium Sandboxing

I'm not entirely sure how it works, but Chromium creates a process separate from the main browser engine to talk to the GPU. That process has very limited access to the operating system via some fancy library adventures. Presumably, the hope is that security bugs in the GL driver would be harder to leverage into a remote system exploit.

Debugging in this environment is a bit tricky as you can't simply run chromium under gdb and expect to be able to set breakpoints in the GL driver. Instead, you have to run chromium with a magic flag which causes the GPU process to pause before loading the driver so you can connect to it with gdb and debug from there, along with a flag that lets you see crashes within the gpu process and the usual flag that causes chromium to ignore the GPU black list which seems to always include the Intel driver for one reason or another:

$ chromium --gpu-startup-dialog --disable-gpu-watchdog --ignore-gpu-blacklist

Once Chromium starts up, it will print out a message telling you to attach gdb to the GPU process and send that process a SIGUSR1 to continue it. Now you can happily debug and get a stack trace when the crash occurs.

Locating the Bug

The bug manifested with a segfault at the first access to a DRI3-allocated buffer within the application. We've seen this problem in the past; whenever buffer allocation fails for some reason, the driver ignores the problem and attempts to de-reference through the (NULL) buffer pointer, causing a segfault. In this case, Chromium called glClear, which tried (and failed) to allocate a back buffer causing the i965 driver to subsequently segfault.

We should probably go fix the i965 driver to not segfault when buffer allocation fails, but that wouldn't provide a lot of additional information. What I have done is add some error messages in the DRI3 buffer allocation path which at least tell you why the buffer allocation failed. That patch has been merged to Mesa master, and should also get merged to the Mesa stable branch for the next stable release.

Once I had added the error messages, it was pretty easy to see what happened:

$ chromium --ignore-gpu-blacklist
[10618:10643:0930/200525:ERROR:nss_util.cc(856)] After loading Root Certs, loaded==false: NSS error code: -8018
libGL: pci id for fd 12: 8086:0a16, driver i965
libGL: OpenDriver: trying /local-miki/src/mesa/mesa/lib/i965_dri.so
libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted.
libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted.
libGL error: DRI3 Fence object allocation failure Operation not permitted

The first two errors were just the sandbox preventing Mesa from using my GL configuration file. I'm not sure how that's a security problem, but it shouldn't harm the driver much.

The last error is where the problem lies. In Mesa, the DRI3 implementation uses a chunk of shared memory to hold a fence object that lets Mesa know when buffers are idle without using the X connection. That shared memory segment is allocated by creating a temporary file using the O_TMPFILE flag:

fd = open("/dev/shm", O_TMPFILE|O_RDWR|O_CLOEXEC|O_EXCL, 0666);

This call “cannot fail” as /dev/shm is used by glibc for shared memory objects, and must therefore be world writable on any glibc system. However, with the Chromium sandbox enabled, it returns EPERM.

Running Without a Sandbox

Now that the bug appears to be in the sandboxing code, we can re-test with the GPU sandbox disabled:

$ chromium --ignore-gpu-blacklist --disable-gpu-sandbox

And, indeed, without the sandbox getting in the way of allocating a shared memory segment, Chromium appears happy to use the Intel driver with DRI3.

Final Thoughts

I looked briefly at the Chromium sandbox code. It looks like it needs to know intimate details of the OpenGL implementation for every possible driver it runs on; it seems to contain a fixed list of all possible files and modes that the driver will pass to open(2). That seems incredibly fragile to me, especially when used in a general Linux desktop environment. Minor changes in how the GL driver operates can easily cause the browser to stop working.

Posted Tue Sep 30 23:51:25 2014 Tags: tags/fdo