Multi-Stream Transport 4k Monitors and X

I'm sure you've seen a 4k monitor on a friends desk running Mac OS X or Windows and are all ready to go get one so that you can use it under Linux.

Once you've managed to acquire one, I'm afraid you'll discover that when you plug it in, you're limited to 30Hz refresh rates at the full size, unless you're running a kernel that is version 3.17 or later. And then...

Good Grief! What Is My Computer Doing!

Ok, so now you're running version 3.17 and when X starts up, it's like you're using a gigantic version of Google Cardboard. Two copies of a very tall, but very narrow screen greets you.

Welcome to MST island.

In order to drive these giant new panels at full speed, there isn't enough bandwidth in the display hardware to individually paint each pixel once during each frame. So, like all good hardware engineers, they invented a clever hack.

This clever hack paints the screen in parallel. I'm assuming that they've got two bits of display hardware, each one hooked up to half of the monitor. Now, each paints only half of the pixels, avoiding costly redesign of expensive silicon, at least that's my surmise.

In the olden days, if you did this, you'd end up running two monitor cables to your computer, and potentially even having two video cards. Today, thanks to the magic of Display Port Multi-Stream Transport, we don't need all of that; instead, MST allows us to pack multiple cables-worth of data into a single cable.

I doubt the inventors of MST intended it to be used to split a single LCD panel into multiple "monitors", but hardware engineers are clever folk and are more than capable of abusing standards like this when it serves to save a buck.

Turning Two Back Into One

We've got lots of APIs that expose monitor information in the system, and across which we might be able to wave our magic abstraction wand to fix this:

  1. The KMS API. This is the kernel interface which is used by all graphics stuff, including user-space applications and the frame buffer console. Solve the problem here and it works everywhere automatically.

  2. The libdrm API. This is just the KMS ioctls wrapped in a simple C library. Fixing things here wouldn't make fbcons work, but would at least get all of the window systems working.

  3. Every 2D X driver. (Yeah, we're trying to replace all of these with the one true X driver). Fixing the problem here would mean that all X desktops would work. However, that's a lot of code to hack, so we'll skip this.

  4. The X server RandR code. More plausible than fixing every driver, this also makes X desktops work.

  5. The RandR library. If not in the X server itself, how about over in user space in the RandR protocol library? Well, the problem here is that we've now got two of them (Xlib and xcb), and the xcb one is auto-generated from the protocol descriptions. Not plausible.

  6. The Xinerama code in the X server. Xinerama is how we did multi-monitor stuff before RandR existed. These days, RandR provides Xinerama emulation, but we've been telling people to switch to RandR directly.

  7. Some new API. Awesome. Ok, so if we haven't fixed this in any existing API we control (kernel/libdrm/X.org), then we effectively dump the problem into the laps of the desktop and application developers. Given how long it's taken them to adopt current RandR stuff, providing yet another complication in their lives won't make them very happy.

All Our APIs Suck

Dave Airlie merged MST support into the kernel for version 3.17 in the simplest possible fashion -- pushing the problem out to user space. I was initially vaguely tempted to go poke at it and try to fix things there, but he eventually convinced me that it just wasn't feasible.

It turns out that all of our fancy new modesetting APIs describe the hardware in more detail than any application actually cares about. In particular, we expose a huge array of hardware objects:

  • Subconnectors
  • Connectors
  • Outputs
  • Video modes
  • Crtcs
  • Encoders

Each of these objects exposes intimate details about the underlying hardware -- which of them can work together, and which cannot; what kinds of limits are there on data rates and formats; and pixel-level timing details about blanking periods and refresh rates.

To make things work, some piece of code needs to actually hook things up, and explain to the user why the configuration they want just isn't possible.

The sticking point we reached was that when an MST monitor gets plugged in, it needs two CRTCs to drive it. If one of those is already in use by some other output, there's just no way you can steal it for MST mode.

Another problem -- we expose EDID data and actual video mode timings. Our MST monitor has two EDID blocks, one for each half. They happen to describe how they're related, and how you should configure them, but if we want to hide that from the application, we'll have to pull those EDID blocks apart and construct a new one. The same goes for video modes; we'll have to construct ones for MST mode.

Every single one of our APIs exposes enough of this information to be dangerous.

Every one, except Xinerama. All it talks about is a list of rectangles, each of which represents a logical view into the desktop. Did I mention we've been encouraging people to stop using this? And that some of them listened to us? Foolishly?

Dave's Tiling Property

Dave hacked up the X server to parse the EDID strings and communicate the layout information to clients through an output property. Then he hacked up the gnome code to parse that property and build a RandR configuration that would work.

Then, he changed to RandR Xinerama code to also parse the TILE properties and to fix up the data seen by application from that.

This works well enough to get a desktop running correctly, assuming that desktop uses Xinerama to fetch this data. Alas, gtk has been "fixed" to use RandR if you have RandR version 1.3 or later. No biscuit for us today.

Adding RandR Monitors

RandR doesn't have enough data types yet, so I decided that what we wanted to do was create another one; maybe that would solve this problem.

Ok, so what clients mostly want to know is which bits of the screen are going to be stuck together and should be treated as a single unit. With current RandR, that's some of the information included in a CRTC. You pull the pixel size out of the associated mode, physical size out of the associated outputs and the position from the CRTC itself.

Most of that information is available through Xinerama too; it's just missing physical sizes and any kind of labeling to help the user understand which monitor you're talking about.

The other problem with Xinerama is that it cannot be configured by clients; the existing RandR implementation constructs the Xinerama data directly from the RandR CRTC settings. Dave's Tiling property changes edit that data to reflect the union of associated monitors as a single Xinerama rectangle.

Allowing the Xinerama data to be configured by clients would fix our 4k MST monitor problem as well as solving the longstanding video wall, WiDi and VNC troubles. All of those want to create logical monitor areas within the screen under client control

What I've done is create a new RandR datatype, the "Monitor", which is a rectangular area of the screen which defines a rectangular region of the screen. Each monitor has the following data:

  • Name. This provides some way to identify the Monitor to the user. I'm using X atoms for this as it made a bunch of things easier.

  • Primary boolean. This indicates whether the monitor is to be considered the "primary" monitor, suitable for placing toolbars and menus.

  • Pixel geometry (x, y, width, height). These locate the region within the screen and define the pixel size.

  • Physical geometry (width-in-millimeters, height-in-millimeters). These let the user know how big the pixels will appear in this region.

  • List of outputs. (I think this is the clever bit)

There are three requests to define, delete and list monitors. And that's it.

Now, we want the list of monitors to completely describe the environment, and yet we don't want existing tools to break completely. So, we need some way to automatically construct monitors from the existing RandR state while still letting the user override portions of it as needed to explain virtual or tiled outputs.

So, what I did was to let the client specify a list of outputs for each monitor. All of the CRTCs which aren't associated with an output in any client-defined monitor are then added to the list of monitors reported back to clients. That means that clients need only define monitors for things they understand, and they can leave the other bits alone and the server will do something sensible.

The second tricky bit is that if you specify an empty rectangle at 0,0 for the pixel geometry, then the server will automatically compute the geometry using the list of outputs provided. That means that if any of those outputs get disabled or reconfigured, the Monitor associated with them will appear to change as well.

Current Status

Gtk+ has been switched to use RandR for RandR versions 1.3 or later. Locally, I hacked libXrandr to override the RandR version through an environment variable, set that to 1.2 and Gtk+ happily reverts back to Xinerama and things work fine. I suspect the plan here will be to have it use the new Monitors when present as those provide the same info that it was pulling out of RandR's CRTCs.

KDE appears to still use Xinerama data for this, so it "just works".

Where's the code

As usual, all of the code for this is in a collection of git repositories in my home directory on fd.o:

git://people.freedesktop.org/~keithp/randrproto master
git://people.freedesktop.org/~keithp/libXrandr master
git://people.freedesktop.org/~keithp/xrandr master
git://people.freedesktop.org/~keithp/xserver randr-monitors

RandR protocol changes

Here's the new sections added to randrproto.txt

                  ❧❧❧❧❧❧❧❧❧❧❧

1.5. Introduction to version 1.5 of the extension

Version 1.5 adds monitors

 • A 'Monitor' is a rectangular subset of the screen which represents
   a coherent collection of pixels presented to the user.

 • Each Monitor is be associated with a list of outputs (which may be
   empty).

 • When clients define monitors, the associated outputs are removed from
   existing Monitors. If removing the output causes the list for that
   monitor to become empty, that monitor will be deleted.

 • For active CRTCs that have no output associated with any
   client-defined Monitor, one server-defined monitor will
   automatically be defined of the first Output associated with them.

 • When defining a monitor, setting the geometry to all zeros will
   cause that monitor to dynamically track the bounding box of the
   active outputs associated with them

This new object separates the physical configuration of the hardware
from the logical subsets  the screen that applications should
consider as single viewable areas.

1.5.1. Relationship between Monitors and Xinerama

Xinerama's information now comes from the Monitors instead of directly
from the CRTCs. The Monitor marked as Primary will be listed first.

                  ❧❧❧❧❧❧❧❧❧❧❧

5.6. Protocol Types added in version 1.5 of the extension

MONITORINFO { name: ATOM
          primary: BOOL
          automatic: BOOL
          x: INT16
          y: INT16
          width: CARD16
          height: CARD16
          width-in-millimeters: CARD32
          height-in-millimeters: CARD32
          outputs: LISTofOUTPUT }

                  ❧❧❧❧❧❧❧❧❧❧❧

7.5. Extension Requests added in version 1.5 of the extension.

┌───
    RRGetMonitors
    window : WINDOW
     ▶
    timestamp: TIMESTAMP
    monitors: LISTofMONITORINFO
└───
    Errors: Window

    Returns the list of Monitors for the screen containing
    'window'.

    'timestamp' indicates the server time when the list of
    monitors last changed.

┌───
    RRSetMonitor
    window : WINDOW
    info: MONITORINFO
└───
    Errors: Window, Output, Atom, Value

    Create a new monitor. Any existing Monitor of the same name is deleted.

    'name' must be a valid atom or an Atom error results.

    'name' must not match the name of any Output on the screen, or
    a Value error results.

    If 'info.outputs' is non-empty, and if x, y, width, height are all
    zero, then the Monitor geometry will be dynamically defined to
    be the bounding box of the geometry of the active CRTCs
    associated with them.

    If 'name' matches an existing Monitor on the screen, the
    existing one will be deleted as if RRDeleteMonitor were called.

    For each output in 'info.outputs, each one is removed from all
    pre-existing Monitors. If removing the output causes the list of
    outputs for that Monitor to become empty, then that Monitor will
    be deleted as if RRDeleteMonitor were called.

    Only one monitor per screen may be primary. If 'info.primary'
    is true, then the primary value will be set to false on all
    other monitors on the screen.

    RRSetMonitor generates a ConfigureNotify event on the root
    window of the screen.

┌───
    RRDeleteMonitor
    window : WINDOW
    name: ATOM
└───
    Errors: Window, Atom, Value

    Deletes the named Monitor.

    'name' must be a valid atom or an Atom error results.

    'name' must match the name of a Monitor on the screen, or a
    Value error results.

    RRDeleteMonitor generates a ConfigureNotify event on the root
    window of the screen.

                  ❧❧❧❧❧❧❧❧❧❧❧