Ok, so I didn't get a lot of time for coding last week. And this week there's OSCON, so coding time will be short again. I figured I should spend some time writing up a brief report on where the X output work stands today.
Output Hotplug
I think this stuff is fairly solid these days, although we don't have much in the way of auto-detection of monitor connect/disconnect. There are two reasons here:
The hardware notifies the operating system via an interrupt. With the mode setting code living in user space, dealing with interrupts is a huge pain, so this hasn't been hooked up yet (see the kernel mode setting section below).
Analog outputs (VGA, TV) detect a monitor by sensing impedance changes in the output signal path, which means the output has to be kept active if we want to notice a new connection. That takes a lot of power (about 1W to light up the VGA output with no monitor connected). What we could do is detect when a monitor gets unplugged; since the output is already running at that point, that detection is free.
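In the meantime, userspace can still notice connect/disconnect by polling through RandR 1.2, at the cost of triggering exactly the analog load detection described above. A minimal sketch (illustrative only; a real desktop would want proper error handling and a sensible interval):

    /* Sketch: poll RandR output connection state from userspace.
     * Illustrative only; note that re-probing analog outputs forces
     * the expensive load-detect path described above. */
    #include <stdio.h>
    #include <unistd.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/Xrandr.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        if (!dpy)
            return 1;
        Window root = DefaultRootWindow(dpy);

        for (;;) {
            XRRScreenResources *res = XRRGetScreenResources(dpy, root);
            for (int i = 0; i < res->noutput; i++) {
                XRROutputInfo *out = XRRGetOutputInfo(dpy, res, res->outputs[i]);
                printf("%s: %s\n", out->name,
                       out->connection == RR_Connected ? "connected" :
                       out->connection == RR_Disconnected ? "disconnected" :
                       "unknown");
                XRRFreeOutputInfo(out);
            }
            XRRFreeScreenResources(res);
            sleep(5);   /* poll every few seconds */
        }
        return 0;
    }

Build it against libXrandr (cc poll.c -lX11 -lXrandr).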
There are a few other random improvements that are coming soon, like CEA additions to the EDID parsing code. These are additional data blocks that follow the standard EDID data and are used for 'consumer electronics' devices. Supporting these should make more HDMI monitors 'just work'.
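These extension blocks are easy to spot in the raw EDID bytes: the 128-byte base block stores an extension count at byte 126, and a CEA-861 block announces itself with the tag 0x02. Here's a hedged sketch of walking them; this is illustrative code, not the server's actual parser:

    /* Sketch: locate CEA-861 extension blocks in raw EDID data.
     * EDID layout: a 128-byte base block; byte 126 holds the number of
     * 128-byte extension blocks that follow; a CEA-861 extension
     * starts with the tag byte 0x02. */
    #include <stdint.h>
    #include <stdio.h>

    #define EDID_BLOCK_SIZE 128
    #define CEA_EXT_TAG     0x02

    static void
    find_cea_blocks(const uint8_t *edid, int total_len)
    {
        if (total_len < EDID_BLOCK_SIZE)
            return;

        int n_ext = edid[126];          /* extension count in base block */

        for (int i = 1; i <= n_ext; i++) {
            const uint8_t *ext = edid + i * EDID_BLOCK_SIZE;
            if ((i + 1) * EDID_BLOCK_SIZE > total_len)
                break;                  /* truncated EDID */
            if (ext[0] == CEA_EXT_TAG)
                printf("CEA-861 extension at block %d (revision %d)\n",
                       i, ext[1]);
        }
    }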
Initial Mode Selection
Detecting connected monitors is fine, but one thing we haven't really solved is what to do when you have more than one connected when the server starts. My initial code would pick one 'primary' monitor, light that up at its preferred size and then pick modes for the other monitors which were as close as possible to the primary monitor size without being larger. Obviously, I liked that as it meant my laptop always came up looking correct on the LVDS and my external VGA would show most of the screen.
However, this was reported to confuse a lot of users. I can imagine that starting the X server with one of the outputs connected but not turned on would make for some 'interesting' support calls. So, now the X server picks a mode which all outputs can support and uses that everywhere. Sadly, this means that my laptop panel gets some random scaled mode (usually 1024x768) which looks quite awful.
I think we need something better than either of these choices, but I'm not quite sure how it should work.
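For concreteness, the current behaviour amounts to something like the sketch below: take one output's mode list and keep a mode that every other connected output can also drive. The types are simplified stand-ins rather than the server's real xf86 CRTC structures, and picking the largest such mode by area is my simplification of the policy:

    /* Simplified sketch of the 'mode every output supports' policy;
     * the types here are stand-ins, not the real server structures. */
    #include <stddef.h>

    struct mode   { int width, height; };
    struct output { struct mode *modes; int n_modes; };

    static int
    output_supports(const struct output *o, const struct mode *m)
    {
        for (int i = 0; i < o->n_modes; i++)
            if (o->modes[i].width == m->width &&
                o->modes[i].height == m->height)
                return 1;
        return 0;
    }

    /* Return the largest mode (by area) that every connected output
     * can drive, or NULL if they share nothing at all. */
    static const struct mode *
    pick_common_mode(const struct output *outputs, int n_outputs)
    {
        const struct mode *best = NULL;

        if (n_outputs == 0)
            return NULL;

        for (int i = 0; i < outputs[0].n_modes; i++) {
            const struct mode *m = &outputs[0].modes[i];
            int ok = 1;
            for (int j = 1; j < n_outputs && ok; j++)
                ok = output_supports(&outputs[j], m);
            if (ok && (!best ||
                       m->width * m->height > best->width * best->height))
                best = m;
        }
        return best;
    }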
Kernel Mode Setting
A bunch of people, including Jesse Barnes and Dave Airlie, have been hacking to move the output configuration code into the kernel. This will solve lots of little problems, like how to display kernel panic messages, and how to deal with interrupts for output hotplug.
This code is up and running fairly well these days, but depends on a kernel memory manager to deal with frame buffers. The integration of GEM into the kernel is blocking this work, but I'm hopeful that this will be sorted out in the next couple of weeks.
GEM -- the Graphics Execution Manager
Work here was stalled for a few weeks while we sorted out memory channel interleaving issues. Now things are moving again, and we're working on getting it stable enough to merge into master. That means fixing a few more critical bugs that the Intel QA team has identified.
One of these bugs is that our GL conformance tests weren't working right; that turned out to be caused by the tests reading back data from the frame buffer one pixel at a time. Our read-back path goes through the GEM memory domain code to pull objects back from GTT space to CPU space, which means flushing the front, back and depth buffers from the CPU cache. With each of those buffers at 16MB, reading a single pixel took long enough that the tests would time out. Increasing the timeouts to 'way too long' gets them running, but tests that used to complete in a few hours now take days.
We've got two different plans for fixing the read-back path:
Use pread to access precisely the data we need. This would involve flushing a single cache line for the tests above.
Map the back buffer through the GTT. This would eliminate the need to clflush anything, as GTT mappings are write combining and reads bypass the CPU cache.
Eric is working on the former, and I'm working on the latter. More news later (this week?) when we see which one wins.
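Here's roughly what the pread path looks like from a client's point of view. This is a sketch against the in-progress GEM ioctl interface (DRM_IOCTL_I915_GEM_PREAD), which may well change before it's merged, so treat the details as illustrative:

    /* Sketch: read a single pixel out of a GEM object with pread, so
     * the kernel only has to make the touched cache lines coherent
     * instead of clflushing a whole 16MB buffer. */
    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/i915_drm.h>   /* header path depends on your libdrm/kernel headers */

    static int
    read_pixel(int drm_fd, uint32_t bo_handle, uint32_t pitch,
               int x, int y, uint32_t *pixel)
    {
        struct drm_i915_gem_pread pread;

        memset(&pread, 0, sizeof pread);
        pread.handle   = bo_handle;
        pread.offset   = (uint64_t)y * pitch + (uint64_t)x * 4;  /* assuming 32bpp */
        pread.size     = sizeof *pixel;
        pread.data_ptr = (uint64_t)(uintptr_t)pixel;

        return ioctl(drm_fd, DRM_IOCTL_I915_GEM_PREAD, &pread);
    }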
Composite Acceleration
With Owen Taylor's change to the glyph management code in the server, Eric and Carl were able to change the driver to batch multiple glyph drawing operations into a single command buffer. Once Carl had this working, we went from 13000 glyphs/sec to 103000 glyphs/sec. Obviously we're hoping for even larger improvements, as a pure software path manages well over 1 million glyphs/sec. Even so, 103000 glyphs/sec is enough to make my desktop vastly more usable, and falling back to the software path would mean losing a lot of other useful acceleration.
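The batching idea itself is simple: rather than emitting one composite operation (and one hardware submission) per glyph, queue a run of glyphs and flush them in a single command buffer. A rough sketch, with made-up names rather than the driver's actual entry points:

    /* Conceptual sketch of glyph batching. All names here are
     * hypothetical; the real code lives in the driver's glyph paths. */
    #define GLYPH_BATCH_MAX 128

    struct glyph_op {
        int src_x, src_y;   /* position in the glyph cache */
        int dst_x, dst_y;   /* position on screen */
        int width, height;
    };

    static struct glyph_op batch[GLYPH_BATCH_MAX];
    static int batch_count;

    /* Hypothetical hook that writes the queued blits into one command
     * buffer and hands it to the hardware. */
    extern void submit_glyph_commands(const struct glyph_op *ops, int count);

    static void
    flush_glyphs(void)
    {
        if (batch_count) {
            submit_glyph_commands(batch, batch_count);
            batch_count = 0;
        }
    }

    static void
    queue_glyph(const struct glyph_op *op)
    {
        if (batch_count == GLYPH_BATCH_MAX)
            flush_glyphs();     /* batch full: submit what we have */
        batch[batch_count++] = *op;
    }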
DRI2 -- Redirected Direct Rendering
Right now, direct-rendered GL applications (direct rendering being the fastest way we can do GL at present) draw into a giant screen-sized back buffer, which is copied to the screen at buffer-swap time. Because everyone shares that back buffer, your drawing has to be clipped as if you were drawing directly to the screen. This normally doesn't matter much (aside from some performance cost from lots of clip rectangles), but when you're running a compositing manager (like compiz), the 3D applications end up ignoring the per-window offscreen pixmap and spamming their output directly onto the real frame buffer.
DRI2, written by Kristian Høgsberg, solves this by changing how direct rendering works and giving everyone a private back buffer to draw to. Now, at buffer-swap time, that private back buffer can be copied to the window's pixmap and compiz is happy.
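The difference at swap time is easiest to see side by side. Everything in the sketch below is a hypothetical placeholder (types and helpers alike), just to contrast the two schemes:

    /* Conceptual contrast between the old shared back buffer and a
     * DRI2-style private back buffer at SwapBuffers time. */
    struct buffer;                    /* some chunk of video memory */
    struct region;                    /* a clip list */

    struct window {
        struct buffer *private_back;  /* per-window back buffer (DRI2) */
        struct buffer *pixmap;        /* redirected window pixmap */
        struct region *clip;          /* visible area of the window */
    };

    extern struct buffer *shared_back;    /* old: one screen-sized buffer */
    extern struct buffer *front;          /* the real scanout buffer */

    extern void copy_region(struct buffer *src, struct buffer *dst,
                            struct region *clip);
    extern void copy_all(struct buffer *src, struct buffer *dst);

    /* Old scheme: everyone draws into one shared back buffer, clipped
     * as if drawing to the screen, and the swap lands on the real
     * front buffer -- behind the compositing manager's back. */
    static void
    old_swap(struct window *w)
    {
        copy_region(shared_back, front, w->clip);
    }

    /* DRI2 scheme: the application rendered into its own back buffer,
     * and the swap copies it into the window's redirected pixmap,
     * where compiz can composite it like any other window. */
    static void
    dri2_swap(struct window *w)
    {
        copy_all(w->private_back, w->pixmap);
    }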
This work has been around for a few months, but depends on a TTM-based memory manager. That dependency isn't very strong, and krh has promised to fix it shortly. Once that's done, getting the GEM driver to support DRI2 won't take long, and we'll have our fully composited desktop running. With luck, that'll happen before September.
Final Words
As you can see, we're nearing the end of our long X output rework saga, with most of the pieces falling into place in the next month or two.