Monday, February 09, 2009

Video and other APIs

Today I read a short article about video acceleration in X. First of all, I was already unhappy because I had to read. Personally, I think all the news sites should come in comic form, with as few captions as possible. Just like Perl, I'm a write-only entity. Then I read that Gallium isn't an option when it comes to video acceleration because it exposes a programmable pipeline, and I got dizzy.

So I got off the carousel and thought to myself: "Well, that's wrong", and apparently no one else got that. Clearly the whole "we're connected" thing is a lie, because it doesn't matter how vividly I think about stuff, others still don't see what's in my head (although to the person who sent me cheese when I was in Norway - that was extremely well done). So, cumbersome as it might be, I'm doing the whole "writing my thoughts down" thing. Look at me! I'm writing! No hands! sdsdfewwwfr vnjbnhm nhn. Hands back on.

I think the confusion stems from the fact that the main interface in Gallium is the context interface, which does in fact model the programmable pipeline. Because of the incredibly flexible nature of the programmable pipeline, a huge set of APIs is covered just by reusing the context interface. But modern GPUs still have some fixed-function parts that are not easily addressable through the programmable pipeline interface. Video is a great example of that. To a lesser degree, so is basic 2D acceleration (lesser because some modern GPUs don't have 2D engines at all anymore).

But, and it's a big but ("and I cannot lie" <- song reference, pointed out to let everyone know that I'm all about music), nothing stops us from adding interfaces that deal exclusively with the fixed-function parts of modern GPUs. In fact, it has already been done: work on a simple 2D interface has already started.

The basic idea is that state trackers which need some specific functionality use the given interface. For example, the Exa state tracker would use the Gallium 2D interface instead of the main context interface. The Gallium hardware driver then has a choice: it can either implement the given interface directly in hardware, or it can use the default implementation.

The default implementation is something Gallium will provide as part of the auxiliary libraries; it will use the main context interface to emulate the entire functionality of the other interface.

A video decoding framework would use the same semantics: one or more additional interfaces, with a default implementation on top of the 3D pipeline. Obviously some parts of video support are quite difficult to implement on top of the 3D pipeline, but that's the whole point: for hardware that supports it you get the whole shabangabang, and for hardware that doesn't you get a reasonable fallback. Plus, in the latter case the driver authors don't have to write a single line of hardware-specific code.

So a very nice project for someone would be to take VDPAU, VA-API or any video framework of your choice, implement a state tracker for that API on top of Gallium, and design the interface(s) that could be added to Gallium to implement the API in a way that makes full use of the fixed-function video units found in GPUs. I think this is the way our XvMC state tracker is heading.
This is the moment where we break into a song.


bestouff said...

Yeah, Michael isn't that sharp when it comes to reporting tech news :)

Anonymous said...

It seems many people like your writing style, except maybe for those disputing the revolutionary effect of wireless pancakes. You know who you are. Still, and this is just my humble opinion so feel free to discard it (but not in a hateful manner, more in the spirit of "Oh wow this is another nice one but unfortunately I'm running out of space for humble opinions, so you have to go."), it would be helpful if you could stop inserting medium-entertaining stories in your stories, like the one where you ended up with Paul Adams in a bathtub full of peanut butter, oh man... all these side notes make it kinda hard to figure out what you're trying to say. At least, I'd love (as in, be totally crazy for) to be able to do the whole information decomposition stuff (yeargs) with your blogs, as in: distinguish what's the interesting stuff and what's just random burps which are uninteresting to humor-resistant turds like me.

It's a little painful to scan through all the noise to find the interesting parts.

Younes Manton said...

That's what I had in mind with the video related extensions I brought up on the list--a pipe_video_context interface that can be implemented by fixed hardware, and some fallbacks in the auxiliary dir that do their job completely in software or using the 2D or 3D context, that the driver can use if it doesn't have the necessary hardware for a particular format or decoding stage.

Most of the stuff in the current XvMC state tracker would just be shuffled off into the auxiliary dir as a pipe_context-based MPEG2 fallback.

Anonymous said...

>It's a little painful to scan through all the noise to find the interesting parts.

It's not noise, it's music! You didn't get ANYTHING.

Anonymous said...

The fact that we all keep coming back to read Zack's blog despite the "noise" is proof that underneath it contains some excellent information. It's amazing how many Qt devs admit to checking this blog daily for updates...

Anyway, I'm a little hesitant about the "Gallium can handle video because you can extend it to handle video" position. You can do anything in software; any framework can be extended. The question is, what value does the Gallium framework add for video processing?

I can see why adding a video API would be beneficial for Tungsten, but why would an independent dev undertake the work?