Two facts, one you didn't need to know, other that you should know. I'm the World Champion in being Vegetarian. Did you know that? Of course not, you probably didn't even know it's a competition. But it is. And I won it. My favorite vegetable? Nachos.
Technically they're a fourth of a fifth generation vegetable. But vegetables are like software, you never want to work with them on the first iteration. Like you wouldn't want to eat corn, but mash it, broil/fry it and you got something magical going on in the form of tortilla chips.
The other fact: OpenGL ES state trackers for Gallium just went public. That includes both OpenGL ES 1.x and 2.x. Brian just pushed them into opengl-es branch. Together with OpenVG they should be part of the Mesa3D 7.6 release.
At this point Mesa3D almost becomes the Khronos SDK and hopefully soon we'll add an OpenCL state tracker and start working on bumping the OpenGL code to version 3.1.
Friday, May 15, 2009
Sunday, May 10, 2009
OpenVG Release
I've been procrastinating lately. Where I define procrastination as: doing actual work, instead of writing blog entries. "Can you do that? Redefine words because you feel like it?", oh, sure you can. As long as you don't care about anyone understanding what you're saying, then it's perfectly fine. In fact it's good for you, the less people understands you, the more likely it is that they'll classify you as a genius. Life is awesome like that. That's right, you come here for technical stuff and get blessed with some serious soul searching stuff, enjoy.
We have released the OpenVG state tracker for Gallium.
It's a complete implementation of OpenVG 1.0. While my opinion on 2D graphics and long term usefulness of OpenVG changed drastically during the last year I'm very happy we were able to release the code. After the Mesa3D 7.5 we're going to merge it into master and 7.6 will be the first release that includes both OpenGL and OpenVG.

[You're exuberant about that]. I figured I'll start including stage directions for the blog so that you know how you feel about reading this.
I don't have much to write about the implementation itself. If you have a Gallium driver, then you're good to go and have accelerated 2d and 3d. It's great to see more state trackers being released for Gallium and this whole concept of "accelerating multiple APIs" on top of one driver architecture becoming a reality. [You love it and with a smile, wave bye]
We have released the OpenVG state tracker for Gallium.
It's a complete implementation of OpenVG 1.0. While my opinion on 2D graphics and long term usefulness of OpenVG changed drastically during the last year I'm very happy we were able to release the code. After the Mesa3D 7.5 we're going to merge it into master and 7.6 will be the first release that includes both OpenGL and OpenVG.

[You're exuberant about that]. I figured I'll start including stage directions for the blog so that you know how you feel about reading this.
I don't have much to write about the implementation itself. If you have a Gallium driver, then you're good to go and have accelerated 2d and 3d. It's great to see more state trackers being released for Gallium and this whole concept of "accelerating multiple APIs" on top of one driver architecture becoming a reality. [You love it and with a smile, wave bye]
Monday, March 23, 2009
Intermediate representation
I've spent the last two weeks working from our office in London. The office is right behind Tate Modern with a great view of London which is really nice. While there I was mainly trying to get the compilation of shaders rock solid for one of our drivers.
In Gallium3D currently we're using TGSI intermediate representation. It's a fairly neat representation with instruction semantics matching those from relevant GL extensions/specs. If you're familiar with D3D you know that there are subtle differences between seemingly similar instructions in both APIs. Modern hardware tends to follow the D3D semantics more closely than the GL ones.
Some of the differences include:
Especially because there's a lot generic transformations that really should be done on the IR itself. A good example of that includes code injections due to missing pieces of the API state on a given GPU, e.g. shadow texturing, including the compare mode, compare function and the shadow ambient value, which if the samplers on the given piece of hardware don't support need to be emulated with code injections after each sampling operation which utilizes the sampler that had the above mentioned state set on it.
Unfortunately right now the LLVM code in Gallium is far from finished. Due to looming deadlines on various projects I never got to spend the amount of time that is required to get in shape.
I'll be able to piggy back some of the LLVM work on top of the OpenCL state tracker code which is exciting.
Also, has anyone noticed that the last two blogs had no asides or any humor in them? I'm experimenting with "just what I want to say and nothing more" style of writing, wondering if it indeed, will be easier to read.
In Gallium3D currently we're using TGSI intermediate representation. It's a fairly neat representation with instruction semantics matching those from relevant GL extensions/specs. If you're familiar with D3D you know that there are subtle differences between seemingly similar instructions in both APIs. Modern hardware tends to follow the D3D semantics more closely than the GL ones.
Some of the differences include:
- Indirect addressing offsets, in GL offsets are between -64 to +63, while D3D wants them >= 0.
- Output to temporaries. In D3D certain instructions can only output to temporary registers while TGSI doesn't have that requirement.
- Branching. D3D has both if_comp and setp instructions which can perform essentially any of the comparisons, while we immitate them with a SLT (set on less than) or similar with a IF instruction with semantics of the one in GL_NV_fragment_program2.
- Looping.
Especially because there's a lot generic transformations that really should be done on the IR itself. A good example of that includes code injections due to missing pieces of the API state on a given GPU, e.g. shadow texturing, including the compare mode, compare function and the shadow ambient value, which if the samplers on the given piece of hardware don't support need to be emulated with code injections after each sampling operation which utilizes the sampler that had the above mentioned state set on it.
Unfortunately right now the LLVM code in Gallium is far from finished. Due to looming deadlines on various projects I never got to spend the amount of time that is required to get in shape.
I'll be able to piggy back some of the LLVM work on top of the OpenCL state tracker code which is exciting.
Also, has anyone noticed that the last two blogs had no asides or any humor in them? I'm experimenting with "just what I want to say and nothing more" style of writing, wondering if it indeed, will be easier to read.
Thursday, March 12, 2009
KDE graphics benchmarks
This is really a public service announcement. KDE folks please stop writing "graphics benchmarks". It's especially pointless if you quantify your blog/article with "I'm not a graphics person but...".
What you're doing is:
This is completely and utterly wrong.
I'll give you an analogy that should make it a lot easier to understand. Lets say you have a 1MB and a 100MB lan and you want to write a benchmark to compare how fast you can download a 1GB file on both of those, so you do:
Do you see the problem? Obviously the file hasn't been downloaded by the time you stopped the timer. What you effectively measured is the speed at which you can execute function calls.
And yes, while your original suspicion that the 100MB line is a lot faster is still likely true, your test in no way proves that. In fact it does nothing short of making some poor individuals very sad due to the state of computer science. Also "So what that the test is wrong, it still feels faster" is not a valid excuse, because the whole point is that the test is wrong.
To give your tests some substance always make your applications run for at least a few seconds reporting the frames per second.
Or even better don't write them. Somewhere out there, some people, who actually know what's going on, have those tests written. And those people, who just happen to be "graphics people" have reasons for making certain things default. So while you may think you've made this incredible discovery, you really haven't. There's a kde-graphics mailing list where you can pose graphics question.
So to the next person who wants to write a KDE graphics related blog/article please, please go through kde-graphics mailing list first.
What you're doing is:
timer start
issue some draw calls
timer stop
This is completely and utterly wrong.
I'll give you an analogy that should make it a lot easier to understand. Lets say you have a 1MB and a 100MB lan and you want to write a benchmark to compare how fast you can download a 1GB file on both of those, so you do:
timer start
start download of a huge file
timer stop
Do you see the problem? Obviously the file hasn't been downloaded by the time you stopped the timer. What you effectively measured is the speed at which you can execute function calls.
And yes, while your original suspicion that the 100MB line is a lot faster is still likely true, your test in no way proves that. In fact it does nothing short of making some poor individuals very sad due to the state of computer science. Also "So what that the test is wrong, it still feels faster" is not a valid excuse, because the whole point is that the test is wrong.
To give your tests some substance always make your applications run for at least a few seconds reporting the frames per second.
Or even better don't write them. Somewhere out there, some people, who actually know what's going on, have those tests written. And those people, who just happen to be "graphics people" have reasons for making certain things default. So while you may think you've made this incredible discovery, you really haven't. There's a kde-graphics mailing list where you can pose graphics question.
So to the next person who wants to write a KDE graphics related blog/article please, please go through kde-graphics mailing list first.
Monday, February 09, 2009
Video and other APIs
I read today a short article about video acceleration in X. First of all I was already unhappy because I had to read. Personally I think all the news sites should come in a comic form. With as few captions as possible. Just like Perl I'm a write-only entity. Then I read that Gallium isn't an option when it comes to video acceleration because it exposes programmable pipeline and I got dizzy.
So I got off the carousel and I thought to myself: "Well, that's wrong", and apparently no one else got that. Clearly the whole "we're connected" thing is a lie because it doesn't matter how vividly I think about stuff, others still don't see what's in my head (although to the person who sent me cheese when I was in Norway - that was extremely well done). So cumbersome as it might be I'm doing the whole "writing my thoughts down" thing. Look at me! I'm writing! No hands! sdsdfewwwfr vnjbnhm nhn. Hands back on.
I think the confusion stems from the fact that the main interface in Gallium is the context interface, which does in fact model the programmable pipeline. Because of the incredibly flexible nature of the programmable pipeline a huge set of APIs is covered just by reusing the context interface. But in modern GPUs there are still some fixed function parts that are not easily addressable by the programmable pipeline interface. Video is a great example of that. To a lesser degree so is basic 2D acceleration (lesser because some of the modern GPUs don't have 2D engines at all anymore).
But, and it's a big but ("and I can not lie" <- song reference, pointing out to let everyone know that I'm all about music) nothing stops us from adding interfaces which deal exclusively with the fixed function parts of modern GPUs. In fact it has already been done as the work on a simple 2D interface has already started.
Basic idea is that the state trackers which need some specific functionality use the given interface. For example the Exa state tracker would use the Gallium 2d interface, instead of the main context interface. In this case the Gallium hardware driver will have a choice: it can either implement the given interface directly in hardware, or it can use the default implementation.
The default implementation is something Gallium will provide as part of the auxiliary libraries. The default implementation will use the main context interface to emulate the entire functionality of the other interface.
Video decoding framework would use the same semantics. So it would be an additional interface(s) with default implementation on top of the 3D pipeline. Obviously some parts of the video support are quite difficult to implement on top of the 3D pipeline but the whole point of this is that: for hardware that supports it you get the whole shabangabang, for hardware that doesn't you get a reasonable fallback. Plus in the latter case the driver authors don't have to write a single line of hardware specific code.
So a very nice project for someone would be to take VDPAU, VA-API or any video framework of your choice and implement a state tracker for that API on top of Gallium and design an interface(s) that could be added to Gallium to implement the API in a way that makes full usage of the fixed functionality video units found in the GPUs. I think this is the way our XvMC state tracker is heading.
This is the moment where we break into a song.
So I got off the carousel and I thought to myself: "Well, that's wrong", and apparently no one else got that. Clearly the whole "we're connected" thing is a lie because it doesn't matter how vividly I think about stuff, others still don't see what's in my head (although to the person who sent me cheese when I was in Norway - that was extremely well done). So cumbersome as it might be I'm doing the whole "writing my thoughts down" thing. Look at me! I'm writing! No hands! sdsdfewwwfr vnjbnhm nhn. Hands back on.
I think the confusion stems from the fact that the main interface in Gallium is the context interface, which does in fact model the programmable pipeline. Because of the incredibly flexible nature of the programmable pipeline a huge set of APIs is covered just by reusing the context interface. But in modern GPUs there are still some fixed function parts that are not easily addressable by the programmable pipeline interface. Video is a great example of that. To a lesser degree so is basic 2D acceleration (lesser because some of the modern GPUs don't have 2D engines at all anymore).
But, and it's a big but ("and I can not lie" <- song reference, pointing out to let everyone know that I'm all about music) nothing stops us from adding interfaces which deal exclusively with the fixed function parts of modern GPUs. In fact it has already been done as the work on a simple 2D interface has already started.
Basic idea is that the state trackers which need some specific functionality use the given interface. For example the Exa state tracker would use the Gallium 2d interface, instead of the main context interface. In this case the Gallium hardware driver will have a choice: it can either implement the given interface directly in hardware, or it can use the default implementation.
The default implementation is something Gallium will provide as part of the auxiliary libraries. The default implementation will use the main context interface to emulate the entire functionality of the other interface.
Video decoding framework would use the same semantics. So it would be an additional interface(s) with default implementation on top of the 3D pipeline. Obviously some parts of the video support are quite difficult to implement on top of the 3D pipeline but the whole point of this is that: for hardware that supports it you get the whole shabangabang, for hardware that doesn't you get a reasonable fallback. Plus in the latter case the driver authors don't have to write a single line of hardware specific code.
So a very nice project for someone would be to take VDPAU, VA-API or any video framework of your choice and implement a state tracker for that API on top of Gallium and design an interface(s) that could be added to Gallium to implement the API in a way that makes full usage of the fixed functionality video units found in the GPUs. I think this is the way our XvMC state tracker is heading.
This is the moment where we break into a song.
Wednesday, February 04, 2009
Latest changes
I actually went through all my blog entries and removed spam. That means that you won't be able to find anymore links to stuff that can enlarge your penis. I hope this action will not shatter your lives and you'll find consolation in all the spam that you're getting via email anyway. And if not I saved some of the links. You never know, I say.
I also changed the template, I'd slap something ninja related on it, but I don't have anything that fits. Besides nowadays everyone is a graphics ninja. I'm counting hours until the term will be added to the dictionary. So my new nickname will be "The Lost Son of Norway, Duke of Poland and King of England". Aim high is my motto.
As a proud owner of exactly zero babies I got lots of time to think about stuff. Mostly about squirrels, goats and the letter q. So I wanted to talk about some of the things I've been thinking about lately.
Our friend ("our" as in the KDE communities, if you're not a part of it, then obviously not your friend, in fact he told me he doesn't like you at all) Ignacio CastaƱo has a very nice blog entry about 10 things one might do with tessellation.
The graphics pipeline continues evolving and while reading Ignacio's entry I realized that we haven't been that good about communicating the evolution of Gallium3D.
So here we go.
I've been slowly working towards support for geometry shaders in Gallium3D. Interface wise the changes are quite trivial, but a little bigger issue is that some (quite modern) hardware, while perfectly capable of emitting geometry in the pipeline is not quite capable of actually supporting all of the features of geometry shader extension. The question of how to handle that is an interesting one, because just simple emission of geometry via a shader is a very desirable feature (for example path rendering in OpenVG would profit from that).
I've been doing some small API cleanups lately. Jose made some great changes to the concept of surfaces, which became pure views on textures. As a follow up, over the last few days we have disassociated buffers from them, to make it really explicit. It gives drivers the opportunity to optimize a few things and with some changes Michel is working on avoid some redundant copies.
A lot of work went into winsys. Winsys, which is a little misnomer, was a part of Gallium that did too much. It was supposed to be a resource manager, handle command submission and handle integration with windowing systems and OS'es. We've been slowly chopping parts of it away. Making it a lot smaller and over the weekend managed to hide it completely from the state tracker side.
Keith extracted the Xlib and DRI code from winsys and put it into separate state trackers. Meaning that just like WGL state tracker, the code is actually sharable between all the drivers. That is great news.
Brian has been fixing and implementing so many bugs/features that people should start writing folk songs about him. Just the fact that we now support GL_ARB_framebuffer_object deserves at least a poem (not a big one, but certainly none of that white, non-rhyming stuff, we're talking full fledged rhymes and everything... You can tell that I know a lot about poetry can't you)
One thing that never got a lot of attention is that Thomas (who did get one of them baby thingies lately) released his window systems buffer manager code.
Another thing that didn't get a lot of attention is Alan's xf86-video-modesetting driver. It's a dummy X11 driver that uses DRM for modesetting and Gallium3D for graphics acceleration. Because of that it's hardware independent, meaning that all hardware that has a DRM driver and Gallium3D driver automatically works and is accelerated under X11. Very neat stuff.
Alright, I feel like I'm cutting into your youtube/facebook time and like all "Lost Sons of Norway, Dukes of Poland and Kings of England" I know my place, so that's it.
I also changed the template, I'd slap something ninja related on it, but I don't have anything that fits. Besides nowadays everyone is a graphics ninja. I'm counting hours until the term will be added to the dictionary. So my new nickname will be "The Lost Son of Norway, Duke of Poland and King of England". Aim high is my motto.
As a proud owner of exactly zero babies I got lots of time to think about stuff. Mostly about squirrels, goats and the letter q. So I wanted to talk about some of the things I've been thinking about lately.
Our friend ("our" as in the KDE communities, if you're not a part of it, then obviously not your friend, in fact he told me he doesn't like you at all) Ignacio CastaƱo has a very nice blog entry about 10 things one might do with tessellation.
The graphics pipeline continues evolving and while reading Ignacio's entry I realized that we haven't been that good about communicating the evolution of Gallium3D.
So here we go.
I've been slowly working towards support for geometry shaders in Gallium3D. Interface wise the changes are quite trivial, but a little bigger issue is that some (quite modern) hardware, while perfectly capable of emitting geometry in the pipeline is not quite capable of actually supporting all of the features of geometry shader extension. The question of how to handle that is an interesting one, because just simple emission of geometry via a shader is a very desirable feature (for example path rendering in OpenVG would profit from that).
I've been doing some small API cleanups lately. Jose made some great changes to the concept of surfaces, which became pure views on textures. As a follow up, over the last few days we have disassociated buffers from them, to make it really explicit. It gives drivers the opportunity to optimize a few things and with some changes Michel is working on avoid some redundant copies.
A lot of work went into winsys. Winsys, which is a little misnomer, was a part of Gallium that did too much. It was supposed to be a resource manager, handle command submission and handle integration with windowing systems and OS'es. We've been slowly chopping parts of it away. Making it a lot smaller and over the weekend managed to hide it completely from the state tracker side.
Keith extracted the Xlib and DRI code from winsys and put it into separate state trackers. Meaning that just like WGL state tracker, the code is actually sharable between all the drivers. That is great news.
Brian has been fixing and implementing so many bugs/features that people should start writing folk songs about him. Just the fact that we now support GL_ARB_framebuffer_object deserves at least a poem (not a big one, but certainly none of that white, non-rhyming stuff, we're talking full fledged rhymes and everything... You can tell that I know a lot about poetry can't you)
One thing that never got a lot of attention is that Thomas (who did get one of them baby thingies lately) released his window systems buffer manager code.
Another thing that didn't get a lot of attention is Alan's xf86-video-modesetting driver. It's a dummy X11 driver that uses DRM for modesetting and Gallium3D for graphics acceleration. Because of that it's hardware independent, meaning that all hardware that has a DRM driver and Gallium3D driver automatically works and is accelerated under X11. Very neat stuff.
Alright, I feel like I'm cutting into your youtube/facebook time and like all "Lost Sons of Norway, Dukes of Poland and Kings of England" I know my place, so that's it.
Sunday, February 01, 2009
OpenCL
I missed you so much. Yes, you. No, not you. You. I couldn't blog for a while and I ask you (no, not you) what's the point of living if one can't blog? Sure, there's the world outside of computers, but it's a scary place filled with people that, god forbid, might try interacting with you. Who needs that? It turns out that I do. I've spent the last week in Portland on the OpenCL working group meeting which was a part of the Khronos F2F.
For those who don't know (oh, you poor souls) OpenCL is what could be described as "the shit". That's the official spelling but substituting the word "the" for "da" is considered perfectly legal. Longer description includes the expansion of the term OpenCL to "Open Computing Language" with an accompanying wikipedia entry. OpenCL has all the ingredients, including the word "Open" right in the name, to make it one of the most important technologies of the coming years.
OpenCL allows us to tap into the tremendous power of modern GPUs. Not only that but also one can use OpenCL with accelerators (like the physics chips or Cell SPU's) and CPUs. On top of that hardware OpenCL provides both task-based and data-based parallelism making it a fascinating options for those who want to accelerate their code. For example if you have a canvas (Qt graphicsview) and you spend a lot of time doing collision detection, or if you have image manipulation application (Krita) and you spend a lot of time in effects and general image manipulation, or if you have a scientific chemistry application with an equation solver (Kalzium) and want to make it all faster, or if you have wonky hair and like to dance polka... OK, the last one is a little a fuzzy but you get the point.
Make no mistake, OpenCL is little bit more complicated than just "write your algorithm in C". Albeit well hidden, the graphics pipeline is still at the forefront of the design, so there are some limitations (remember that for a number of really good and few inconvenient reasons GPUs do their own memory management so you can not just move data structures between main and graphics memory). It's one of the reasons that you won't see a GCC based OpenCL implementation any time soon. OpenCL requires run time execution, it allows sharing of buffers with OpenGL (e.g. OpenCL image data type can be constructed from GL textures or renderbuffers) and it forces code generation to a number of different targets (GPUs, CPUs, accelerators). All those things need to be integrated. For sharing of buffers between OpenGL and OpenCL the two APIs need to go through some kind of a common framework - be it a real library or some utility code that exposes addresses and their meaning to both OpenGL and OpenCL implementations.
Fortunately we already have that layer. Gallium3D maps perfectly to the buffer and command management in OpenCL. Which shouldn't be surprising given that they both care about the graphics pipeline. So all we need is a new state tracker with some compiler framework integrated to parse and code generate from the OpenCL C language. LLVM is the obvious choice here because unlike GCC, LLVM has libraries that we can use for both (to be more specific it's Clang and LLVM). So yes, we started on an OpenCL state tracker, but of course we are far, far away from actual conformance. Being part of a large company means that we have to go through extensive legal reviews before releasing something in the open so right now we're patiently waiting.
My hope is that once the state tracker is public the work will continue at a lot faster pace. I'd love to see our implementation pass conformance with at least one GPU driver by summer (which is /hard/ but definitely not impossible).
For those who don't know (oh, you poor souls) OpenCL is what could be described as "the shit". That's the official spelling but substituting the word "the" for "da" is considered perfectly legal. Longer description includes the expansion of the term OpenCL to "Open Computing Language" with an accompanying wikipedia entry. OpenCL has all the ingredients, including the word "Open" right in the name, to make it one of the most important technologies of the coming years.
OpenCL allows us to tap into the tremendous power of modern GPUs. Not only that but also one can use OpenCL with accelerators (like the physics chips or Cell SPU's) and CPUs. On top of that hardware OpenCL provides both task-based and data-based parallelism making it a fascinating options for those who want to accelerate their code. For example if you have a canvas (Qt graphicsview) and you spend a lot of time doing collision detection, or if you have image manipulation application (Krita) and you spend a lot of time in effects and general image manipulation, or if you have a scientific chemistry application with an equation solver (Kalzium) and want to make it all faster, or if you have wonky hair and like to dance polka... OK, the last one is a little a fuzzy but you get the point.
Make no mistake, OpenCL is little bit more complicated than just "write your algorithm in C". Albeit well hidden, the graphics pipeline is still at the forefront of the design, so there are some limitations (remember that for a number of really good and few inconvenient reasons GPUs do their own memory management so you can not just move data structures between main and graphics memory). It's one of the reasons that you won't see a GCC based OpenCL implementation any time soon. OpenCL requires run time execution, it allows sharing of buffers with OpenGL (e.g. OpenCL image data type can be constructed from GL textures or renderbuffers) and it forces code generation to a number of different targets (GPUs, CPUs, accelerators). All those things need to be integrated. For sharing of buffers between OpenGL and OpenCL the two APIs need to go through some kind of a common framework - be it a real library or some utility code that exposes addresses and their meaning to both OpenGL and OpenCL implementations.
Fortunately we already have that layer. Gallium3D maps perfectly to the buffer and command management in OpenCL. Which shouldn't be surprising given that they both care about the graphics pipeline. So all we need is a new state tracker with some compiler framework integrated to parse and code generate from the OpenCL C language. LLVM is the obvious choice here because unlike GCC, LLVM has libraries that we can use for both (to be more specific it's Clang and LLVM). So yes, we started on an OpenCL state tracker, but of course we are far, far away from actual conformance. Being part of a large company means that we have to go through extensive legal reviews before releasing something in the open so right now we're patiently waiting.
My hope is that once the state tracker is public the work will continue at a lot faster pace. I'd love to see our implementation pass conformance with at least one GPU driver by summer (which is /hard/ but definitely not impossible).
Subscribe to:
Posts (Atom)
