Thursday, October 28, 2010

Intermediate representation again

I've written about intermediate representations a few times already, but I've been blogging so rarely that it feels like eons ago. It's an important topic, and one that Gallium makes a bit more convoluted.

The intermediate representation (IR) we use in Gallium is called Tokenized Gallium Shader Instructions, or TGSI. In general, when you think about an IR you think about some middle layer: you have a language (e.g. C, GLSL, Python, whatever) which is compiled into some IR, which is then transformed/optimized and finally compiled into some target language (e.g. x86 assembly).
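
To make the first stage concrete, a trivial GLSL fragment shader along the lines of gl_FragColor = color * scale + bias comes out as roughly the following TGSI (a from-memory sketch of the textual form, so the exact dialect may be slightly off):

FRAG
DCL IN[0], COLOR, LINEAR
DCL OUT[0], COLOR
DCL CONST[0..1]
  0: MAD OUT[0], IN[0], CONST[0], CONST[1]
  1: END

Note the up-front DCL lines and the complete absence of types - everything is just a vec4 register. Both points come up again below.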

That middle-layer pipeline is not how Gallium and TGSI work, or were meant to work. We realized that people were making that mistake, so we tried to back-pedal on the usage of the term IR when referring to TGSI and started calling it a "transport" or "shader interface", which better describes its purpose but is still pretty confusing. TGSI simply was not designed as a transformable representation. It can be done, but it's a lot like a dance-off at a geek conference - painful and embarrassing for everyone involved.

The way it was meant to work was:

Language -> TGSI -> [ GPU-specific IR -> transformations -> GPU ]

with the parts in the brackets living in the driver. Why like that? Because GPUs are so different that we thought each of them would require its own transformations and would need its own IR to operate on. Because we're not compiler experts and didn't think we could design something that would work well for everyone. And finally, most importantly, because that's how Direct3D does it. The Direct3D functional specification is unfortunately not public, which makes this a bit hard to explain.
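
To give a feel for the bracketed, driver-side part, here is a rough sketch of the usual pattern: walk the TGSI tokens and build the GPU-specific IR from them. The tgsi_parse helpers come from Mesa's auxiliary code; struct my_gpu_ir and the emit_decl()/emit_inst() builders are made up for illustration.

#include "tgsi/tgsi_parse.h"  /* Mesa's auxiliary TGSI parsing helpers */

/* Sketch of a driver-side translation pass: TGSI tokens in,
 * GPU-specific IR out.  'struct my_gpu_ir', emit_decl() and
 * emit_inst() are hypothetical stand-ins for whatever IR builder
 * a real driver would have. */
static void
translate_shader(const struct tgsi_token *tokens, struct my_gpu_ir *ir)
{
   struct tgsi_parse_context parse;

   tgsi_parse_init(&parse, tokens);
   while (!tgsi_parse_end_of_tokens(&parse)) {
      tgsi_parse_token(&parse);
      switch (parse.FullToken.Token.Type) {
      case TGSI_TOKEN_TYPE_DECLARATION:
         /* TGSI declares all registers up front; reserve them. */
         emit_decl(ir, &parse.FullToken.FullDeclaration);
         break;
      case TGSI_TOKEN_TYPE_INSTRUCTION:
         /* Map the TGSI opcode onto the GPU's native instructions;
          * any optimization happens later, on the GPU-specific IR. */
         emit_inst(ir, &parse.FullToken.FullInstruction);
         break;
      default:
         break;
      }
   }
   tgsi_parse_free(&parse);
}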

The idea behind it was great in both its simplicity and its overabundance of misplaced optimism. All graphics companies had Direct3D drivers; they all had working code that compiled Direct3D assembly down to their respective GPUs. "If TGSI is a lot like Direct3D assembly, then it will work with every GPU that works on Windows - plus, wouldn't it be wonderful if all those companies could basically just take that Windows code and end up with a working shader compiler for GNU/Linux?!", we thought. Logically, that would be a nice thing. Sadly, companies do not like to release parts of their Windows driver code as Free Software; sometimes it's not even possible. Sometimes the Windows and Linux teams never talk to each other. Sometimes they just don't care. Either way, our lofty goal of making the IR so much easier and quicker to adopt took a pretty severe beating. It's especially disappointing because if you look at some of the documentation, e.g. for the AMD Intermediate Language, you'll notice that this stuff is essentially Direct3D assembly, which is essentially TGSI (and most of the parts that are in AMD IL but not in TGSI are parts that will be added to TGSI). So they have this code. In the case of AMD it's even sadder, because the crucial piece that we need for OpenCL right now is an OpenCL C -> TGSI LLVM backend, which AMD already has for their IL. Some poor schmuck will have to sit down and write more or less the same code. Of course, if it's going to be me, it's "poor, insanely handsome and definitely not a schmuck".

So we're left with Free Software developers who don't have access to the Direct3D functional spec, and who are being confused by an IR that is unlike anything they've seen (pre-declared registers, typeless operations - both visible in the snippet above) and that, on top of it all, is not easily transformable. TGSI is very readable, simple and pretty easy to debug, though, so it's not all negative. It's also great if you never have to optimize or transform its structure, which unfortunately is rarely the case.

If we abandon the hope of having code from the Windows drivers injected into the GNU/Linux drivers, it becomes pretty clear that we could do better than TGSI. Personally, I just abhor the idea of rolling our own IR - an IR in the true sense of the word. Crazy as it may sound, I'd really like my compiler stuff to be written by compiler experts. That's the main reason why I really like the idea of using the LLVM IR as our IR.
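
For contrast, the multiply-add from the TGSI snippet above, written as LLVM IR (again a from-memory sketch), would look something like this - typed, in SSA form, and designed from day one to be transformed:

define <4 x float> @mad(<4 x float> %color, <4 x float> %scale,
                        <4 x float> %bias) {
  %tmp = fmul <4 x float> %color, %scale
  %res = fadd <4 x float> %tmp, %bias
  ret <4 x float> %res
}

Every value has a type and a single definition, which is precisely what makes writing transformation passes over it bearable.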

Ultimately it's all kind of taking the "science" out of "computer science", because it's very speculative. We know AMD and NVIDIA use LLVM to some extent (and there's an open PTX backend for LLVM), we like it, we use it in some places (llvmpipe), and the people behind LLVM are great and know their stuff. But how hard is it to use LLVM IR as the main IR in a graphics framework, and how hard is it to code-generate directly from it for GPUs? We don't really know. It seems like a really good idea - good enough for the folks at LunarG to give it a try, which I think is exactly what we need: proof that it is possible and doesn't require sacrificing any farm animals. Which, as a vegetarian, I'd be firmly against.