battles of the true graphics ninja, by Zack Rusin

NV path rendering (2011-09-18)

A while ago NVIDIA released drivers with their <a href="http://developer.download.nvidia.com/assets/gamedev/files/GL_NV_path_rendering.txt">NV_path_rendering</a> extension. GL_NV_path_rendering is a relatively new OpenGL extension which allows rendering of stroked and filled paths on the GPU.<br />
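The extension's core model is "stencil, then cover": a path object is first stenciled into the stencil buffer, then a cover pass shades the covered pixels. Below is a minimal sketch of the data a path object is built from; the actual GL calls are left in comments because they need a live context, and the command token values are the ones listed in the extension spec (treat them as an assumption if your headers differ):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Path command token values as listed in the GL_NV_path_rendering spec.
const uint8_t CLOSE_PATH_NV = 0x00;
const uint8_t MOVE_TO_NV    = 0x02;
const uint8_t LINE_TO_NV    = 0x04;

struct PathData {
    std::vector<uint8_t> cmds;   // one token per path command
    std::vector<float>   coords; // coordinates consumed by those commands
};

// Build the arrays glPathCommandsNV() expects for a simple triangle.
PathData makeTriangle() {
    PathData p;
    p.cmds   = { MOVE_TO_NV, LINE_TO_NV, LINE_TO_NV, CLOSE_PATH_NV };
    p.coords = { 10.f, 10.f,  90.f, 10.f,  50.f, 80.f }; // close takes no coords
    return p;
}

// With a GL context the stencil-then-cover draw is roughly (not executed here):
//   GLuint path = glGenPathsNV(1);
//   glPathCommandsNV(path, cmds.size(), cmds.data(),
//                    coords.size(), GL_FLOAT, coords.data());
//   glStencilFillPathNV(path, GL_COUNT_UP_NV, 0x1F); // 1) stencil the fill
//   glCoverFillPathNV(path, GL_BOUNDING_BOX_NV);     // 2) shade covered pixels
```

The interesting part is that the winding-rule evaluation happens on the GPU during the stencil step, so the application never tessellates anything itself.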
<br />
I've heard about it before but lately I've been too busy with other stuff to look at, well, anything.<br />
<br />
I've decided to spend a few seconds looking at the <a href="http://developer.nvidia.com/nv-path-rendering-videos">videos NVIDIA posted on their site</a>. They were comparing NV_path_rendering, Skia, Cairo and Qt. Pretty neat. Some of the demos were using huge paths, clipped by another weird path and using perspective transforms. Qt was very slow. It was time for me to abandon my dream of teaching my imaginary hamster how to drive a stick shift and once again look at path rendering.<br />
<br />
You see, I wrote the path rendering code so many times that one of my favorite pastimes was creating ridiculous paths that no one would ever think about rendering and seeing how fast I could render them. Qt's OpenGL code was always unbelievably good at rendering paths no one would ever render. Clearly these people were trying to outcrazy me.<br />
<br />
Fortunately there's an <a href="http://developer.nvidia.com/nv-path-rendering">SDK posted on the NVIDIA site</a> and it's really well done. It even compiles and works on GNU/Linux. Probably the best demo code for a new extension that I've ever seen. The extension itself is very well done as well. It's very robust; ultimately, though, it's the implementation that I care about. I have just one workstation with an NVIDIA card in it, a measly Quadro 600, running on a dual-processor Xeon E5405, but it was enough to play with it.<br />
<br />
The parts using Qt were using the raster engine though. I looked at the code and decided to write something that would render the same thing using just Qt. The results were a little surprising: Qt's OpenGL engine could render tiger.svg, scaling and rotating it, at about 270fps, while NV_path_rendering was running at about 72fps. Here are both of them running side by side:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpf9u7Lj07dGZCPiPOnunDlvX6UQ51jX0gvSr2BYTBjc9Gubvfr8Al6vorRB4PyNtvHGeMqusvW1TIZ47LtlCkI8PMBGYEW9b8FstbA5ZH_5JlGdYNDveX0Vk7hhQwF3865YLcNQ/s1600/nv_path_renderer_samples.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="212" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpf9u7Lj07dGZCPiPOnunDlvX6UQ51jX0gvSr2BYTBjc9Gubvfr8Al6vorRB4PyNtvHGeMqusvW1TIZ47LtlCkI8PMBGYEW9b8FstbA5ZH_5JlGdYNDveX0Vk7hhQwF3865YLcNQ/s400/nv_path_renderer_samples.png" width="400" /></a></div>
<br />
(numbers are lower for both on account of them running at the same time, of course). As you can see, Qt is almost 4x faster. I figured it might be related to the different SVG implementations and rendering techniques used, so I quickly hacked the demo NVIDIA posted to open a brand new window (you need to click on it to start rendering) and render to a QGLPixelBuffer, but using the same SVG and rendering code as their NV_path_rendering demo. The results were basically the same.<br />
<br />
I posted the code for the Qt demo and the patch to nvpr_svg on github: <a href="https://github.com/zackr/qt_svg">https://github.com/zackr/qt_svg</a><br />
<br />
The patch is larger than it should be because it also changed the file encoding on the saved files from DOS to Unix, but you shouldn't have any issues applying it.<br />
<br />
So from a quick glance it doesn't seem like there are any performance benefits to using NV_path_rendering; in fact, Qt would likely be quite a bit slower with it. Having said that, NVIDIA's implementation looks very robust and a lot more numerically stable. I've spent a little bit of time looking at the individual pixels and came away very impressed.<br />
<br />
In general the extension is in a bit of a weird situation. On one hand, unlike OpenVG, which creates a whole new API, it's the proper way of introducing GPU path rendering; on the other hand, pretty much every vector graphics toolkit out there already implements GPU-based path rendering. Obviously the implementations differ and some might profit from the extension, but for Qt the question is whether that quality matters more than the performance - specifically, whether the quality improves enough to justify the performance hit.<br />
<br />
I think the extension's success will largely depend on whether it's promoted to at least an EXT or, ideally, an ARB, meaning all the drivers support it. Using it would make the implementations of path rendering in toolkits/vector graphics libs a lot simpler and give driver developers a central place to optimize a pretty crucial part of the modern graphics stack. Unfortunately, if you still need to maintain the non-NV_path_rendering paths then it doesn't make a whole lot of sense.
A Mesa3D implementation would be trivial because I've already implemented path rendering for OpenVG using the Gallium3D interface, so it would be a matter of moving that code, but I'm just not sure whether anyone will actually use this extension. All in all, it's a very well done extension but it might be a little too late.
ApiTrace (2011-04-25)

<div style="text-align: left;">During the last three weeks I've spent most of my spare time writing a GUI for <a href="http://jrfonseca.blogspot.com/">Jose's</a> amazing <a href="https://github.com/apitrace/apitrace">ApiTrace</a> project. <a href="https://github.com/apitrace/apitrace">ApiTrace</a> is a project to trace, analyze and debug graphics APIs, both OpenGL and Direct3D. To some extent inspired by gDEBugger and Windows PIX. We wanted a tool that would let us slice through huge games and CAD apps to the exact call which causes problems and be able to inspect the entire graphics state, including the shaders, textures and all the buffers. We ended up doing that, plus a lot more, and we're just getting started. In other words it's the best thing since the "Human/Robot Emancipation Act of 3015". </div><div><br /></div><div>You begin by tracing your target application. You can do that either from the console or from the GUI. A trace file is created and we can do some amazing things with it. 
You can open it in a GUI and</div><div><ul><li>Inspect the state frame by frame, draw call by draw call:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMqCXABiAhlefW_siwSHSJN0sL158dq7MXv-O7ncpwpQv0MFQjGJr-Vt_sX5FqRjd5KrR7WR2AeChiownIxU1pPTcfRMitSf-ggRMLp_b_6K6dXf2uxGdxQvtB6q3HnsIxK7sgIA/s1600/qapitrace10.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 202px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMqCXABiAhlefW_siwSHSJN0sL158dq7MXv-O7ncpwpQv0MFQjGJr-Vt_sX5FqRjd5KrR7WR2AeChiownIxU1pPTcfRMitSf-ggRMLp_b_6K6dXf2uxGdxQvtB6q3HnsIxK7sgIA/s320/qapitrace10.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5599729351059797586" /></a></li><li>Replay the trace file:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWHj_tv2W5Mat-Fu_xOgDbhyCrJM2zMpjviJmLkYiOivHTIfFHDzEL9VmLPPW512MsvVg-wFfVGx7YdoNQav1eMubI3jjsYT5aFip5mIATviB9TYt6AuaUhWGzJpGgeifyq7cHkA/s1600/qapitrace2.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 262px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWHj_tv2W5Mat-Fu_xOgDbhyCrJM2zMpjviJmLkYiOivHTIfFHDzEL9VmLPPW512MsvVg-wFfVGx7YdoNQav1eMubI3jjsYT5aFip5mIATviB9TYt6AuaUhWGzJpGgeifyq7cHkA/s320/qapitrace2.png" alt="" id="BLOGGER_PHOTO_ID_5599620181177807442" border="0" /></a></li><li>Check every texture:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc5xWqAE86LkfHxqPiEoI3mOE_yJ_lWXt9w5_LPMdPD-Qyyh2ydORT1jmcQsSkHrrikMX42JgfOIxfJZHIWXnRkWlk0LYblMZVEno4AohZ4KT9OHe2oQFh-iEiOUCRNzsFWTC7fA/s1600/qapitrace3.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 214px;" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc5xWqAE86LkfHxqPiEoI3mOE_yJ_lWXt9w5_LPMdPD-Qyyh2ydORT1jmcQsSkHrrikMX42JgfOIxfJZHIWXnRkWlk0LYblMZVEno4AohZ4KT9OHe2oQFh-iEiOUCRNzsFWTC7fA/s320/qapitrace3.png" alt="" id="BLOGGER_PHOTO_ID_5599620365123668386" border="0" /></a></li><li>Every bound framebuffer:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgODK5Ug8vBWOrg5PPuUcsIWTyErvn2OsjNbnRlR1sKlsvRabgc0CZCnNFdG43KsEyVTYP1AnqbHgokNRjkK95BNb0jwCm3Fubs-zPImTK5JWS43OFhb2mdFKfE9vJCIOv4bD-tw/s1600/qapitrace4.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 219px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgODK5Ug8vBWOrg5PPuUcsIWTyErvn2OsjNbnRlR1sKlsvRabgc0CZCnNFdG43KsEyVTYP1AnqbHgokNRjkK95BNb0jwCm3Fubs-zPImTK5JWS43OFhb2mdFKfE9vJCIOv4bD-tw/s320/qapitrace4.png" alt="" id="BLOGGER_PHOTO_ID_5599620366605497970" border="0" /></a></li><li>Every shader:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZyjJt7_w0jLVUJwCgzGHm9TnsckBOE-VUJQJ_HXNGlW4sq4zGhiljrCBGo-rm19bQfWUmDM9P49ZnABL9pORhNki81m0Ez0DZpV13re3G8mwHJI-R2F-qJaxZEZdPADFsNZqV6w/s1600/qapitrace5.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 219px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZyjJt7_w0jLVUJwCgzGHm9TnsckBOE-VUJQJ_HXNGlW4sq4zGhiljrCBGo-rm19bQfWUmDM9P49ZnABL9pORhNki81m0Ez0DZpV13re3G8mwHJI-R2F-qJaxZEZdPADFsNZqV6w/s320/qapitrace5.png" alt="" id="BLOGGER_PHOTO_ID_5599620370634122050" border="0" /></a></li><li>Every vertex buffer:<a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBKKyXjLpDNqE_op4PvBqHvyRe__rghavcPFPtdJo6n-Q764yQW42moBv9pBGOQxA2IH2XM7ZLU1jgRQREGilDuRyVoPAmkHwuW9hGrSrPg21jXhpXnZGJIRKBGJlc7kOjRjlK4Q/s1600/qapitrace6.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 219px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBKKyXjLpDNqE_op4PvBqHvyRe__rghavcPFPtdJo6n-Q764yQW42moBv9pBGOQxA2IH2XM7ZLU1jgRQREGilDuRyVoPAmkHwuW9hGrSrPg21jXhpXnZGJIRKBGJlc7kOjRjlK4Q/s320/qapitrace6.png" alt="" id="BLOGGER_PHOTO_ID_5599620372969210626" border="0" /></a></li><li>You can see if OpenGL threw an error at any point during the replay and if so what was it:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhbEQzeM3Ew-7wHsVBauMIk0iqlIT2bOWo7Re3mshp7IKBWH-FSdvz7ibo7xHAtDFRCimzp7WjwFNh0hlFcy1Jp1Bzf6RvI3idDJwkge9WDOL_vdi1W73ohyDbvb675Pq_jdLGmA/s1600/qapitrace9.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 216px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhbEQzeM3Ew-7wHsVBauMIk0iqlIT2bOWo7Re3mshp7IKBWH-FSdvz7ibo7xHAtDFRCimzp7WjwFNh0hlFcy1Jp1Bzf6RvI3idDJwkge9WDOL_vdi1W73ohyDbvb675Pq_jdLGmA/s320/qapitrace9.png" alt="" id="BLOGGER_PHOTO_ID_5599620553845343490" border="0" /></a></li><li>And to go completely nuts, as graphics developers like to do, you get to edit any shader, any uniform and large chunks of the state to immediately see the effects it would have on the rendering:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhN-F8y34T64H8BpuDo9q45-uygiso-dAIxjuzekolB6mFBzEK29-nw7xz3-qRbCTwFFwC71qCzwRNAT82pWEbe-wJm-R5c-JWTEaxeJ7iJC_1r4obFXQf45c27VUzdx6TzYfRDQ/s1600/qapitrace8.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) 
{}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 219px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhN-F8y34T64H8BpuDo9q45-uygiso-dAIxjuzekolB6mFBzEK29-nw7xz3-qRbCTwFFwC71qCzwRNAT82pWEbe-wJm-R5c-JWTEaxeJ7iJC_1r4obFXQf45c27VUzdx6TzYfRDQ/s320/qapitrace8.png" alt="" id="BLOGGER_PHOTO_ID_5599620549990618882" border="0" /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnXbIQhZJJOLlgHwC78LBum3zGKQ-IakC3uIgJRLwXx0q_O18iHG3sX8HOuyZBGsKlqogdnLJITD3KzCZbmIRCDott1_mEn63ZsYZ-Tq3IEnLKE8Ynvi1fbjR1AakWXU7LNZCc9Q/s1600/qapitrace7.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 219px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnXbIQhZJJOLlgHwC78LBum3zGKQ-IakC3uIgJRLwXx0q_O18iHG3sX8HOuyZBGsKlqogdnLJITD3KzCZbmIRCDott1_mEn63ZsYZ-Tq3IEnLKE8Ynvi1fbjR1AakWXU7LNZCc9Q/s320/qapitrace7.png" alt="" id="BLOGGER_PHOTO_ID_5599620376525539442" border="0" /></a></li></ul></div><div><br /></div><div>As a driver developer you no longer have to install all the games just to debug a problem, the report can simply include a short trace which you can use to immediately figure out what's wrong. As an application developer you can inspect every graphics call your app makes, you can analyze your api usage and you could automatically produce standalone testcases which you can send to driver developers.</div><div><br /></div><div><a href="https://github.com/apitrace/apitrace">ApiTrace</a> is hosted on <a href="https://github.com/apitrace/apitrace">github</a> and it's BSD licensed. It works on Linux and Windows (we're planning to add OSX support as well). 
The GUI is written in Qt and requires the QJson library.</div><div><br /></div><div>Jose <a href="http://lists.freedesktop.org/archives/mesa-dev/2011-April/007090.html">just announced the first release</a>, so let us know if there's anything that would make your life a lot easier. Next up: support for multiple GL contexts, the ability to export just a single frame from a trace (either as a trace file or a standalone C application), the ability to start and stop tracing on a hot key, and lots of other features soon. So whether you're a driver developer or working on games, CAD apps or 2D scene-graphs, this is a tool that should make your life significantly easier and better.</div>

2D musings (2010-11-02)

<div>If you've been following graphics developments in the 2D world over the last few years you've probably seen a number of blogs and articles complaining about performance - in particular about how slow 2D is on GPUs. Have you ever wondered why it's possible to make <a href="http://pc.ign.com/dor/objects/926419/id-tech-5-project/images/rage-20100503114108734.html">this</a> completely smooth while your desktop still sometimes feels sluggish?</div><div><br /></div><div><span class="Apple-style-span" ><b>Bad model</b></span></div><div><br /></div><div>For some weird reason ("neglect" being one of them) the 2D rendering model hasn't evolved at all in the last few years - that is, if it has evolved at all since the very first "draw line" became a function call. Draw line, draw rectangle, draw image and blit were simply joined by fill path, stroke path, a few extra composition modes and such. 
At its very core the model remained the same though: lots of calls drawing an equally large number of small primitives.</div><div><br /></div><div>This worked well because technically zero, or almost zero, setup code was necessary to start rendering. Then GPUs became prevalent and they could do amazing things, but to get them to do anything you had to upload the data and the commands that would tell them what to do. With time more and more data had to be sent to the GPU to describe increasingly complex and larger scenes. It made sense to optimize the process of uploads (I keep calling them "uploads" but "GPU downloads" is closer to the true meaning) by allowing an entire resource to be uploaded once and then referred to via a handle. Buffers, shaders and the addition of new shading stages (tessellation, geometry) were all meant to reduce the amount of data that had to be uploaded to the GPU before every rendering.</div><div><br /></div><div>At least for games and well designed 3D software. 2D stuck to its old model of "make the GPU download everything on every draw request". It worked OK because most of the user interface was static and rather boring, so performance was never much of an issue. Plus in many cases the huge setup costs are offset by the fact that Graphics Processing Units are really good at processing graphics.</div><div><br /></div><div>Each application is composed of multiple widgets; each widget draws itself using multiple primitives (pixmaps, rectangles, lines, paths); and each primitive first needs to upload the data the GPU requires to render it. It's like that because from the 2D API's perspective there's no object persistence. The API has no idea that you keep re-rendering the same button over and over again. All the API sees is another "draw rectangle" or "draw path" call, which it will complete.</div><div><br /></div><div>On each frame the same data is being copied to the GPU over and over again. It's not very efficient, is it? 
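The contrast between this per-frame pattern and the handle-based approach used by 3D software can be sketched like this (the Gpu type and its methods are invented for illustration; only the transfer accounting matters):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a GPU interface; it only counts bytes transferred.
struct Gpu {
    uint64_t bytesUploaded = 0;
    int lastHandle = 0;
    int upload(const std::vector<float>& data) {  // create buffer, copy data in
        bytesUploaded += data.size() * sizeof(float);
        return ++lastHandle;
    }
    void draw(int /*bufferHandle*/) {}            // draws cost no transfer here
};

// Old 2D model: every frame re-sends the same vertices to the GPU.
uint64_t immediateMode(Gpu& gpu, const std::vector<float>& quad, int frames) {
    for (int f = 0; f < frames; ++f)
        gpu.draw(gpu.upload(quad));               // same bytes, every frame
    return gpu.bytesUploaded;
}

// Retained model: upload once, then just reference the handle.
uint64_t retainedMode(Gpu& gpu, const std::vector<float>& quad, int frames) {
    const int handle = gpu.upload(quad);          // one transfer per lifetime
    for (int f = 0; f < frames; ++f)
        gpu.draw(handle);
    return gpu.bytesUploaded;
}
```

Over 60 frames the immediate path transfers 60x the data of the retained path, for pixel-identical output.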
There's a limited number of optimizations you can do in this model. Some of the more obvious ones include:<ul><li>adding unique identifiers to the pixmaps/surfaces and using those as keys in a texture cache, which allows you to create a texture for every pixmap/surface only once,</li><li>collecting data from each draw call in a temporary buffer and copying it all at once (e.g. in SkOSWindow::afterChildren, QWindowSurface::endPaint or such),</li><li>creating a shader cache for the different types of fills and composition modes.</li></ul></div><div><br /></div><div>But the real problem is that you keep making the GPU download the same data every frame, and unfortunately that is really hard to fix in this model.</div><div><br /></div><div><span class="Apple-style-span" ><b>Fixing the model</b></span></div><div><br /></div><div>It all boils down to creating some kind of store where the lifetime of an object/model is known. This way the scene knows exactly what objects are being rendered, and before rendering begins it can initialize and upload all the data the items need to be rendered. Then rendering is just that - rendering. Data transfers are limited to object addition/removal or significant changes to their properties, and further limited by the fact that a lot of the state can always be reused. Note that trivial things like changing the texture (e.g. on hover/push) don't require any additional transfers, and things like translations can be limited to just two floats (translation in x and y), which are usually shared by multiple primitives (e.g. in a pushbutton they would be used by the background texture and the label texture/glyphs).</div><div><br /></div><div>It would seem like the addition of QGraphicsView was a good time to change the 2D model, but that wasn't really possible because people like their QPainter. 
No one likes it when a tool they have been using for a while and are fairly familiar with is suddenly taken away. Completely changing the model required a more drastic move.</div><div><br /></div><div><span class="Apple-style-span" ><b>QML and scene-graph</b></span></div><div><br /></div><div>QML fundamentally changes the way we create interfaces and it's very neat. From the API perspective it's not much different from JavaFX, and one could argue which one is neater/better, but QML allows us to almost completely get rid of the old 2D rendering model, and that's why I love it! A side-effect of moving to QML is likely the most significant change we've made to accelerated 2D in a long time. The new <a href="http://labs.qt.nokia.com/2010/10/08/qt-scene-graph-round-2/">Qt scene graph</a> is a very important project that can make a huge difference to the performance, look and feel of 2D interfaces.</div><div>Give it a try. If you don't have OpenGL working, no worries: it will work fine with Mesa3D on top of llvmpipe.</div><div><br /></div><div>A nice project would be doing the same in web engines. We have all the info there, but we decompose it into the draw line, draw path, draw rectangle, draw image calls. Short of the canvas object, which needs the old-style painters, everything is there to make accelerated web engines a lot better at rendering the content.</div><div><br /></div>

More 2D in KDE (2009-08-14)

An interesting question is: if the raster engine is faster at the gross majority of graphics, and the Qt X11 engine falls back on quite a few features anyway, why shouldn't we make raster the default until the OpenGL implementations/engine are stable enough?<div><br /></div><div>There are two technical reasons for that. 
Actually that's not quite true; there are more, but these two are the ones that will bother a lot of people.</div><div><br /></div><div>The first is that we'd kill KDE performance over the network. Everyone who uses X/KDE over the network would suddenly realize that their setup had become unusable: their sessions would suddenly send images for absolutely everything, all the time... As you can imagine, institutions/schools/companies who use KDE exactly in this way wouldn't be particularly impressed if updating their installations suddenly rendered them unusable.</div><div><br /></div><div>The second reason is that I kinda like being able to read and sometimes even write text. Text tends to be pretty helpful during the process of reading. Especially things like emails and web browsing get a lot easier with text. I think a lot of people share that opinion with me. To render text we need fonts, which in turn are composed of glyphs. To make text rendering efficient, we need to cache the glyphs. When running with the X11 engine we render text using Xrender, which means that there's a central process that can technically manage all the glyphs used by the applications running on a desktop. That process is the X server. With the raster engine we take the X server out of the equation, and suddenly every single application on the desktop needs to cache the glyphs for all the fonts it's using. This implies that every application suddenly uses megs and megs of extra memory. They all need to individually cache all the glyphs even if they all use the same font. It tends to work OK for languages with a few glyphs, e.g. 40+ for English (26 letters + 10 digits + a few punctuation marks). It doesn't work at all for languages with more. 
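Back-of-the-envelope numbers show why this matters; the 32x32-pixel 8-bit coverage mask per glyph assumed below is an invented, illustrative size:

```cpp
#include <cassert>
#include <cstdint>

// Assumed per-glyph cost: a 32x32 8-bit coverage mask (illustrative only).
const uint64_t kGlyphBytes = 32 * 32;

// Raster engine situation: every application keeps its own private cache.
uint64_t perAppCacheBytes(uint64_t apps, uint64_t glyphs) {
    return apps * glyphs * kGlyphBytes;
}

// Xrender situation: one cache shared through the X server.
uint64_t sharedCacheBytes(uint64_t glyphs) {
    return glyphs * kGlyphBytes;
}
```

For 30 desktop applications and a CJK font with 20,000 glyphs, that's roughly 600MB of duplicated private caches versus about 20MB shared.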
So unless it's decided that KDE can only be used by people whose languages' alphabets contain about 30 letters or fewer, I'd hold off on making the raster engine the default.</div><div><br /></div><div>While the latter problem could be solved with some clever shared-memory usage or by forcing Xrender on top of the raster engine (actually I shouldn't state that as a fact; I haven't looked at font rendering in the raster engine in a while and maybe that was implemented lately), it's worth noting that the X11 engine is simply fast enough that a few frames one way or the other shouldn't bother you. Those few frames that you'd gain would mean an unusable KDE for others.</div><div><br /></div><div>And if you think that neither of the above two points bothers you and you'd still want to use the raster engine by default, you'll have to understand that I just won't post instructions on how to do that here. If you're a developer, you already know how to do it, and if not there are trivial ways of finding it out from the Qt sources. If you're not a developer then you really should stick globally to the defaults and simply test applications with the -graphicssystem switches. </div>

2D in KDE (2009-08-14)

So it seems a lot of people are wondering about this. By this I mean why dwarfs always have beards. Underground, big ears would probably be a better evolutionary trait, but elves got dibs on those.<br /><br />Qt, and therefore KDE, deals with three predominant ways of rendering graphics. I don't feel like bothering with transitions today, so find your own way from beards and dwarfs to Qt/KDE graphics. 
Those three ways are:<br /><ul><li>On the CPU with<span style="font-weight: bold;"> no </span>help from the GPU, using the raster engine</li> <li>Using X11/Xrender with the X11 engine</li> <li>Using OpenGL with the OpenGL engine</li></ul>There are a couple of ways in which the decision about which one of those engines is used is made.<br /><br />First there's the default global engine. This is what you get when you open a QPainter on a QWidget and its derivatives. So whenever you have code like<br /><code><br />void MyWidget::paintEvent(QPaintEvent *)<br />{<br /> QPainter p(this);<br /> ...<br />}<br /></code><br />you know the default engine is being used. The rules for that are as follows:<br /><ul><li>GNU/Linux: the X11 engine is used</li><li>Windows: the raster engine is used</li><li>The application has been started with the -graphicssystem= option:<br /><ul><li>-graphicssystem=native : the rules above apply</li> <li>-graphicssystem=raster : the raster engine is used by default</li> <li>-graphicssystem=opengl : the OpenGL engine is used by default</li> </ul></li></ul>Furthermore, depending on which QPaintDevice is being used, different engines will be selected. The rules for that are as follows:<br /><ul><li>QWidget : the default engine is used (picked as described above)</li><li>QPixmap : the default engine is used (picked as described above)</li><li>QImage : the raster engine is used (always; it doesn't matter what engine has been selected as the default)</li><li>QGLWidget, QGLFramebufferObject, QGLPixelBuffer : the OpenGL engine is used (always; it doesn't matter what engine has been selected as the default)</li></ul>Now here's where things get tricky: if an engine doesn't support certain features it will have to fall back to the one engine that is sure to work on all platforms and has all the features required by the QPainter API - that is, the raster engine. 
This was done to ensure that all engines have the same feature set.<br /><br />While the OpenGL engine should in general never fall back, that is not the case for X11, and there are fallbacks. One of the biggest immediate optimizations you can make to speed up your application is to ensure that you don't have fallbacks. A good way to check for that is to export QT_PAINT_FALLBACK_OVERLAY and run your application against a debug build of Qt; this way the region which caused a fallback will be highlighted (the other method is to set a gdb breakpoint in QPainter::draw_helper). Unfortunately this will only detect fallbacks in Qt.<br /><br />All of those engines also use drastically different methods of rendering primitives.<br />The raster engine rasterizes primitives directly.<br />The X11 engine tessellates primitives into trapezoids, because Xrender composites trapezoids.<br />The GL engine either uses the stencil method (described in this blog a long time ago) or shaders to decompose the primitives, and the rest is handled by the normal GL rasterization rules.<br /><br />Tessellation is a fairly complicated process (also described long ago in this blog). To handle degenerate cases, the first step of this algorithm is to find the intersections of the primitive. In the simplest form, think about rendering a figure 8. There's no way of testing whether a given primitive is self-intersecting without actually running the algorithm.<br />To render with anti-aliasing on the X11 engine we have to tessellate, because Xrender requires trapezoids to render anti-aliased primitives. So if the X11 engine is being used and the rendering is anti-aliased, whether you're rendering a line, a heart or a moose, we have to tessellate.<br /><br />Someone was worried that it's an O(n^2) process, which is of course completely incorrect. We're not using a brute-force algorithm here. The process is obviously O(n log n). 
O(n log n) complexity on the CPU side is something that both the raster and X11 engines need to deal with. The question is what happens next, and what happens in the subsequent calls.<br /><br />While the raster engine can deal with all of it while rasterizing, the X11 engine can't. It has to tessellate, send the results to the server and hope for the best. If the X11 driver doesn't implement composition of trapezoids (which, realistically speaking, most of them don't) this operation is done by Pixman. In the raster engine the sheer spatial locality almost forces better cache utilization than what could realistically be achieved by the "application tessellates, server rasterizes" process that the X11 engine has to deal with. So without all-out acceleration, in this case the X11 engine can't compete with the raster engine. While simplifying a lot, it's worth remembering that in terms of cycles a register access most likely costs one cycle or less, access to the L1 data cache is likely about 3 cycles, L2 is probably about 14 cycles, while main memory is about 240 cycles. So for CPU-based graphics, efficient memory utilization is one of the most crucial undertakings.<br /><br />With that in mind, this is also the reason why a heavily optimized, purely software-based OpenGL implementation would be a lot faster than the raster engine at 2D graphics. 
In terms of memory usage the OpenGL pipeline is simply a lot better at handling memory than the API QPainter provides.<br /><br />So what you should take away from this is that if you're living in a perfect world, the GL engine is so much better than absolutely anything else Qt/KDE have that it's not even funny; X11 follows it, and the raster engine trails far behind.<br /><br />The reality you're dealing with is that when using the X11 engine, due to the fallbacks, you will also be using the raster engine (either on the application side with the Qt raster engine or on the server side with Pixman), and unfortunately in this case "the more the better" doesn't apply and you will suffer tremendously. Our X11 drivers don't accelerate chunks of Xrender, applications don't have good means of testing what is accelerated, so Qt simply doesn't use many of its features. So even if a driver accelerated, for example, gradient fills and source picture transformations, it wouldn't help you, because Qt simply doesn't use them and always falls back to the raster engine. It's a bit of a chicken-and-egg problem: Qt doesn't use it because it's slow, and it's slow because no one uses it.<br /><br />The best solution to that conundrum is to try running your applications with -graphicssystem=opengl and report any problems you see to both the Qt software and Mesa3D/DRI bugzillas, because the only way out is to make sure that both our OpenGL implementations and the OpenGL usage in the rendering code on the application side work efficiently and correctly. The quicker we get the rendering stack to work on top of OpenGL, the better off we'll be.

KDE graphics benchmarks (2009-03-12)

This is really a public service announcement. KDE folks, please stop writing "graphics benchmarks". 
It's especially pointless if you qualify your blog/article with "I'm not a graphics person but...".<br /><br />What you're doing is:<br /><pre><br /> timer start<br /> issue some draw calls<br /> timer stop<br /></pre><br />This is completely and utterly wrong.<br />I'll give you an analogy that should make it a lot easier to understand. Let's say you have a 1Mbit and a 100Mbit LAN and you want to write a benchmark to compare how fast you can download a 1GB file on both of those, so you do:<br /><pre><br /> timer start<br /> start download of a huge file<br /> timer stop<br /></pre><br />Do you see the problem? Obviously the file hasn't been downloaded by the time you stopped the timer. What you effectively measured is the speed at which you can execute function calls.<br />And yes, while your original suspicion that the 100Mbit line is a lot faster is still likely true, your test in no way proves that. In fact it does nothing short of making some poor individuals very sad about the state of computer science. Also "So what that the test is wrong, it still feels faster" is not a valid excuse, because the whole point is that the test is wrong.<br /><br />To give your tests some substance, always make your applications run for at least a few seconds and report the frames per second.<br /><br />Or even better, don't write them. Somewhere out there, some people, who actually know what's going on, have those tests written. And those people, who just happen to be "graphics people", have reasons for making certain things default. So while you may think you've made an incredible discovery, you really haven't. There's a kde-graphics mailing list where you can pose graphics questions. 
<br />So to the next person who wants to write a KDE graphics-related blog/article: please, please go through the kde-graphics mailing list first.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com12tag:blogger.com,1999:blog-27901662.post-2711474417303713512008-08-28T18:28:00.002+01:002008-08-28T18:32:10.843+01:00SVG in KDE"Commitment" is one of the words that have never been used in this blog. Which is pretty impressive given that I've managed to use such words as sheep, llamas, raspberries, ninjas, donkeys, crack or woodchuck quite extensively (especially impressive in a technology-centric blog).<br /><br />That's because commitment implies that whatever it is one is committed to plays an important role in their life. It's a word that goes beyond the paper or the medium on which it was written. It enters the cold reality that surrounds us.<br /><br />But today is all about commitment. It's about the commitment that KDE made to a technology broadly referred to as Scalable Vector Graphics. I took some time off this week and came to Germany where I talked about the usage of SVG in KDE.<br /><br />The paper about, what I like to call, the Freedom of Beauty, is available here:<br /><br /><a href="https://www.svgopen.org/2008/papers/104-SVG_in_KDE/">https://www.svgopen.org/2008/papers/104-SVG_in_KDE/</a><br /><br />It talks about the history of SVG in KDE, the rendering model used by KDE, it lists ways in which we use SVG and finally shows some problems which have been exposed by such diverse usage of SVG in a desktop environment. 
Please read it if you're interested in KDE or SVG.<br /><br />Hopefully this paper marks the start of a more proactive role that KDE is going to play in shaping the SVG standard.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com7tag:blogger.com,1999:blog-27901662.post-3244323892432763252008-08-20T16:39:00.007+01:002008-08-20T17:05:06.542+01:00Fast graphicsInstead of highly popular pictures of llamas, today I'll post a few numbers. Not related to llamas at all. Zero llamas. These will be Qt/KDE related numbers. And there are no llamas in KDE. There's a dragon, but he doesn't hang around with llamas at all. I know what you're thinking: KDE is a multicultural project, surely someone must be chilling with llamas. I said it before and I'll say it again: what an average KDE developer, two llamas, one hamster and five chickens do in the privacy of their own home is none of your business.<br /><br />Let's take a simple application, called <a href="http://ktown.kde.org/%7Ezrusin/examples/qgears2.tar.bz2">qgears2</a>, based on David Reveman's cairogears, and see how it performs with different rendering backends. Pay attention to the zero relation to llamas or any other animals. The application takes a few options: -image, to render using a CPU-based raster engine; -render, to render using X11's Xrender; and -gl, to render using OpenGL (a -llama option is not accepted). It has three basic tests: "GEARSFANCY", which renders a few basic paths with a linear gradient alpha-blended on top; TEXT, which tests some very simple text rendering; and COMPO, which is just composition and scaling of images.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://ktown.kde.org/%7Ezrusin/gears.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 200px;" src="http://ktown.kde.org/%7Ezrusin/gears.png" alt="" border="0" /></a><br /><br />The numbers come from two different machines. 
One is my laptop which is running Xorg server version 1.4.2. Exa is 2.2.0. Intel driver 2.3.2. GPU is 965GM, CPU is T8300 at 2.4GHz running on Debian Unstable's kernel 2.6.26-1.<br />The second machine is running GeForce 6600 (NV43 rev a2), NVIDIA proprietary driver version G01-173.14.09, Xorg version 7.3, kernel 2.6.25.11, CPU is Q6600 @ 2.40GHz (thanks to Kevin Ottens for those numbers, as I don't have an NVIDIA machine at the moment).<br /><br />The results for each test are as follows (higher is better):<br /><table border="1"><caption>GEARSFANCY</caption><br /><tbody><tr><br /><th><br /></th><br /><th>I965</th><br /><th>NVIDIA</th><br /></tr><br /><tr><br /><th>Xrender</th><br /><td>35.37<br /></td><br /><td>44.743<br /></td><br /></tr><br /><tr><br /><th>Raster</th><br /><td>63.41<br /></td><br /><td>41.999<br /></td><br /></tr><br /><tr><br /><th>OpenGL</th><br /><td>131.41<br /></td><br /><td>156.250<br /></td><br /></tr><br /></tbody></table><br /><br /><table border="1"><br /><caption>TEXT<br /></caption><br /><tbody><tr><br /><th><br /></th><br /><th>I965</th><br /><th>NVIDIA</th><br /></tr><br /><tr><br /><th>Xrender</th><br /><td>13.389<br /></td><br /><td>40.683<br /></td><br /></tr><br /><tr><br /><th>Raster</th><br /><td>(incorrect results)<br /></td><br /><td>(incorrect results)<br /></td><br /></tr><br /><tr><br /><th>OpenGL</th><br /><td>36.496<br /></td><br /><td>202.840<br /></td><br /></tr><br /></tbody></table><br /><br /><table border="1"><br /><caption>COMPO<br /></caption><br /><tbody><tr><br /><th><br /></th><br /><th>I965</th><br /><th>NVIDIA</th><br /></tr><br /><tr><br /><th>Xrender</th><br /><td>67.751<br /></td><br /><td>66.313<br /></td><br /></tr><br /><tr><br /><th>Raster</th><br /><td>81.833<br /></td><br /><td>70.472<br /></td><br /></tr><br /><tr><br /><th>OpenGL</th><br /><td>411.523<br /></td><br /><td>436.681<br /></td><br /></tr><br /></tbody></table><br /><br />The COMPO test isn't really fair because, as I mentioned, Qt doesn't use server-side 
picture transformations with Xrender, but it shows that OpenGL is certainly not slow at it.<br /><br />So what these results show is that the GL backend, which hasn't been optimized at all, is between 2 and 6 times faster than anything else out there, and that the pure CPU-based raster engine is faster than the Xrender engine.<br /><br />So if you're on an Intel or NVIDIA GPU, rendering using GL will immediately make your application several times faster. If you're running on a system with no capable GPU then using the raster engine will make your application faster as well.<br />Switching Qt to use the GL backend by default would result in all applications running many times faster. The quality would suffer though (unless the HighQualityAntialiasing mode were used in Qt, in which case it would be the same). This certainly would fix our graphics performance woes and, as a side-effect, allow using GL shaders right on the widgets for some nifty effects.<br />On systems with no GPU the raster engine is a great choice, on everything else GL is clearly the best option.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com23tag:blogger.com,1999:blog-27901662.post-90498458371314015972008-06-27T17:44:00.002+01:002008-06-27T17:52:22.797+01:00Accelerating desktopsIn general I'm extremely good at ignoring emails and blog posts. Next to head-butting, it is one of the primary skills I've developed while working on Free Software. Today I will respond to a few recent posts (all at once, I'm a mass-market responder) about accelerating graphics.<br /><br />Some kernel developers released a statement saying that binary blobs are simply not a good idea. I don't think anyone can argue with that. But this statement prompted a discussion about graphics acceleration, or more specifically a certain vendor who is, allegedly, doing a terrible job at it.<br /><br />First of all, the whole discussion is based on a fallacy, rendering even the most elaborate conclusions void. 
It's assumed that in our graphics stack there's a straightforward path between accelerating an API and fast graphics. That's simply not the case.<br /><br />I don't think it's a secret that I'm not a fan of XRender. Actually "not a fan" is an understatement: I flat out don't like it. You'd think that the fact that 8 years after its introduction we still don't have any driver that is actually really good at accelerating that "simple API" would be a sign of something... anything. When we were making Qt use more of the XRender API, the only way we could do that was by having Lars and me go and rewrite the parts of XRender that we were using. So what happened was that instead of depending on XRender being reasonably fast, we rewrote the parts that we really needed (which is realistically just the SourceOver blending) and did everything else client side (meaning not using XRender).<br /><br />Now going back to benchmarking XRender. Some people pointed out an application I wrote a while back to benchmark XRender: please do not use it to test the performance of anything. It does not correspond to any real workloads. (Also, if you're taking something I wrote to prove some arbitrary point, it'd likely be a good idea to ping me and ask about it. You know, on account of writing it, I just might have some insight into it.) The thing about XRender is that there's a large number of permutations for every operation. Each graphics framework which uses XRender uses specific, defined paths. For example Qt doesn't use server-side transformations (they were just pathetically slow and we didn't feel it would be in the best interest of our users to make Qt a lot slower), Cairo does. Accelerating server-side transformations would make Cairo a lot faster, and would have absolutely no effect on Qt. 
So whether those tests pass in 20ms or 20 hours has 0 (zero) effect on Qt performance.<br /><br />What I wanted to do with the XRender performance benchmarking application was basically to have a list of operations that need to be implemented in a driver to make Qt, Cairo or anything else using XRender fast. A "to make KDE fast, look at the following results" type of thing. So the bottom line is that if one driver has, for example, a result of 20ms for Source and SourceOver and 26 hours for everything else, and there's a second driver that has 100ms for all operations, it doesn't mean that on average driver two is a lot better for running KDE; in fact it likely means that running KDE will be five times faster on driver one.<br /><br />Closed-source drivers are a terrible thing and there are a lot of reasons why vendors would profit immensely from having open drivers (which is possibly a topic for another post). Unfortunately I don't think that blaming driver writers for not accelerating a graphics stack which we went out of our way to make as difficult to accelerate as possible is a good way of bringing that point forward.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com29tag:blogger.com,1999:blog-27901662.post-60550061510279798292007-10-23T23:57:00.000+01:002007-10-24T00:11:10.793+01:00KHTML futureI've read <a href="http://blog.froglogic.com/2007/10/the-khtml-future-faq/">Harri's</a> blog about WebKit and I figured it makes sense for someone to respond. First of all, I liked the blog. It was full of drama, action, despair, marketing and bad and good characters. Which is really what I'm looking for when reading fiction.<br /><br />Especially the part that mentioned QtWebKit as an irrelevant fork of the KHTML sources. That was awesome. It's the kind of imagination we need more of in the blogosphere. 
For the purposes of the point Harri was trying to make, which I think was "no matter what the reality is, our ego is bigger than yours", it was a well-suited argument.<br /><br />Describing the WebKit project as a fork of the KHTML sources is like calling GCC a fork of EGCS, or to use a more popular analogy, it's like calling a chicken a fork of an egg. If you want to talk about forks then technically nowadays KHTML is a fork of WebKit. Not a terribly good one at that. It's really easy to back that statement up by comparing the number of commits to KHTML to the number of commits to WebKit. In fact that comparison is just embarrassing for KHTML.<br /><br />I also found it funny that people like Lars Knoll, Simon Hausmann, George Staikos or myself are not part of the KHTML team. "We are the 'KHTML team' (except KHTML's author and ex-main developer Lars, who's one of the biggest supporters of WebKit now, and other people who used to work on KHTML but now work on WebKit as well... but they were all ugly... honestly!)" - you can go make shirts with that.<br />We're working on WebKit now, hence we're not KHTML team members. Any KDE developer who works on WebKit (hey, Niko, Rob, Adam, Enrico...) is automatically dissociated from the KHTML team.<br /><br />The fact is that there are more KDE developers contributing to WebKit than there are KDE developers contributing to KHTML.<br /><br />So since there's more of us, I think technically that means that we are the official KDE web engine team. KHTML team, we would love to work with you, the fork, but you're kind of a pain in the butt to deal with.<br /><br />Which is OK, because like I mentioned a number of times, the KDE community lives by the "who does the work decides" dogma. 
And ultimately the Apple guys, the Trolltech guys, the people from George's company who work on this stuff full-time and tons of Free Software contributors working on WebKit do much, much more work than people do on KHTML.<br /><br />On a more serious note, let me explain a very important point: bug-for-bug compatibility with the latest Safari would be worth much, much more to KDE than any patches that are in KHTML and haven't yet been merged to WebKit could ever be worth.<br />The web works on the principle of percentages - web designers test their sites with engines that have X% of market reach. Konqueror with stock KHTML isn't even on their radar. WebKit is. Having web designers cater to their engine is worth more than gold to KDE users.<br /><br />And if you care more about some personal grudges than the good of KDE, that's also OK, because we, the official KDE web rendering team, will do what's right for KDE and use WebKit.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com37tag:blogger.com,1999:blog-27901662.post-84838195084147975282007-08-20T10:20:00.000+01:002007-08-20T10:27:42.744+01:00Small stepsI was on vacation last week but I'm all jealous of my luggage. It got a free trip around the world. During my last 8 flights my luggage has been lost 5 times. Is that a record? Confetti anyone? It's a celebration. If you're going to meet me during any of the upcoming conferences I'll be the outgoing and highly sarcastic naked guy with a sign on my chest saying "for my face look this way" and an arrow pointing up.<br /><br />I neglected to mention that, <a href="http://labs.trolltech.com/blogs/2007/07/31/qtwebkit-on-windows/">as Simon said</a>, QtWebKit is working on Windows. Simon did an amazing job of porting all the quirks of the build system, but "amazing" is the default state for all of his code so it's not a surprise at all. While he was doing that I sat down and ported the XML tokenizer from LibXML to QXmlStream. 
If you've never written a web rendering tokenizer (and unless you're crazy the chances of that are pretty high, and if you did you're crazy and won't remember doing it anyway) you should know that "fragile" is a term that describes it nicely. After it was ported, Lars and I sat down to fix the regressions and they didn't even know what hit them (ha! ninja reference).<br /><br />In other news I've merged the FreeType2 rasterization algorithm patches into Qt. Our raster engine uses the beauty that is FreeType's rasterizer, with a few patches on top. Because they break binary compatibility in FreeType's public interfaces we can't merge them back at the moment. In any case the patches improve the rendering speed of general antialiased paths in the raster engine (meaning on Windows, Qtopia Core and in general whenever rendering to a QImage) by about 10%, which is gangsta awesome ("gangsta awesome" is a very high level of awesomeness, at least judging from MTV).<br /><br />I've also optimized the path clipping code. Andreas uses the path clipping code in GraphicsView for collision detection, so when I say "path clipping code" you should read "path clipping and GraphicsView collision detection". A lot of the time in that algorithm was spent on vertex allocation for the tested paths. I've used a few tricks to speed it up by about 15%. The code for that algorithm is the number two reason why baby seals die (<a href="http://www.hsus.org/protect_seals.html">the first is still undisputed</a>). It's not even the algorithm itself but the inherent complexity of the problem. I'm a big fan of computational geometry in computer graphics because it makes grown men cry. Except me - I like feeling like the lean, mean killing machine that I am. My favorite part of the path clipping problem is that there are two ways of solving the precision problems and neither of them really works. 
The trick is that paths operate in a double-precision coordinate system, while the efficient snap-rounding implementations that I've seen operate in a fixed-point coordinate system, which falls apart in this case because of the absolutely random distribution of vertices across the full double spectrum. Tessellation and clipping itself can be done in a screen coordinate system, which makes it possible to consistently represent your coordinates with a fixed-point representation. That doesn't work for paths because, e.g., boolean operations on paths need to be done in native path coordinates, not screen coordinates. So the algorithm forces an absolutely crazy mix of dynamic fixed-point sizes, reduced predicates, magic and goodwill to work. Aren't you happy that I'm doing it for you? You better be.<br /><br />Yours(1) Latino(2) Lover(3)<br /><br />1) Not really "yours", more "community's". I love "you" but "you" need to realize that I need to be seeing other people.<br />2) Not really "Latino". Unless of course my <a href="http://conference2005.kde.org/">Spanish</a> or <a href="http://www.bossaconference.org/">Brazilian</a> friends would like to name me an "honorary Latino" or a "Latino by association". I'd definitely be down with that. The only food I can make that is edible and doesn't force the fire department to evacuate the building first is nachos. I'm the definition of grace in the kitchen. "Whatever you have in the kitchen I will make it burn" is my motto. Plus I'm sporting quite an attitude to boot. "Make Zack a Latino" campaign. We can make it work!<br />3) Not really "lover". More "no feelings haver". 
Though technically I've worked on software for so long that hate is, next to sarcasm, my primary export.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com3tag:blogger.com,1999:blog-27901662.post-56955110017475281572007-07-22T14:35:00.001+01:002007-07-22T14:51:55.636+01:00Web on canvas and Dashboard widgetsThere are days when I do something quite interesting and in my mind I can almost see myself on a stage in tight, tight spandex pants, long hair, perm, cowboy boots, yelling angrily "are you ready to roock?!". People cheering, babies laughing, women throwing their bras on the stage. It's poetic. Then I remember that I'm a computer scientist and I snap right out of it. I go back to the life filled with math equations on napkins, sleepless nights in front of buzzing computers, stacks of books in corners and no spandex pants (although I can deal with the last one just fine). The fact that I hate rock lessens the blow, but it doesn't make it any less disappointing. So in those moments of sadness I blog, yearning for the attention and approval so readily available on the internet. Cough, cough...<br /><br />I was wondering how hard it would be to create a QGraphicsItem that uses QtWebKit to render pages on a canvas. The idea being that combining a full-blown canvas framework like QGraphicsView with a web rendering engine would give us quite a killer combination. So I sat down today and did it. At first I had to redo some of the rendering code in QtWebKit, and once I was finished I had a QWebGraphicsItem that beautifully renders pages. It being a QGraphicsItem, all the effects available to graphics items in Qt are available to it for free. So you can animate, scale, rotate, perspective-transform and do a whole bunch of neat effects on it for free. Once I'd done that I figured that it's obvious that this is the best way of getting Apple's Dashboard widgets to work. So I've done that too. 
I quickly hacked up a class that reads in Apple Dashboard widget bundles and can render them on a QWebGraphicsItem. The compatibility is not quite 1:1 yet, because some of the Dashboard widgets use JavaScript objects that I haven't implemented yet, like the AddressBook object. To be honest I'm not 100% sure whether I want to implement them; I think we can get those things done a lot nicer, it's just a question of whether 1:1 compatibility with Apple Dashboard is worth the extra effort needed to make all those JavaScript objects work on KDE.<br />First, a screenshot of one Apple Dashboard widget rendered, and on top the KDE homepage scaled to half its size:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/qwebgraphicsitem5.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/qwebgraphicsitem5.png" alt="" border="0" /></a><br />Now a Dashboard widget with a perspective-transformed dot.kde.org page. Since this is QGraphicsView I can interact with the item while it's transformed, so I've selected some text on it.<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/qwebgraphicsitem3.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/qwebgraphicsitem3.png" alt="" border="0" /></a><br />Crackalicious. (No drugs were used while hacking on this, but I did touch myself a little after getting it to work.) Furthermore (yes, there's more... what can I say, I'm a giver...) in QtWebKit we have this neat interface that allows you to inject QObjects into the framework as JavaScript objects at run-time, so adding new JavaScript objects is trivial and getting Opera widgets to work would be very, very simple. 
No spandex pants included though.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com26tag:blogger.com,1999:blog-27901662.post-53119064571279450072007-07-14T15:59:00.000+01:002007-07-15T12:31:57.459+01:00ScripterBefore boarding a flight I eagerly await the security presentation that is about to ensue. I figure that these people spent their valuable time learning how to point to the ground, back, forward and to the side, and someone needs to appreciate that effort. During my last flight I even went ahead and tried to inspect my life jacket. "Tried" because, as it turned out, my seat was missing it. As the security announcement advised, I "calmly" informed the fellow sitting next to me that "in the unlikely event of a water landing" he's screwed, because as the stronger man I'm taking his life jacket. This also prompted me to think about tools that make our life easier (just to clarify, when I say "our" I mean "my"). I spent a little time yesterday creating a tool that will make my (when I say "my" I mean "our") life a lot easier. So today I wanted to tell him (when I say "him" I mean "you") about us (when I say "us" I mean "it").<br /><br />I've spent a lot of time writing simple C++ applications to test out some kind of rendering algorithm. Internally we had a tool that automated a lot of it. The tool uses a very simplistic, regexp-based language to specify commands. I wanted something more powerful. This is how Scripter came to be. Scripter is a very simple application that uses QtScript's bindings to Arthur to do its rendering. It allows for rapid prototyping of algorithms and, most importantly for me, quick testing of Qt's rendering framework. At first it was a whole IDE with its own code editor; very quickly though I decided to remove the editor and just make it a content widget that monitors the file it was opened with for changes. 
The reason for that is that I wanted to keep working in my own editor and just have a dynamic, visual preview of everything I was doing. So with Scripter one can be writing rendering code while the visual effects of the editing are immediately visible. Here are two screenshots of the examples included with it:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/scripter1.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/scripter1.png" alt="" border="0" /></a><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/scripter2.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/scripter2.png" alt="" border="0" /></a><br />But the whole beauty of this application is the ability to create animations while seeing the changes done in real-time. I recorded two demos:<br /><ul><li><a href="http://chaos.troll.no/%7Ezrusin/scripter.ogg">The first one shows me just playing around with an example.</a> In this case it's a freedesktop.org clock.</li><li><a href="http://chaos.troll.no/%7Ezrusin/scripter_demo.ogg">The second shows me writing a simple animation from scratch.</a> This one has the added benefit of seeing me hack in real-time, no copy&pasting, a little bit of chaos (next time I should probably figure out what kind of animation I want to do before I start recording, but oh well, live and learn). Fun. This one is 12MB though.</li></ul>Scripter requires Qt 4.4. If you don't have a Qt 4.4 snapshot it won't work. 
Get it from the SVN at labs.trolltech.com with:<br /><pre> svn co svn://labs.trolltech.com/svn/graphics/scripter<br /><br /></pre>Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com9tag:blogger.com,1999:blog-27901662.post-17561602107606529352007-06-04T11:43:00.001+01:002007-06-04T11:58:46.779+01:00Mirroring widgets"The life of man is divided between waking, dreaming and dreamless sleep." or so it's written in "The Upanishads"... I wouldn't know, because the cartoon version still hasn't been released and I refuse to spend even a second of my life pumping a stream of information into my brain that hasn't been properly sprinkled with commercials and product placement. Which reminds me: use Qt...<br /><br />I can almost see you sitting at your desk with the same expression the great Plato had when he said: "What?" (not one of his greatest quotes, but I'm sure he said it at one point or another). In my last blog I described how the engineering department at Trolltech spent the last few months fixing bugs in, what could be described as, a constant "waking" state. By natural progression the next Qt release is putting us in the "dreaming" state.<br /><br />It's been a while since I last posted an example of how to do something funky, so today I'll partially make up for it. I get a lot of questions asking me how to do something windowing-system specific. People at the office can approximate the exact time of delivery of each one of the emails relating to that topic, as the seismographic vibrations, originating in the vicinity of the area where the table meets with my head (in a repeated and aggressive fashion), cause ripples to appear in coffee mugs around the office. 
Hopefully today's example will satisfy the most vicious desires for X11 wackiness.<br /><br />6'2" (height by association - as measured by the height of the author), weighing in at about 139 lines of code (while wearing the license header), the undisputed (mainly because it's the only one) champion (questionably) of... well, nothing: QX11Mirror. QX11Mirror is a class that can monitor and return the contents of any X11 window in real time. So you could start your favorite media player, pass its window id to QX11Mirror and then render the contents at half the size with a perspective transformation. It would look like this:<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://labs.trolltech.com/images/3/3a/Qx11mirror.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://labs.trolltech.com/images/3/3a/Qx11mirror.png" alt="" border="0" /></a>One thing that you can't see is that the contents of the movie are updated while the movie is playing and the perspective transformation is animating. All of which is done in real-time. It's really cool.<br /><br />The original reason for writing this seemingly silly example was not "making something pretty". Even though the inability to make applications look gorgeous is the number one cause of hair loss (as shown in a study by doctor "me". Note: "me" is not really a doctor. In fact "me" doesn't even fulfill the grammatical requirements of the previous sentences.), which is the number one reason for you not looking gorgeous, and I'm a big proponent of having KDE developers look beautiful. "KDE - we got hair... In all the right places...". (I know... I'm as shocked as you are that I'm not being paid a zillion dollars to do marketing.) No, the original reason for all of this was to make web plugins behave a lot nicer. Currently the problem is that they don't compose correctly within the web rendering tree. 
So what I wanted to do was to correctly fetch the offscreen contents of those windows, render them in the correct stacking order and propagate the events back to them. This goes along with the "make stuff work" ideology, which I consider myself to be a big fan of. Oh, and here's the Flash plugin rendering inside a Qt application (as always in my examples, things are animating inside):<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/webplugins.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/webplugins.png" alt="" border="0" /></a>Oh, and the code is available at <a href="http://labs.trolltech.com/page/Graphics/Examples/Examples2">Graphics Dojo</a>.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com14tag:blogger.com,1999:blog-27901662.post-4485330256089226732007-05-30T18:05:00.000+01:002007-05-30T18:17:40.752+01:00KeyholeToday's blog might sound a little muffled; that's because it's coming straight from the heart. Which is in direct opposition to all of the "What's the Open Source community missing" blogs/articles, which are coming straight from a vastly less prominent body part. To not dwell too much on human anatomy, I'm going to move on to the main topic today, which is peace, love and Qt 4.3. The first two are overrated and got their fair share of treatment in all kinds of literary works, therefore the only reasonable conclusion is that my perky self focuses on Qt today.<br /><br />A number of people surely have already pointed out in their blogs that Qt 4.3.0 has been released. The first thing you'll notice about this release is the version number. Rightfully so, because 4.3.0 is the highest Qt version we've ever released. 4.2.0 was already taken and we felt very strongly about reusing a ".0". 
Although "13.13.0" was available, we really came together as a team/body/unit/crew (pick one) to release 4.3.0.<br /><br />Now, I'm not going to be doing marketing for 4.3.0 (mainly because others are being paid better to do that) or listing the "like totally awesome new features"; what I wanted to do is present the perspective of the people who actually spent days and nights working to make this piece of software the best they could. Whether from the loins of those geeks came something exceptional is a judgement call that I leave to you. This is your keyhole into our world.<br /><br />For us, the main focus of this release was a general increase in the quality of Qt. In the darkness of our meeting rooms we were moving tons of paper, while from time to time lonely tears danced on our cheeks (due to the fact that the light smoke coming from our pens and pencils was irritating our eyes). Immersed in this mysterious darkness, we sat and watched. The crackling of the projector seemed to be the only noise that dared to challenge the unassailable silence. We knew that we wanted to make people smile a little more with this new release. Very early in the release process we also agreed that shipping drugs with Qt was not an option. We turned for help to the happiness champions - <a href="http://en.wikipedia.org/wiki/Care_Bears">Care Bears</a> and <a href="http://en.wikipedia.org/wiki/Popples">Popples</a>. We watched and analyzed. In the very end we decided that while we agree that G.I. Joes are, like, totally cooler (the engineering department is unfortunately male dominated), we knew what we had to do. We came to the conclusion that people seem to be a lot happier if things work the way they planned and there are no unpleasant surprises. That's what we focused on. Fixing bugs, making sure that things work the way they are expected to, all in all improving the quality of Qt. 
If you have ever spent extended periods of time just staring at the code and rerunning tests in order to fix bugs, you know it's a mundane process that takes a lot of concentration. This is what we've been doing for the last few months. Fixing bugs and running around screaming (a lot of screaming, not a whole lot of running) if someone broke one of the tests. Have you ever seen engineers play ping-pong after fixing bugs for weeks? Oh, it's quite a sight. A lot of raw energy (very raw, one could say energy hardly touched by any kind of skill). Flying balls, paddles and often engineers were a common sight late in the evening (low-flying engineers bring bad luck - especially if the flight schedule predicts a landing at "you").<br /><br />No one is more critical of us than we are, but we work together to improve all the things we don't like. While I was thinking about this today I couldn't stop thinking about the Pythagoreans, who, despite achieving many great things, were often described as a group that cherished authority beyond anything else. That approach is completely different from any discussion we have here. Every argument is judged solely based on its soundness, no matter who's making it. I think what I'm trying to say in this, severely not hysterical, paragraph is that one thing you can be sure of is that every decision we have reached and every change we have made in Qt was not due to any kind of hidden agenda, religious beliefs or the beauty of the people pushing for it. It was done because 20+ engineers decided that the arguments for it were stronger than those against it. I realize that because we have those discussions at the office and in person, a lot of their transparency remains hidden behind clouds for people outside Trolltech. 
We're working to improve the flow of information from the heart of Qt's home to the outside, and while we do that, please rest assured that behind the clouds there's a world adhering to the strictest physical and logical rules and not an ocean of strings with muppets drowning underneath.<br /><br />Having said that, we're very happy with Qt 4.3.0, so hopefully you'll enjoy it too.<br /><br />If you want pure Qt release goodness with pictures of the chosen few who get to stare at Qt code until their eyes bleed, make sure you read <a href="http://labs.trolltech.com/blogs/2007/05/30/qt-430-released/">Girish's blog</a>.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com4tag:blogger.com,1999:blog-27901662.post-9505640379990965702007-03-09T10:13:00.000+00:002007-03-09T10:25:56.664+00:00ReflectionsIn the spirit of my never-ending "pimp your Qt application" series comes another example. This time, "how to make an iTunes-like album selector". It's funny how much attention this widget got. I looked at it yesterday on one of the Macs at the office and just implemented it. I probably should make it a view for a list model, but for now it's just a simple widget. It runs at a perfectly smooth 60 frames per second while utilizing ~7% CPU on my 3GHz Pentium 4 with NVIDIA GeForce 6600, so by no means a monster of a machine. Oh, and of course this is all done with pure Qt. Reflections are actually a vector effect, so they would work equally well for any vector-based graphics (by using the same code as this example, you can reflect your SVGs or even whole widgets with no problem). 
Mandatory screenshot:<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/reflections.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/reflections.png" alt="" border="0" /></a>And as with every animated example, a movie (again, the framerate of the screen capture does not come close to the real-world performance) is available <a href="http://chaos.troll.no/%7Ezrusin/reflections.ogg">here</a>. Finally, the code is available <a href="http://ktown.kde.org/~zrusin/examples/browser.tar.bz2">here</a>.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com12tag:blogger.com,1999:blog-27901662.post-85160215192530450502007-03-07T12:02:00.000+00:002007-03-07T12:15:09.295+00:00Gradient boundsThe most unfriendly thing about rendering gradients with Qt has been the fact that you had to specify gradients in the coordinates of the shape they were to be rendered on. It wasn't ideal, especially for applications which included any kind of animation or rendered a large number of items, because it meant that the gradient had to be created individually for each and every item/frame. I fixed that two days ago by adding a coordinate-mode property to QGradient, which now accepts ObjectBoundingMode. Object bounding mode, just like in SVG, means that the gradient coordinates are percentages of the bounding box of the shape that the gradient is about to fill. So all the coordinates are between 0 and 1, and Qt automatically adjusts the gradient's bounds when it's being rendered. This makes it possible to easily use QGradients with QPalette. 
Here's an example where I'm drawing a bunch of rectangles and animating them, while the gradient is set only once with a (0, 0, 1, 1) coordinate box (meaning it starts at the top-left and ends at the bottom-right corner of each rectangle).<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://chaos.troll.no/%7Ezrusin/gradientbounds.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px;" src="http://chaos.troll.no/%7Ezrusin/gradientbounds.png" alt="" border="0" /></a>And since it's an animation (including a widget show/hide effect), here's a <a href="http://chaos.troll.no/%7Ezrusin/gradientbounds.ogg">movie showing it in action</a>.Zackhttp://www.blogger.com/profile/16222054590923441165noreply@blogger.com8