Thursday, March 12, 2009

KDE graphics benchmarks

This is really a public service announcement. KDE folks, please stop writing "graphics benchmarks". It's especially pointless if you qualify your blog/article with "I'm not a graphics person but...".

What you're doing is:

timer start
issue some draw calls
timer stop

This is completely and utterly wrong.
I'll give you an analogy that should make it a lot easier to understand. Let's say you have a 1MB and a 100MB LAN and you want to write a benchmark to compare how fast you can download a 1GB file on both of those, so you do:

timer start
start download of a huge file
timer stop

Do you see the problem? Obviously the file hasn't been downloaded by the time you stopped the timer. What you effectively measured is the speed at which you can execute function calls.
And yes, while your original suspicion that the 100MB line is a lot faster is still likely true, your test in no way proves that. In fact it does nothing short of making some poor individuals very sad due to the state of computer science. Also "So what that the test is wrong, it still feels faster" is not a valid excuse, because the whole point is that the test is wrong.
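
A minimal sketch of the same mistake in Python, assuming a made-up `FakeAsyncRenderer` that queues commands and processes them on a worker thread. Nothing here is a real graphics API: the class, its `draw` and `sync` methods, and the 20 ms cost per call are all hypothetical, chosen only to show how far apart "the calls returned" and "the work finished" can be.

```python
import queue
import threading
import time


class FakeAsyncRenderer:
    """Hypothetical stand-in for an asynchronous graphics pipeline."""

    def __init__(self, cost_per_call=0.02):
        self._cost = cost_per_call
        self._commands = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            self._commands.get()
            time.sleep(self._cost)  # pretend to touch pixels
            self._commands.task_done()

    def draw(self):
        # Returns immediately: the command is only queued.
        self._commands.put("draw")

    def sync(self):
        # Blocks until every queued command has been processed.
        self._commands.join()


def measure(calls=10):
    renderer = FakeAsyncRenderer()

    start = time.perf_counter()
    for _ in range(calls):
        renderer.draw()
    # This is what the "timer start / draw / timer stop" benchmark reports.
    dispatch_time = time.perf_counter() - start

    renderer.sync()
    # This also includes the time spent doing the queued work.
    total_time = time.perf_counter() - start
    return dispatch_time, total_time


if __name__ == "__main__":
    dispatch_time, total_time = measure()
    print(f"dispatch only: {dispatch_time * 1000:.2f} ms")
    print(f"with sync:     {total_time * 1000:.2f} ms")
```

The first number mostly reflects the cost of putting items on a queue; only the second one says anything about the simulated rendering.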

To give your tests some substance, always make your applications run for at least a few seconds, reporting the frames per second.
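
A rough sketch of such a loop in Python. The `measure_fps` helper and the `render_frame` callback are hypothetical names, and the sleep used in the example is only a stand-in for real drawing. The assumption is that over a run of several seconds the queued work has to be carried out for the loop to keep going, so the average is dominated by rendering rather than by the cost of issuing calls.

```python
import time


def measure_fps(render_frame, min_seconds=3.0):
    """Call render_frame repeatedly for at least min_seconds.

    render_frame is a hypothetical callback that submits one full frame.
    Returns the average number of frames per second over the whole run.
    """
    frames = 0
    start = time.perf_counter()
    while True:
        render_frame()
        frames += 1
        elapsed = time.perf_counter() - start
        if elapsed >= min_seconds:
            return frames / elapsed


if __name__ == "__main__":
    # Stand-in workload: roughly 5 ms per "frame".
    fps = measure_fps(lambda: time.sleep(0.005))
    print(f"{fps:.1f} fps")
```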

Or even better, don't write them. Somewhere out there, some people, who actually know what's going on, have those tests written. And those people, who just happen to be "graphics people", have reasons for making certain things default. So while you may think you've made this incredible discovery, you really haven't. There's a kde-graphics mailing list where you can pose graphics questions.
So to the next person who wants to write a KDE graphics related blog/article: please, please go through the kde-graphics mailing list first.


daniels said...

Get yo own style, son.

Anonymous said...

You mean kde-graphics-devel? Where people convene to discuss patches to ksnapshot or the maintainership of kruler? With the last message two months ago?

That's the august body that we should apply to for permission to write something on our blogs?

Ok, will indubitably do.

Zack said...

Yea, you're right, it's pointless. "Correct" or "true" are obviously not values we want to be supporting.

Anonymous said...

I don't know how many blog entries have been written to planet KDE with benchmarks, but the one made by Alexander Dymo has some explanation in the comments that you didn't reply to: the second call doesn't return until it is finished, so the timer is not stopped until it is done.

He also comments that the code tried to isolate what Konsole or Kate do.

Zack said...

I refuse to read blogs and respond to every comment on them, especially since I don't have time to do even that on my own blog.
But to reiterate once again:
timer start
doSomething
timer stop
doesn't measure the time of doing doSomething. That's because graphics is asynchronous.
So doSomething dispatches the call, but whether or not any pixel has actually been touched in the process is arbitrary. So waiting with ending the timer until the call returns is pointless because, again, graphics is asynchronous and a function call return doesn't mean or signify anything (unless that call does synchronization/fencing of actual graphics, which in turn likely kills many performance optimizations that are otherwise done with it).

Anonymous said...

Ok, can you help me then? As I said, I'm no graphics person, all I know is that my code takes 200ms to execute using the intel driver, 50ms to execute using the vesa driver, and 1ms to execute using the raster graphics system. Obviously there is a problem somewhere in the 'stack' (and I accept that it may be my shoddy code). Any ideas? Feel free to contact via email.

Zack said...

In your particular example there's a few problems. Some of them are in Qt, most of them in Xrender.
When you turn on anti-aliasing in Qt on X11, Qt will render lines and points like it does paths, with XRenderCompositeTrapezoids, which is incredibly slow. It would work, but the Qt coordinate system samples from pixel centers, so a line (horizontal/vertical) that goes from 1-2 actually goes from 1.5-2.5 in pixel terms, meaning that if you turn on anti-aliasing the resulting line spans two screen pixels and Qt will try to anti-alias it. You can fix it two ways: one, by turning off anti-aliasing for horizontal/vertical lines; two, by shifting the coordinate system by 0.5, 0.5.
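
To make the half-pixel arithmetic concrete, here is a small one-dimensional sketch in Python. It is a simplified model and not Qt's rasterizer: `pixel_coverage` is a hypothetical helper that assumes pixel i occupies the interval [i, i + 1) and reports how much of each pixel a one-unit-wide span covers.

```python
import math


def pixel_coverage(start, width=1.0):
    """Return {pixel_index: covered_fraction} for the span [start, start + width).

    Assumes pixel i occupies the interval [i, i + 1) on the device grid.
    """
    end = start + width
    coverage = {}
    for pixel in range(math.floor(start), math.ceil(end)):
        overlap = min(end, pixel + 1) - max(start, pixel)
        if overlap > 0:
            coverage[pixel] = overlap
    return coverage


if __name__ == "__main__":
    # Span from 1.5 to 2.5: two pixels, each half covered.
    print(pixel_coverage(1.5))  # {1: 0.5, 2: 0.5}
    # Same span shifted by 0.5: exactly one fully covered pixel.
    print(pixel_coverage(2.0))  # {2: 1.0}
```

In this model the unshifted span leaves two partially covered pixels for the anti-aliasing code to blend, while the shifted one lands on a single whole pixel.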

The other obvious optimization is, instead of issuing a large number of drawSomething calls, to issue one drawSomethings call, passing in the coordinates for all the lines/points in it (where Something is [Lines|Points]).
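
A toy illustration of the batching idea in Python. Everything here is hypothetical: `FakeDisplayConnection` only counts requests, and the assumption that each draw call costs one request is part of the model, not a statement about how Qt or X11 actually behave.

```python
class FakeDisplayConnection:
    """Hypothetical connection that counts requests sent to a display server."""

    def __init__(self):
        self.requests = 0

    def send(self, primitives):
        # One request per call, no matter how many primitives it carries.
        self.requests += 1


def draw_one_by_one(conn, lines):
    # Analogous to calling drawSomething once per primitive.
    for line in lines:
        conn.send([line])


def draw_batched(conn, lines):
    # Analogous to a single drawSomethings call with all the coordinates.
    conn.send(list(lines))


if __name__ == "__main__":
    lines = [((0, y), (100, y)) for y in range(1000)]

    slow = FakeDisplayConnection()
    draw_one_by_one(slow, lines)

    fast = FakeDisplayConnection()
    draw_batched(fast, lines)

    print(slow.requests, fast.requests)  # 1000 1
```

Whatever fixed overhead a call carries is paid a thousand times in the first case and once in the second.
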
But again, mailing lists are a better forum for those discussions, so please understand if I won't respond here anymore.

Anonymous said...

Coincidentally, this isn't limited to graphics benchmarks;
Mozilla's Dromaeo JS/DOM benchmark suffers from a related issue. Basically, their benchmark looks like this:

1) Ask browser to draw stuff
2) Start timer
3) Do some javascript work
4) Stop timer.

The problem, of course, is that it's quite possible that X will be handling the drawing from step 1 in the middle of step 3, so on a slow/single-core system, one often ends up with measurements with something like +/-30% error margins.


test name said...

Does this also apply to painting on a QImage? I understood that when painting on a QPixmap it would go off to the server so a benchmark like that is pretty pointless, but when painting on raster/QImage it would actually be how long it took to paint the image. Is this incorrect?

Anonymous said...

So the gap between Qt's raster and X11 engine in Adam's example is probably even bigger, isn't it?

Anonymous said...

Heh, no, for the blog post I did on the N810 the benchmark was written by Zack Rusin and had the fps :)

For qpaintengine_raster the calls are synchronous.


Zack said...

@Adam: I think the person meant Adam Pigg.

As to the QImage/qpaintengine_raster synchronism it's only true if you don't display the actual image.
Obviously the overall cost of rendering to an image is "the cost of rendering the image" + "the cost required to send the image to the display device and display it". The second part is asynchronous.