I've spent a little bit of time on our tessellator again. It seems that every time I do that another problem pops up. It's such a trivial algorithm that it's starting to become irritating. I'm getting more and more motivated to write a paper about it. There's a number of very fundamental issues here.
In the basic case, meaning simple polygons, tessellation is trivial and there are papers describing techniques to do in linear time like this or this. The problem is that most polygons that we currently decompose are not simple. Self-intersecting polygons with holes are very common and that's where most of our algorithms falls apart.
There are two papers describing handling such complex polygons: An implementation of a Near-Linear Polygon Triangulation Algorithm for General Polygons and
FIST: Fast Industrial-Strength Triangulation of Polygons. The detail that is usually not mentioned is that handling self-intersecting polygons in those algorithms has severe effects on their complexity (from nearly-linear to n^2*log*n).
Which leads to trivial conclusion #1: if tessellation of simple polygons is almost linear then would it be faster to unwind complex polygons and tessellate created this way simple polygons rather than tessellate directly one complex polygon. The only challenging aspect of unwinding polygons is finding the points where the polygon self-intersects. As a side-effect we venture into a lot better researched part of computational geometry - segment intersections. We can apply Bentley&Ottmann algorithm between the edges of a polygon and mix it with any of the snap-rounding techniques to try to handle precision problems.
Now this operates in nlogn but the catch is that when rendering large number of simple non-self-intersecting shapes we do a whole lot of work for absolutely nothing.
That in turn leads to conclusion #2: when rendering large amounts of simple primitives our fancy tessellation algorithm falls apart because it does a lot of work for absolutely nothing. Algorithmically there's nothing one can do here, you have to have a little bit of higher-level knowledge that can help you figure out whether the following primitive to be rendered is simple or not.
And now my latest thorn: when rendering complex polygons we generate a lot of trapezoids (due to the fact that almost every vertex forces generation) which after rasterization all fit within one scanline (happens very often when rendering on a big viewport mapped to a small window). So when the server is rasterizing those trapezoids it does a lot of work for nothing. The horizontal red spans in the following image show trapezoids which visually won't add anything to the rendering (after rasterization of them that is).
I'm still a little bit torn as to at which step should merging of those trapezoids actually happen.