I remember that the OpenGL occlusion query code was especially glitchy-- it had a bad tendency to make things become invisible for no apparent reason! Context: I wrote Optimine (early graphics optimization mod) and MCRegion (optimized file format that became the Region/Anvil formats).
I'm not sure why they're doing a full flood-fill-- why compute connectivity instead of visibility? Knowing the answers to "entering through face A, can a ray exit through face B?" should solve the same problem and also be easier to compute.
Hey, generalizing the answer like that is nice; the flood fill is definitely an approximation of that.
However, keep in mind that the flood fill is run only once, not per frame, so you have to check that "any ray from face A can exit through any point of face B", and I'm not sure that is easier to compute.
I'm sure there is some way to get closer to this than a flood fill, but for 0.9 it had to be good enough!
For the actual occlusion test, you have a "don't go backwards" rule. If you have to go backwards to reach a chunk, you know it's not visible along that path (but may be on a different path, which you will eventually find). Without having thought about it a whole lot, does that work for the inside-a-chunk tests too? If you have to traverse an edge facing back towards the face you're coming from, does that mean there's no visible path from the first edge to the second? And if so, would that give a meaningful reduction in connectivity?
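For concreteness, here is a minimal sketch (in Java, with invented names like ChunkConnectivity; not the game's actual code) of the render-time walk being described: a breadth-first search over the chunk grid, starting at the camera's chunk, that only steps through face pairs the precomputed connectivity allows and never steps back against a direction already travelled on that path. Frustum culling is omitted for brevity.

    import java.util.ArrayDeque;
    import java.util.HashSet;
    import java.util.Set;

    class ChunkVisibilityWalk {

        interface ChunkConnectivity {
            // true if a transparent path inside chunk (cx,cy,cz) joins enterFace and exitFace
            boolean connects(int cx, int cy, int cz, int enterFace, int exitFace);
        }

        // faces 0..5 = -X,+X,-Y,+Y,-Z,+Z; XOR with 1 gives the opposite face
        private static final int[][] DIR = {{-1,0,0},{1,0,0},{0,-1,0},{0,1,0},{0,0,-1},{0,0,1}};

        static Set<Long> visibleChunks(ChunkConnectivity conn, int startX, int startY, int startZ, int radius) {
            Set<Long> visible = new HashSet<>();
            Set<Long> seenStates = new HashSet<>();
            ArrayDeque<int[]> queue = new ArrayDeque<>();

            // state = {x, y, z, enterFace (-1 for the camera chunk), mask of directions travelled}
            queue.add(new int[]{startX, startY, startZ, -1, 0});
            visible.add(chunkKey(startX, startY, startZ));

            while (!queue.isEmpty()) {
                int[] s = queue.poll();
                int x = s[0], y = s[1], z = s[2], enterFace = s[3], travelled = s[4];

                for (int exitFace = 0; exitFace < 6; exitFace++) {
                    // "don't go backwards": skip if the opposite direction was already travelled on this path
                    if ((travelled & (1 << (exitFace ^ 1))) != 0) continue;
                    // inside this chunk, the face we entered through must connect to the exit face
                    if (enterFace >= 0 && !conn.connects(x, y, z, enterFace, exitFace)) continue;

                    int nx = x + DIR[exitFace][0], ny = y + DIR[exitFace][1], nz = z + DIR[exitFace][2];
                    if (Math.abs(nx - startX) > radius || Math.abs(ny - startY) > radius
                            || Math.abs(nz - startZ) > radius) continue;

                    int enterNext = exitFace ^ 1;                    // we enter the neighbour through its opposite face
                    int travelledNext = travelled | (1 << exitFace);
                    long stateKey = (chunkKey(nx, ny, nz) << 9) | ((long) enterNext << 6) | travelledNext;
                    if (!seenStates.add(stateKey)) continue;

                    visible.add(chunkKey(nx, ny, nz));
                    queue.add(new int[]{nx, ny, nz, enterNext, travelledNext});
                }
            }
            return visible;
        }

        // pack three 16-bit chunk coordinates into one long (plenty for a render-distance window)
        private static long chunkKey(int x, int y, int z) {
            return ((long) (x & 0xFFFF) << 32) | ((long) (y & 0xFFFF) << 16) | (z & 0xFFFF);
        }
    }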
There's a problem with that, I think. If I understand correctly, the inside-a-chunk flood fills are directionless, so this wouldn't work.
Basically it performs a flood fill on contiguous groups of all the transparent blocks in the chunk. If any group touches two sides, those sides are considered able to see each other. Keep in mind this is precomputed and doesn't have a "looking-at" direction.
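For illustration, here's a minimal sketch of that precomputation (assuming 16x16x16 render chunks and invented helper names; not Mojang's actual code): flood-fill each contiguous group of transparent blocks, note which of the six faces the group touches, and mark every pair of touched faces as mutually visible.

    import java.util.ArrayDeque;

    class ChunkFacePrepass {
        static final int SIZE = 16;

        /** transparent[x][y][z] = true for air/glass/etc.; returns a bitmask over the 15 unordered face pairs */
        static int computeFaceConnectivity(boolean[][][] transparent) {
            boolean[][][] visited = new boolean[SIZE][SIZE][SIZE];
            int connectivity = 0;
            for (int x = 0; x < SIZE; x++) {
                for (int y = 0; y < SIZE; y++) {
                    for (int z = 0; z < SIZE; z++) {
                        if (!transparent[x][y][z] || visited[x][y][z]) continue;
                        int faces = floodFill(transparent, visited, x, y, z);
                        // every pair of faces this group touches can "see" each other through the chunk
                        for (int a = 0; a < 6; a++)
                            for (int b = a + 1; b < 6; b++)
                                if ((faces & (1 << a)) != 0 && (faces & (1 << b)) != 0)
                                    connectivity |= 1 << pairIndex(a, b);
                    }
                }
            }
            return connectivity;
        }

        /** flood-fills one transparent group, returns a bitmask of the chunk faces it touches */
        private static int floodFill(boolean[][][] transparent, boolean[][][] visited, int sx, int sy, int sz) {
            int faces = 0;
            ArrayDeque<int[]> stack = new ArrayDeque<>();
            stack.push(new int[]{sx, sy, sz});
            visited[sx][sy][sz] = true;
            while (!stack.isEmpty()) {
                int[] p = stack.pop();
                int x = p[0], y = p[1], z = p[2];
                if (x == 0) faces |= 1;   if (x == SIZE - 1) faces |= 2;   // -X, +X
                if (y == 0) faces |= 4;   if (y == SIZE - 1) faces |= 8;   // -Y, +Y
                if (z == 0) faces |= 16;  if (z == SIZE - 1) faces |= 32;  // -Z, +Z
                int[][] n = {{x-1,y,z},{x+1,y,z},{x,y-1,z},{x,y+1,z},{x,y,z-1},{x,y,z+1}};
                for (int[] q : n) {
                    int nx = q[0], ny = q[1], nz = q[2];
                    if (nx < 0 || ny < 0 || nz < 0 || nx >= SIZE || ny >= SIZE || nz >= SIZE) continue;
                    if (!transparent[nx][ny][nz] || visited[nx][ny][nz]) continue;
                    visited[nx][ny][nz] = true;
                    stack.push(new int[]{nx, ny, nz});
                }
            }
            return faces;
        }

        /** index 0..14 for an unordered face pair (a < b) */
        private static int pairIndex(int a, int b) {
            return a * 6 + b - ((a + 1) * (a + 2)) / 2;
        }
    }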
Perhaps you could perform this test six times, one for each face. Then you would have a direction.
Partly answering my own question after mulling it over: the difference in the pre-pass is that you're not actually coming "from" an edge or going "to" one when you're doing the flood fills, you're just testing which edges are connected to any given empty cell.

But I still think there might be something to this idea. The simplest thing I can think of is to do no-backwards searches out from all the empty cells on each edge and see which other edges you reach. That's potentially 6x as many searches (each slightly smaller, and with some early-outs since the connectivity is bi-directional), but at least it's still finite and predictable and isn't who-knows-how-many raycasts. I don't know if that's acceptable for the pre-pass or not. I also think there might be smarter ways to do it that only require a single search from each empty cell but remember the directions traversed, so you know whether a particular path you've reached an edge along implies visibility or not.
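As a hedged sketch of the simplest reading of that idea (invented names, not from the article): for each of the six faces, flood-fill from every transparent cell on that face while never stepping back out in the source face's direction, and record which other faces the search reaches.

    import java.util.ArrayDeque;

    class DirectedFacePrepass {
        static final int SIZE = 16;
        // faces 0..5 = -X,+X,-Y,+Y,-Z,+Z; DIR[f] is the outward direction of face f
        static final int[][] DIR = {{-1,0,0},{1,0,0},{0,-1,0},{0,1,0},{0,0,-1},{0,0,1}};

        /** For each source face, the bitmask of faces reachable without ever stepping back towards it. */
        static int[] compute(boolean[][][] transparent) {
            int[] reachable = new int[6];
            for (int face = 0; face < 6; face++) {
                boolean[][][] visited = new boolean[SIZE][SIZE][SIZE];
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                // seed with every transparent cell lying on the source face
                for (int a = 0; a < SIZE; a++) {
                    for (int b = 0; b < SIZE; b++) {
                        int[] c = cellOnFace(face, a, b);
                        if (transparent[c[0]][c[1]][c[2]]) { visited[c[0]][c[1]][c[2]] = true; queue.add(c); }
                    }
                }
                int touched = 0;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    touched |= facesTouched(p[0], p[1], p[2]);
                    for (int d = 0; d < 6; d++) {
                        if (d == face) continue;  // no-backwards rule: never step back towards the source face
                        int nx = p[0] + DIR[d][0], ny = p[1] + DIR[d][1], nz = p[2] + DIR[d][2];
                        if (nx < 0 || ny < 0 || nz < 0 || nx >= SIZE || ny >= SIZE || nz >= SIZE) continue;
                        if (!transparent[nx][ny][nz] || visited[nx][ny][nz]) continue;
                        visited[nx][ny][nz] = true;
                        queue.add(new int[]{nx, ny, nz});
                    }
                }
                reachable[face] = touched & ~(1 << face);  // don't count the source face itself
            }
            return reachable;
        }

        private static int facesTouched(int x, int y, int z) {
            int f = 0;
            if (x == 0) f |= 1;   if (x == SIZE - 1) f |= 2;
            if (y == 0) f |= 4;   if (y == SIZE - 1) f |= 8;
            if (z == 0) f |= 16;  if (z == SIZE - 1) f |= 32;
            return f;
        }

        private static int[] cellOnFace(int face, int a, int b) {
            switch (face) {
                case 0: return new int[]{0, a, b};
                case 1: return new int[]{SIZE - 1, a, b};
                case 2: return new int[]{a, 0, b};
                case 3: return new int[]{a, SIZE - 1, b};
                case 4: return new int[]{a, b, 0};
                default: return new int[]{a, b, SIZE - 1};
            }
        }
    }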
Maybe you can do the 6 searches in parallel, such that whenever they meet you know you've found a connection between the two edges they were coming from? I don't know if that means any less work except in cases where all 6 edges are trivially connected.
That's a fair point. The interior shape of most of the chunks you're handling is mostly convex anyway, so the flood fill's conservative visibility determination will usually match the precise answer.
Glitchiness of occlusion queries may be driver or hardware dependent. I've been using them successfully, but haven't tested on a variety of hardware. The bigger issue is the two-way communication between GPU and CPU. You want your pipeline to be one way; otherwise you either have stalls waiting for results (poor or inconsistent framerates), or things become visible one or more frames later than they should. For some applications, the latter is acceptable. For Minecraft-style chunks, it might mean large amounts of terrain sometimes popping in a frame or two too late, which could be very noticeable.
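To make the trade-off concrete, here is a hedged sketch (using LWJGL 3-style GL15 bindings; the Chunk type and its render methods are placeholders, not actual Minecraft classes) of the usual way to avoid the stall: keep last frame's answer and only read a query back once GL_QUERY_RESULT_AVAILABLE reports it is ready, accepting that geometry may appear a frame or two late.

    import org.lwjgl.opengl.GL15;

    class LazyOcclusionQuery {
        interface Chunk {
            void renderBoundingBox();  // cheap proxy geometry used for the query
            void render();             // real chunk geometry
        }

        private final int queryId = GL15.glGenQueries();
        private boolean queryInFlight = false;
        private boolean lastKnownVisible = true;  // assume visible until proven otherwise

        void renderWithQuery(Chunk chunk) {
            // 1. Harvest a previously issued query only if its result is already available;
            //    otherwise keep using the stale answer (geometry may pop a frame late).
            if (queryInFlight
                    && GL15.glGetQueryObjecti(queryId, GL15.GL_QUERY_RESULT_AVAILABLE) != 0) {
                lastKnownVisible = GL15.glGetQueryObjecti(queryId, GL15.GL_QUERY_RESULT) != 0;
                queryInFlight = false;
            }

            // 2. Issue a new query against the chunk's bounding box when none is pending.
            if (!queryInFlight) {
                GL15.glBeginQuery(GL15.GL_SAMPLES_PASSED, queryId);
                chunk.renderBoundingBox();
                GL15.glEndQuery(GL15.GL_SAMPLES_PASSED);
                queryInFlight = true;
            }

            // 3. Draw the real geometry based on the (possibly frames-old) answer.
            if (lastKnownVisible) {
                chunk.render();
            }
        }
    }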
It seems like a combination of both might be feasible. Do something smart like this article does on the CPU, and feed that into a greatly reduced set of occlusion queries, small enough that they perhaps don't introduce significant stalls.
However, in this article's case, he's specifically targeting the mobile edition of Minecraft, where hardware occlusion queries are not yet universally available! So fast CPU versions are definitely warranted, and I think he's got a really good solution.
Just FYI, this is a very specific optimization for MC. Normally you have static terrain, so you precisely pre-calculate what geometry is hidden from different regions and fields of view and bake that information into the world. It's quite slow to do properly, though. MC generates dynamic terrain, so this is a dumber, faster version of that: less accurate at identifying hidden geometry, but fast enough to calculate when the world is generated and to recalculate when the terrain or field of view changes. The trick is finding the best heuristic for the geometry to maximize hidden-surface detection without any false positives (since any missing geometry is probably going to be pretty noticeable).
This is a very nice couple of blog posts, the effort put into the javascript algorithm demos is quite impressive. I enjoyed reading them, thanks to Tommo and Peter Cooper for the post.
I would have thought the ravine problem could maybe be solved by using heuristics to detect when the search appears to be going down a ravine, and then switching to raycasting.
How many polys does a Minecraft scene normally run to? I don't know if they already do this, but for my mobile game I run a depth pre-pass, where I render depth only with a no-op shader to build a full depth buffer. That means all fragments behind the stored depth get discarded, and you only run the fragment shader (the expensive part) for the pixels which are actually drawn (a rough sketch of the two passes is after this comment).
This gives me good frame rates (30fps+) on a Nexus 4 for reasonably detailed geometry (including large procedural terrain), whilst still being able to run various post-process effects (SSAO, bloom).
It's possible that the minecraft scenes are that much more complex though.
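For reference, a rough sketch of the depth pre-pass described above, with assumed placeholder types (Scene, Shader) and LWJGL 3-style GL11 calls: pass 1 fills only the depth buffer with a trivial shader, pass 2 re-renders with the expensive shader under GL_EQUAL depth testing, so (with early depth testing) hidden fragments are rejected before the expensive shading runs.

    import org.lwjgl.opengl.GL11;

    class DepthPrepassRenderer {
        interface Scene {
            void drawGeometry();  // submits the same geometry in both passes
        }
        interface Shader {
            void bind();
        }

        void render(Scene scene, Shader depthOnlyShader, Shader expensiveShader) {
            // Pass 1: depth only. No colour writes, cheapest possible fragment work.
            GL11.glColorMask(false, false, false, false);
            GL11.glDepthMask(true);
            GL11.glDepthFunc(GL11.GL_LESS);
            depthOnlyShader.bind();
            scene.drawGeometry();

            // Pass 2: colour only. The depth buffer is already final, so test with
            // GL_EQUAL and skip depth writes; occluded fragments fail the test.
            GL11.glColorMask(true, true, true, true);
            GL11.glDepthMask(false);
            GL11.glDepthFunc(GL11.GL_EQUAL);
            expensiveShader.bind();
            scene.drawGeometry();

            // restore defaults for subsequent draws
            GL11.glDepthMask(true);
            GL11.glDepthFunc(GL11.GL_LESS);
        }
    }

The cost, as noted in the reply below, is that all geometry goes through the vertex shader twice, so it only pays off when fragment shading dominates.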
Hey, Tommaso here.
I've thought about using a depth prepass; however, that really only helps if you have heavy fragment shaders, at the cost of basically doubling the vertex shader load.
Given that our terrain shader is a one-liner that returns tex*color, and that we have A LOT of vertices, it's not very convenient.
The poly count at max render distance (224 blocks) can be anywhere between 300K polygons in plains and 900K in jungles, minus the savings from the culling described in the post. Apart from jungles, which kill everything, the major bottleneck is now alpha testing, and most devices will run at 60fps if you turn that off.
Hello Tommaso,
I developed something VERY similar to that algorithm back in 2007 (C++, software rendering), and then again in 2012 (in Java for Android, with GLES).
I got some of your issues fixed back then. Ravines, for instance. You might want to take a look:
Thanks for the reply. That makes sense; my scenes are probably 100-200k, but my fragment shaders are much more complex so the depth pre-pass makes a huge difference.
It's not often that an HN item's title is changed away from the article's official title, and rarer still that the original title was the more descriptive one. It seems odd without the Minecraft reference that it so hinges on.
So is the minecraft codebase terrible? I would guess it is, based on the classes of bizarre bugs that seem to exist. I could almost believe Bethesda had a hand in its development.
I'm guessing Mojang is worth at least a billion dollars because they make the second most popular computer game in history; I'll take that kind of "terrible" any day of the week and consider myself blithely lucky.
There are some very terrible coding decisions made within the minecraft community.
The open source server (which I believe is based on the original server) actually doesn't use function names that describe what the function does; instead, functions are given names like "a", "b", "c", "d" in alphabetical order.
The upstream minecraft server is a JAR that was run through a Java bytecode obfuscator, which mostly works by renaming all functions and fields with names like that. I'm not aware of any open-source server built with such unreadable naming from the start, but there are definitely _decompiled_ versions floating around. Since the original JAR was obfuscated, these decompiled versions are only partially deobfuscated, if at all, hence the unreadable names.
This sounds very similar to http://en.wikipedia.org/wiki/Potentially_visible_set