Blockopedia

Joe Bolder at Egmont sent me a copy of their book “Blockopedia”. The book came in a nice box:

Blockopedia Box

The design of the book is interesting: it fits awkwardly on a bookshelf without its companion box.

Blockopedia

Joe sent the book as thanks for my work on Chunky. Nearly all pages of the book have a nice background image rendered with Chunky:

Blockopedia Inside

It is really awesome to know that Chunky renders are used in real books. I’ve previously noticed images in other books that were clearly rendered with Chunky, for example in the Minecraft handbook series. Those seem to sell pretty well, because I have seen them in several bookstores. The Minecraft handbooks are even sold in my local grocery store!

I’m sure the Blockopedia book will do well too, and the images in it might be seen by millions of people. To me that is a very cool thought!

Analyzing Traffic Spikes

The current Chunky documentation site has been live since June. I installed Google Analytics on it for fun. When I looked at the Analytics overview a couple of days ago I noticed that there had been two very large traffic spikes, on October 10 and 19:

Traffic Spikes

You can see a more normal period before and after the two spikes in the graph above. The number of visitors during the spikes was orders of magnitude greater than during a normal period. I was immediately curious about where the extra visitors came from. When I looked at the referral traffic, i.e. traffic caused by people clicking links on other sites, I saw only the second spike:

Referral Traffic

Although minecraftforum is the largest total referral source in the above image, when I narrowed the time period to just around the 19th and 20th, Kotaku was clearly the largest traffic source. In Google Analytics you can see even more detailed reports about which page a referral came from, and using this I found a Kotaku article about a beautiful Chunky render that was posted to the Minecraft subreddit on October 19.

So that explained the second traffic spike, but what about the first? There was only a slight hump in the referral graph at the 10th of October, but a large spike in direct traffic, i.e. many people just typed the address into their browsers. Here is the graph of direct traffic:

Direct Traffic

I went back to looking at the few referrals and saw that there were slightly more referrals from YouTube than normal. Then I realized that if someone sees, or hears, a URL in a video they might type the address directly into their browser rather than check the video description for a link. I could not see referrals from a particular video (the video ID parameter seems to be filtered out by Google Analytics), but I did see referrals from the channel of a YouTube user named alexelcapo. I went to alexelcapo’s YouTube page and found a video posted on October 10 which seems to be a tutorial for using Chunky. I had never heard of alexelcapo, but they have nearly 1 million subscribers, so it seems plausible that their video caused the October 10 traffic spike.

Chunky 2014 Fall Update

It has been a few months since the last Chunky post on this blog, and since I have been referring people interested in the development process to this blog, I thought it was time to give an update on the current status of Chunky.

Some of you may already be aware that I am a PhD student. This is, in practice, a more-than-full-time occupation. I work on Chunky mostly during my summer vacation, and a bit during Christmas. I can’t let Chunky interfere with my work, and that is the main reason development has been on hiatus since the summer.

With that in mind, I have some short-term goals for Chunky that I hope to reach when I have time to tinker with it again. First, entity rendering should be added in a snapshot release. I currently have partial entity support in my development branch. The rendering pipeline and scene graph are in need of significant refactoring in order to implement entities in a nicer way, so it might require a lot of work.

During the summer I worked on some improvements to sky rendering, but there is more that can be done in that area. I would like to improve the fog/atmosphere effects. This also ties into the rendering pipeline refactoring: the new 3D cloud feature was hampered by a really bad rendering pipeline architecture.

In the distant future I would like to add an entity posing tool and a material editor. These are large components that would take a lot of work to design and implement but I think they would be really great additions to Chunky.

Of course, the often-requested GPU rendering support would also be fun to continue working on.

I don’t know when I can start working on the entity support again. My work is taking up much of my “free time” at the moment, but I might be able to do some Chunky coding around Christmas, hopefully!

A Short Build Script is a Happy Build Script

Back in ye olde days I used GNU Make to build most of my projects, that is, if I used a build script at all. GNU Make, for all of its inscrutable syntax and unexpected complexity, is still a stable and useful tool if I just want to build a small project. A nice minimal Makefile looks like this:

CC=gcc
main: main.c
    $(CC) main.c -o main

That doesn’t look too bad, right? Don’t worry, some people have managed to write truly frightening Makefiles using this initially friendly-looking system. As evidence I present to you an image of the latex-makefile:

The huge Latex-makefile

The Makefile above contains over four thousand lines of code. Sure, much of that is comments, but that is still a big whopping load of lines, and a lot of complexity, for something that should be simple. Maybe all of that logic should really be in a plugin to the build system. Then everyone could build LaTeX projects without reproducing the work of the above monstrosity!

Apache Ant

When I started working with larger Java projects I was introduced to Apache Ant. This was the way I learned to build Java projects and I didn’t put much thought into it, initially. A simple build.xml for an Ant build might look like this:

<project name="foo" default="build">
    <target name="build" description="compile Java sources">
        <javac srcdir="src" destdir="bin"/>
    </target>
</project>

Again, a trivial build requires only a simple build script. Unfortunately, Ant build scripts are written in XML. This was a horrible idea, and one of the major flaws of Ant. XML seems great to developers (well, some developers) because it is easy to parse and serialize, so they are happy to use it in their applications. The tragedy, however, is that XML is also human-readable, so it gets used in situations where someone will have to edit that awful syntax by hand.

Anyhow, Ant seemed limited at first, yet it is possible to bend the rules and put some logic into your build, even though it seems like an Ant script should contain no logic. This was both a blessing and a curse: a blessing because it meant I could use Ant for all kinds of stuff, and a curse because it meant that I used Ant for all kinds of stuff where I would have been better off with a better build system.

One of the projects I have been working on had the following Ant scripts:

Ant Scripts and configuration

Considering the size of this particular Java project, these build scripts are not too bad really. Other Ant builds for projects of similar size could easily be twice as large.

The project above has four variants: there is one main script (the left one) and one script for each variant (plus a small configuration file). All variants are quite similar, and there is much duplicated code between their scripts. Although the main build script is relatively simple, I think the build was overall too verbose and too complex.

Apache Maven

Maven is a build system and project description model that has some nice ideas:

  • How to build a project is implicitly derived from the specification of the project
  • The build system should manage dependencies
  • Build-by-convention is great (assume default configurations for source location and output directory, etc.)

In general I think it was a good idea to try to unify project description and build system, but Maven took it a bit too far. Regrettably, Maven also uses XML. The flaws of XML are even more apparent in Maven’s project object model files (pom.xml), since they often contain large chunks of boilerplate.
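
To illustrate, here is a minimal sketch of a pom.xml (the group and artifact names are placeholders, not from any real project); even a project with no dependencies or custom configuration carries this much ceremony:

<project xmlns="http://maven.apache.org/POM/4.0.0">
    <!-- These elements are required even for the simplest project. -->
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>foo</artifactId>
    <version>1.0</version>
</project>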

I have never really used Maven, though the benefits over Ant are very clear. Letting the build system manage dependencies means you can remove the bundled library jars from your repository, which feels super tidy and neat, and build-by-convention means your simplest projects just work!

Gradle

When I started using Gradle it was amazing: this build system gets almost everything right. Finally, a well-maintained build system for Java that does not use XML!!!

Let’s look at a simple build script again:

apply plugin: 'java'

That’s it! Well, you will have to follow the conventions and put your code where Gradle expects it (src/main/java by default), but then it will just work.

So what happens for larger projects? I converted the build system for my Ant example above to use Gradle, and this is what I ended up with:

Gradle build

The left-most part of the image shows the build script, and the other parts are just configuration files. To be able to simplify the build this much I use a plugin that I developed for building JastAdd projects. Most JastAdd projects are very similar, so the plugin will be reusable for other projects of this kind.

Bottom Line

I won’t talk more about the benefits of Gradle here or why I have started to love it so much; I just want to focus on pure code size. Obviously my Ant scripts are larger, but why is that bad? In general, more lines of code means:

  • It is more difficult to navigate the code.
  • Developers who are unfamiliar with the project will have a harder time reading and understanding the build script.
  • More complex constructs are probably used to circumvent limitations in the build system. For example, Ant supports conditional execution and loops, but only with a lot of trickery (see the sketch after this list).
  • The script becomes harder to maintain because of such difficult constructs.
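
To make the Ant trickery point concrete, here is a hedged sketch of the usual conditional-execution workaround (the target and property names are made up): a condition must first be evaluated into a property by one target, and a second target then tests that property with an if attribute.

<!-- Hypothetical sketch of conditional execution in Ant. -->
<target name="check-os">
    <condition property="is.windows">
        <os family="windows"/>
    </condition>
</target>

<!-- Runs only when the is.windows property was set above. -->
<target name="build-windows" depends="check-os" if="is.windows">
    <echo message="Building for Windows"/>
</target>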

The complexity argument is not necessarily true: maybe the build system is just needlessly verbose, or maybe a very concise build script is harder to understand because it relies on more implicit behavior. I personally do not believe this. I believe that when there is a large difference in line count, the shorter solution is usually genuinely simpler and probably a lot easier to understand.

The three images above were all rendered with the same text size. I think it is quite fun to compare them and look at the patterns used in the three build systems. Looking at a zoomed-out view of source code is a surprisingly effective way to visualize its complexity.

LR Parser Generator

A few weeks ago I decided that I wanted to write a parser generator. This is one of those projects I’ve wanted to do for a long time but never got started on. When I finally started I decided that it would be fun to learn Scala at the same time, so I wrote the parser generator in Scala.

One of the things that motivated me to start writing a parser generator is a set of ideas I have about how parser generator diagnostic messages can be improved. Currently, LR parser generators provide only the most limited information about a shift-reduce or reduce-reduce conflict. For example, the Beaver parser generator can give you something like this:

Warning: Resolved Shift-Reduce conflict by selecting (PLUS: SHIFT; goto 3) over (PLUS: REDUCE E = E PLUS E) using precedence.

It can take a lot of time, even for an expert, to figure out what the problem is and how to fix the grammar to remove the conflict. I think this can be significantly improved.
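
To see why such a message is hard to act on, consider the kind of grammar that produces it. Here is a sketch of the classic ambiguous expression rule (the production E = E PLUS E is taken straight from the warning above):

E = E PLUS E
  | NUM
  ;

When the parser has matched E PLUS E and the next token is another PLUS, both actions are valid: shifting the PLUS eventually parses a + (b + c), while reducing E PLUS E to E first parses (a + b) + c. The warning reports which action was chosen, but the user still has to reconstruct this ambiguity from state numbers and productions.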

Currently my parser generator only generates an LR(0) parser, though my goal is to make it generate LR(1) parsers. The Scala code became surprisingly compact. For example, here is the main parse loop of the parser after the action and goto tables have been generated:

def parse(in: LookaheadBuffer[Symbol], goal: ItemSet) {
  val state = new scala.collection.mutable.Stack[ItemSet]
  val stack = new scala.collection.mutable.Stack[ParseTree]
  state push goal
  while (true) {
    while (reduceRules contains state.top) {
      val rule = reduceRules(state.top)
      val lhs = rule.lhs
      var children = List[ParseTree]()
      for (i <- 1 to rule.rhs.size) {
        state.pop
        children = children :+ stack.pop
      }
      stack push new ParseTree(lhs, children)
      state push map(state.top)(lhs) // goto: transition on the reduced nonterminal
    }
    val sym = if (!in.eof) in.pop else EOFToken
    if (sym != WhitespaceToken) { // skip whitespace
      val action = actions(state.top)(sym)
      action match {
        case Accept =>
          println("accept")
          stack.pop.printTree
          return
        case shift: Shift =>
          stack push new ParseTree(sym)
          state push shift.next
      }
    }
  }
}

I wouldn’t be surprised if there was some way to shave ten more lines off that loop.

Premature Optimization in Java

A few months ago I implemented the Möller-Trumbore ray-triangle intersection algorithm (trivia: one of its inventors, Tomas Möller, works in the graphics group at my CS department). My initial implementation looked like this (slightly edited):

// e1 and e2 are triangle edges, o is origin
// ray = input ray
Vec pvec = new Vec(0, 0, 0);
Vec qvec = new Vec(0, 0, 0);
Vec tvec = new Vec(0, 0, 0);

pvec.cross(ray.d, e2);
double det = e1.dot(pvec);
if (det > -EPSILON && det < EPSILON)
    return false;

tvec.sub(ray.o, o);

double u = tvec.dot(pvec) / det;

if (u < 0 || u > 1)
    return false;

qvec.cross(tvec, e1);

double v = ray.d.dot(qvec) / det;

if (v < 0 || (u+v) > 1)
    return false;

double t = e2.dot(qvec) / det;
return t < tNear;

After making sure the above was correct I started optimizing it. First I replaced the three divisions by det with a single computation of the reciprocal of det, followed by multiplications with that reciprocal. This trades three divisions for one division and three multiplications: if a division costs D and a multiplication costs M, the change pays off when 3D > D + 3M, i.e. when D > 1.5M. Floating point division is generally at least 1.5 times slower than floating point multiplication, so this should be an improvement.

Next, I removed the temporary Vec objects. This is where things get interesting. The idea is to avoid allocating these temporary objects on the heap and replace them with stack-allocated local variables. The code gets a lot bigger because you have to inline all the cross product and dot product code in the intersection algorithm. The inlined version looks like this:

double px=0,py=0,pz=0;
double qx=0,qy=0,qz=0;
double tx=0,ty=0,tz=0;

px = ray.d.y * e2.z - ray.d.z * e2.y;
py = ray.d.z * e2.x - ray.d.x * e2.z;
pz = ray.d.x * e2.y - ray.d.y * e2.x;
double det = e1.x*px + e1.y*py + e1.z*pz;
if (det > -EPSILON && det < EPSILON)
    return false;
double recip = 1 / det;

tx = ray.o.x - o.x;
ty = ray.o.y - o.y;
tz = ray.o.z - o.z;

double u = (tx*px + ty*py + tz*pz) * recip;

if (u < 0 || u > 1)
    return false;

qx = ty * e1.z - tz * e1.y;
qy = tz * e1.x - tx * e1.z;
qz = tx * e1.y - ty * e1.x;

double v = (ray.d.x*qx + ray.d.y*qy + ray.d.z*qz) * recip;

if (v < 0 || (u+v) > 1)
    return false;

double t = (e2.x*qx + e2.y*qy + e2.z*qz) * recip;
return t < tNear;

The code certainly became less readable, but is it faster? The answer may depend on the JVM you use to run the code, but in most cases the above will NOT be faster. It can even be slower than the non-inlined version.

The Java language has no built-in way to explicitly allocate objects on the stack, as C++ does. However, the Java runtime can still just-in-time optimize your code to avoid unnecessary heap allocations: the JIT compiler tries to detect objects that do not escape a method and allocates those on the stack instead (in HotSpot this is done via scalar replacement, where the object is dissolved into its primitive fields). An object does not escape a method if it is not returned from the method, not stored by reference in some other object, and not passed as an argument to some other method.
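
As an illustration, here is a minimal sketch (the helper method is hypothetical; Vec, cross, and dot are the vector operations from the snippets above). Small methods like cross and dot are inlined by the JIT first; after inlining, tmp provably never escapes and can be scalar-replaced:

// Hypothetical helper illustrating a non-escaping allocation.
static double crossDot(Vec a, Vec b, Vec c) {
    // After the JIT inlines cross() and dot(), tmp is never returned,
    // never stored in a field, and never passed to another method, so
    // escape analysis can replace it with three local doubles.
    Vec tmp = new Vec(0, 0, 0);
    tmp.cross(a, b);   // tmp = a x b
    return c.dot(tmp); // only a primitive leaves the method
}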

It can be difficult to figure out exactly when things are moved from heap to stack, but using the Caliper microbenchmark framework you can get a good idea. I benchmarked two versions of my intersection code against each other: one using manual inlining and one without it. It turned out that the manually inlined version was consistently slower, though not by much. The small difference in performance is probably caused by the JIT compiler happening to lay out the code differently in each case, incidentally giving slightly better code locality to the non-manually-inlined version.
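
For reference, a Caliper benchmark of this kind is just a class with timeXxx(int reps) methods. Here is a rough skeleton, assuming the pre-1.0 Caliper API; the two intersect helpers are placeholders standing in for the versions above, not real method names:

import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;

// Hypothetical benchmark skeleton. Each timeXxx method runs the
// measured code reps times; the results are accumulated and returned
// so that the JIT cannot eliminate the work as dead code.
public class IntersectBenchmark extends SimpleBenchmark {
    public double timeWithTempVecs(int reps) {
        double sum = 0;
        for (int i = 0; i < reps; ++i) {
            sum += intersectWithTempVecs(); // the first version above
        }
        return sum;
    }

    public double timeManuallyInlined(int reps) {
        double sum = 0;
        for (int i = 0; i < reps; ++i) {
            sum += intersectInlined(); // the manually inlined version
        }
        return sum;
    }

    public static void main(String[] args) {
        Runner.main(IntersectBenchmark.class, args);
    }
}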

I also benchmarked the versions using the reciprocal multiplication optimization I discussed above and found that using this technique was beneficial, as originally expected.