Git Metrics Tool 2018-12-19

I wrote a small Python program to extract Git commit metrics for a table in my thesis. There were some alternative tools available, but none of them gave exactly the output I was looking for. Instead of converting the output from another tool it seemed easy enough to roll my own solution, so here it is: git-commit-metrics.

My program generates TeX tables like the following one:

The above table was generated for the ExtendJ compiler repository.

Categories: Programming

The JavaFX Rewrite 2016-09-14

I have been working on rewriting the Chunky UI using JavaFX. After a few months of work I am nearly done with the rewrite and it has resulted in removing more than 7 thousand lines of code. That’s close to 15% of the total source code removed!

During the JavaFX rewrite I changed much more than just the UI code. In this post I will summarize the major things that I rewrote or refactored as part of the JavaFX rewrite.

Categories: Chunky Programming

My thoughts on Gradle 2016-06-20

A while ago I wrote about why I prefer short build scripts and I briefly mentioned Gradle, a build system for Java. Today I’d like to share more of my thoughts on Gradle. I’ve been using it for most of my projects in the last couple of years. I wanted to summarize my thoughts here both for personal reflection and future reference, and hopefully I’ll help you decide if you want to use Gradle too.

I’ll mostly be comparing Gradle to two other build systems for Java: Ant and Maven. From personal experience these are the two most common build systems for Open Source Java projects. It seems like Maven is more common nowadays, but Ant used to be the de facto standard for building Java projects. I think Gradle is growing in popularity, especially since it is part of the official Android toolchain.

Things I like about Gradle

I have mostly positive things to say about Gradle, but I also have some complaints or nits that I think could be improved. Let’s start with the stuff I like in Gradle.

Goodbye XML

My favorite feature of Gradle is that it uses Groovy as the language for build scripts. Java build systems have a long tradition of XML-based build scripts (Ant & Maven), and I’m incredibly happy to use something that is not XML-based. XML may be nice for technical reasons but it sucks so bad to hand-write XML.

Thanks to being Groovy-based, Gradle has a nice and sparse syntax and it is even possible to write general purpose Groovy code inside the build script, allowing complex actions and decisions to be integrated in the build.

General purpose code in the build

Writing general purpose code in the build script is very useful because it allows adding any of the following complex high-level behaviors to the build:

  • Complex one-off project-specific build tasks.
  • Complex up-to-date checks for build tasks.
  • Environment-dependent build configuration.

Gradle makes it very simple to construct custom build tasks, making it easy to handle build steps for which no plugin exists yet. In comparison, you’d have to write your own one-off plugin to do something similar with Ant or Maven, a much larger investment for something that you’ll likely only use for one project.

Custom up-to-date checks are very useful for making a build more incremental. Gradle already has nicer incremental build support and improved dependency handling compared to Ant, but when using custom build tasks it may be useful to write your own up-to-date checks to decide when a rebuild is necessary in order to maximize incrementality.
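
To make the idea of an up-to-date check concrete, here is a minimal sketch in Python 3 rather than Gradle's Groovy DSL. It uses a make-style timestamp comparison, which is an assumption chosen for illustration and not necessarily how Gradle decides internally. The function name is_up_to_date is hypothetical.

```python
import os


def is_up_to_date(inputs, outputs):
    """Return True if every output exists and is at least as new as every input.

    A missing input raises FileNotFoundError, which is deliberately not caught.
    """
    if not outputs or not all(os.path.exists(p) for p in outputs):
        return False  # nothing built yet, so a rebuild is needed
    newest_input = max((os.path.getmtime(p) for p in inputs), default=0.0)
    oldest_output = min(os.path.getmtime(p) for p in outputs)
    return oldest_output >= newest_input
```

A build step would call this first and skip its work when it returns True.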

Finally, I mentioned environment-dependent build configuration. This is more of a niche use case but one that I have found very useful in my projects. I usually generate version names using git describe, which gives nice incremental version names for Git projects. To generate such version strings in Gradle I use a common snippet of general-purpose code that I add in my build scripts. It looks something like this (edited for brevity):

// Runs 'git describe' and returns the output as a string.
def getVersion() {
    def output = tryCommand(['git', 'describe'])
    // Update resource files...
    output.readLines()[0] // Return the first line of output.
}

project.version = getVersion()

The getVersion() function above runs git describe to generate a version string. The function also stores the version string in a property file and does some error handling in case the git describe command fails.
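
For readers who want to see the same logic outside a build script, here is a rough Python 3 sketch of running git describe with a fallback when the command fails. The name describe_version, the injectable run parameter, and the "unknown" fallback are all assumptions for illustration; the property-file step is left out.

```python
import subprocess


def describe_version(run=subprocess.run, fallback="unknown"):
    """Return the first line printed by 'git describe', or a fallback string."""
    try:
        result = run(["git", "describe"],
                     capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or it exited with an error
        # (for example when the repository has no tags).
        return fallback
    lines = result.stdout.splitlines()
    return lines[0] if lines else fallback
```

Only the two failures that are expected here are caught; anything else still propagates.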

Build by convention

Build by convention is not a new idea. I believe the concept was popularized by Maven, and Gradle has adopted the same conventions. Briefly, the idea of build by convention is this: if source files are stored in standard directories then you do not have to configure them in the build script.

Java projects use the following conventions: Java source files are stored in src/main/java, and test files are stored in src/test/java. If you follow this convention you’ll only need to use the Java plugin in Gradle to set up a new Java project:

apply plugin: 'java'

That’s it, a complete build script for a simple Java project. Getting started with your next project should be at least this simple.

Maven dependencies

In my eyes the best thing Maven gave the Open Source Java community was the Maven Central Repository. It’s a database of Java libraries which can be downloaded by Maven when building a project. If you don’t want to keep large Jar files in your repository you can add dependencies to the Central Repository and rely on Maven to fetch the libraries when the project is built.

The Maven Central Repository became hugely popular for Open Source Java projects, with most projects favoring the Central Repository over storing libraries in their own source repository. Support for Maven dependencies has since spread to many other build systems that can build Java or Scala applications.

Gradle also supports Maven dependencies natively. With just a few extra lines in your build script you can build with a Maven library:

apply plugin: 'java'
apply plugin: 'maven'
repositories {
    mavenCentral()
}
dependencies {
    testCompile 'junit:junit:4.12'
}

The Maven Central Repository has become an invaluable tool for simplifying Open Source development: it reduces the manual work needed to set up a new project using Open Source libraries. Many useful plugins for Gradle itself are even hosted in the Central Repository.

I love using Maven dependencies in Gradle, and it really feels like I’ve emerged from the dark ages of using Ant into the glorious future of Java building with Gradle using Maven dependencies.

Build daemon

Since a few releases back, Gradle comes with a daemon that runs in the background – it’s like a little build server that runs on your machine and waits for build commands. The Gradle daemon is a huge win for Gradle users, because without it, Gradle has a pretty long cold-start time – very noticeable and somewhat annoying. When using the daemon my builds start instantly, and the build overall goes much faster!

Before the Gradle daemon was released my Ant builds used to be much faster than my Gradle builds, but now my Gradle builds are faster than my Ant builds.

Portable build

In 2014 I was collaborating remotely with a professor in Germany and implementing a demonstration compiler for a language feature which the professor had invented. I wrote the compiler in Java, using JastAdd, and I was working on Linux. The professor was working on Windows, and didn’t have Gradle installed.

When I sent my code to the professor so he could try it, I had to make it as simple as possible to compile and run the code to remove any friction and not waste his time. Gradle to the rescue! With Gradle you can generate scripts that bootstrap the build system on most Windows or Linux/Mac platforms. It worked flawlessly, and really helped in the collaboration.

These cross-platform bootstrap scripts are called wrapper scripts. They download Gradle and make it run and build your project. These wrapper scripts are excellent for sharing a project with people who may not have Gradle installed, or who may have an older version of Gradle that can’t build the project.

Things that could be improved

I do like Gradle a lot, but there are a couple issues that I think could be improved.

Obscured functionality

Although I’ve had some positive things to say about the brevity of Gradle build scripts above, I think the terseness can be problematic in some cases. The main issue I have with the way Gradle scripts work is that they hide so much of what’s going on under the surface. It takes a long time to grok what really happens, at least if you learn Gradle mostly by using it rather than by reading the documentation – my preferred way of stumbling my way through new tech.

One example of an obtuse construct is appending actions to a build task. For example, you could declare a build task like this:

task buildHouse << { doWork() }

Alternatively, you could accomplish the same thing like this:

task buildHouse {
    doLast { doWork() }
}

The second form explains much more about what is happening. The first form relies on you knowing what the << operator stands for. I quickly learned what it meant, but nonetheless there was a period where I thought << was a thing you always had to use. Later on I learned that there existed doLast and when I knew that doLast existed, it was easy to discover doFirst.

Using default behavior is a double-edged sword – it makes everything very compact and neat for the experienced user, but it hinders discoverability for new users.

Gradle hides a lot of other defaults under the surface. When you get started you don’t have a clue about what is really happening and often you don’t need to know, but when you want to start making more advanced build rules and defining your own up-to-date checks you really need to know what’s happening and in what order things are happening. One gotcha I learned is that anything enclosed in braces is a closure, and closures can be used for lazy evaluation, thereby solving some issues caused by inadvertently using configuration variables before they have been properly configured.
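
The lazy-evaluation point is not specific to Groovy. Here is a small Python 3 sketch of the difference between reading a configuration value eagerly and wrapping the read in a closure. The config dictionary and the names in demo() are made up for illustration.

```python
def demo():
    config = {}  # stands in for build configuration that is filled in later

    # Eager: the value is read right now, before it has been configured.
    eager_name = "app-" + config.get("version", "unset")

    # Lazy: the closure reads the value only when it is called.
    def lazy_name():
        return "app-" + config.get("version", "unset")

    config["version"] = "1.0"  # configuration happens after the declarations
    return eager_name, lazy_name()


print(demo())  # ('app-unset', 'app-1.0')
```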

Learning more advanced Gradle usage and writing my own plugins felt needlessly difficult due to obscure semantics. Part of this obscurity is inherited from Groovy which in itself has some pretty complicated invisible behavior, such as the name lookup / delegation semantics.

Multiple configuration files

The main build script in Gradle is named build.gradle, but there are also two other files involved that you will have to edit when you start making more advanced builds: settings.gradle,

It would be nice if everything project-specific could be consolidated into one file. Sometimes there is a need for non-project-specific configuration, but that should just require one extra file.

No implicit dependency on the build script

A minor issue I have with Gradle is that it does not track changes to the build script itself. If I edit a build task it would be reasonable to re-run that task on the next build, because the edits probably have affected the output of the build task.

Stupid logo

My last complaint is very non-technical and petty, but I think Gradle has a stupid logo. Gradle’s logo depicts an angry elephant. Why have an angry elephant as your logo when your motto is “Build happiness”? Gradle have even addressed the issue on their own blog, acknowledging that people will find it confusing: “You might find it a bit odd to see him swearing if our motto is ‘Build Happiness’”.

You are right, it is odd; it doesn’t make sense. They go on to say that developers are often frustrated with their builds, and they call this phenomenon “Build Hell”, and the elephant is angry at Build Hell.

However, when I see that logo, I am reminded of unhappy moments caused by frustrating build problems. Do you want me to associate Gradle with Build Hell? You should want me to associate it with happiness. The elephant should be happy.

If you know that the logo is confusing, then the logo is bad. Logos must be as unconfusing as possible, otherwise they fail at their job of communicating your goals. Trying to be clever doesn’t work: if it’s not intuitive nobody gets your message.


Categories: Programming

Coding Zen: Don’t Catch Exceptions 2016-05-12

Usually the right thing to do with exceptions is to not catch them. This may seem counter-intuitive, so I will try to describe why I think it is a sound idea.

Consider this piece of Python code:

try:
    work()
except:
    print "Unhandled exception"

The code above prints an error message and halts execution if an exception occurs. This is a pretty common pattern; after all, it is usually not fun to have stack traces printed on the console, which is the default behaviour when an exception goes uncaught. For example, here is the stack trace Python prints for the exception thrown by pressing Ctrl+C:

^CTraceback (most recent call last):
  File "", line 6, in <module>
  File "", line 4, in work

The stack trace looks scary, like something broke horribly. For that reason we don’t want to expose stack traces to users, so it’s common to avoid the stack trace by using the previous code pattern to catch all exceptions.

I will now try to show why the above pattern, and the idea of eagerly catching exceptions, is harmful.

Obscuring error causes

When an exception is caught the responsibility to inform the user of what went wrong is on the code that catches the exception. If the catching code prints a generic error message we won’t know what caused the error. The worst thing you can do is to not print anything at all – that will leave the user not knowing if anything went wrong and that leads to a more frustrating experience than if they clearly know something went wrong.

Even if the exception catching code prints an error, you have to include enough information to be able to debug the error later. If your code does a lot of stuff which may cause exceptions, how do you know which part caused the program to stop and print “Unhandled exception”?

Catching exceptions with too large nets causes problems by obscuring the cause of errors in the code. The code below is harder than necessary to debug:

try:
    work()
except Exception as e:
    print e

Even though we print the exception it may be difficult to track down the cause. The print statement only prints the exception message, not the location where it was thrown. For example, if there is a divide-by-zero error somewhere in work(), or the code called by work(), then the above code prints out integer division or modulo by zero. That’s not very nice, especially when you get this in a bug report from a user. How will you track down the location? It was probably a bug that rarely triggers, under conditions that don’t happen in your test environment. It would be much more useful to get the stack trace which clearly includes the location of the error:

Traceback (most recent call last):
  File "", line 8, in <module>
  File "", line 4, in work
ZeroDivisionError: integer division or modulo by zero

If your program will be run by non-technical users it might be a good idea to write the stack trace to a file which they can attach in a bug report. Catching exceptions so that they are logged with the stack trace is a fine approach, but the key idea here is that exceptions should be allowed to propagate up to a handler that does more than just say “something went wrong”.
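
Here is a minimal sketch of such a top-level handler in Python 3 (the examples above use Python 2 print statements). It writes the full stack trace to a file and shows the user only a short message. The function name run_and_log and the default file name crash.log are assumptions for illustration.

```python
import traceback


def run_and_log(work, log_path="crash.log"):
    """Run work(); on failure, append the full stack trace to log_path."""
    try:
        work()
    except Exception:
        # Exception does not cover KeyboardInterrupt, so Ctrl+C is unaffected.
        with open(log_path, "a") as log:
            traceback.print_exc(file=log)
        print("Something went wrong; details were written to", log_path)
        raise SystemExit(1)
```

The user sees one friendly line, while the log file keeps the location of the error for the bug report.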

Don’t catch all exceptions

In most cases, it is not worth catching exceptions. Rather than having to convince yourself that you should not catch an exception, you should convince yourself that you should catch it. This mindset is important even in languages that enforce stricter handling of exceptions, like Java:

private void work() throws IOException { ... }
void doSomething() { work(); }

Should the doSomething() method above declare IOException to be thrown, or should it handle the exception? The compiler forces you to pick one. I would argue that you should let the exception be thrown, and not catch it, unless you have a good reason to catch the exception.

Which suggested fix would you choose?

In Java, unfortunately, programmers are conditioned to handle the exception locally because adding a new exception to a method signature often means that we just have to make the same decision in potentially multiple other locations in the code. Thus, sloppy exception handling is common in Java. I’ve often seen code that just adds a catch clause to get rid of a compile error.

I have had to solve many bugs in Java code, and some of those bugs were needlessly obscured by over-eager exception catching. The problem is not exclusive to Java, but I think it is more common in Java than Python.

When to handle exceptions

There do exist good reasons to catch exceptions; I just mean that we should convince ourselves that we really need to catch the exception before we blindly add code to catch it. Writing code without thinking it through is pointless busywork, and often leads to stupid mistakes. I notice myself falling into this trap constantly.

One good reason to catch an exception is if the operation that caused the exception should be retried. This is only feasible for certain combinations of operations and exceptions, so in those cases only specific exceptions should be caught.

Two examples of operations that can/should be retried are:

  • Downloading a file when a network error occurs.
  • Asking a user for a password when the one entered was incorrect.
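
A retry of this kind can be sketched in Python 3 as follows. The helper name retry and its parameters are hypothetical. Only the exception type passed in is caught, and everything else propagates untouched.

```python
import time


def retry(operation, retry_on, attempts=3, delay=0.0):
    """Call operation() until it succeeds, retrying only on retry_on."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            time.sleep(delay)
```

For a download one might call retry(download, ConnectionError), where download is whatever function performs the request.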

Another good reason to catch an exception is if the exception is benign or if the operation was an optional step that is safe to skip. Again, it is important to make sure that only the specific exception to be suppressed is caught.
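
As a small Python 3 illustration of a benign exception, consider cleaning up a temporary file that may already be gone. The function name remove_if_present is made up for this sketch.

```python
import os


def remove_if_present(path):
    """Delete path; return False if it was already missing."""
    try:
        os.remove(path)
    except FileNotFoundError:
        # Benign: the goal was for the file not to exist, and it doesn't.
        return False
    return True
```

Other failures, such as a permission error, are not caught and still surface.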

It is not a good idea to catch more exceptions than needed, and the worst case is to indiscriminately catch all exceptions. Catch only the ones that you know you want to handle!

Let’s go back to the first example with the Python code and for instance say that we want to not print the stack trace when the user hits Ctrl+C – the program should just silently exit. That’s fine, but keep in mind that it is important to ensure that we don’t hide serious errors by catching too many exceptions. The following code solves the problem by catching only KeyboardInterrupt exceptions while letting other exceptions pass through:

try:
    work()
except KeyboardInterrupt:
    pass  # Exit silently when the user presses Ctrl+C.

Providing additional info

It is sometimes useful to provide some additional context for an exception by catching it and then re-throwing it. For example:

try:
    work()
except:
    print "Something went wrong while doing the work"
    raise

This is fine; the exception handler may even provide a useful bit of contextual information that can help later debugging. I don’t really think of this as catching an exception though, since it is re-thrown at the end of the exception handling code.
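
In Python 3 the same idea can be expressed with exception chaining, which attaches the context while keeping the original exception and its stack trace. The sketch below is only an illustration; WorkError and load_settings are hypothetical names.

```python
class WorkError(Exception):
    """Raised when a unit of work fails; wraps the underlying cause."""


def load_settings(path):
    """Read a settings file, adding context if it cannot be opened."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        # 'from e' keeps the original exception as __cause__, so the
        # printed stack trace shows both the context and the root error.
        raise WorkError(f"could not load settings from {path!r}") from e
```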

The Zen

The final Zen of the matter with exceptions is that the default should be to not handle exceptions, and only catch the ones that there is a specific reason to catch.

Not catching exceptions can simplify your code. It removes a few lines for exception handlers, simplifies the control flow, and gives peace of mind that some bugs are not needlessly obscured.

Categories: Programming

LR Parser Generator 2014-10-11

A few weeks ago I decided that I wanted to write a parser generator. This is one of those projects I’ve wanted to do for a long time but never got started on. When I finally started I decided that it would be fun to learn Scala at the same time, so I wrote the parser generator in Scala.

One of the things that motivated me to get started writing a parser generator is some ideas I have about how parser generator diagnostic messages can be improved. Currently, LR parser generators provide only the most limited information about a shift-reduce or reduce-reduce conflict. For example, the Beaver parser generator can give you something like this:

Warning: Resolved Shift-Reduce conflict by selecting (PLUS: SHIFT; goto 3) over (PLUS: REDUCE E = E PLUS E) using precedence.

It can take a lot of time, even for an expert, to figure out what the problem is and how to fix the grammar to remove the conflict. I think this can be significantly improved.

Currently my parser generator only generates an LR(0) parser, though my goal is to make it generate LR(1) parsers. The Scala code became surprisingly compact. For example, here is the main parse loop of the parser after the action and goto tables have been generated:

def parse(in: LookaheadBuffer[Symbol], goal: ItemSet) {
  val state = new scala.collection.mutable.Stack[ItemSet]
  val stack = new scala.collection.mutable.Stack[ParseTree]
  state push goal
  while (true) {
    while (reduceRules contains {
      val rule = reduceRules(
      val lhs = rule.lhs
      var children = List[ParseTree]()
      for (i <- 1 to rule.rhs.size) {
        children = children :+ stack.pop
      stack push new ParseTree(lhs, children)
      state push map( goto
    val sym = if (!in.eof) in.pop else EOFToken
    if (sym != WhitespaceToken) {// skip whitespace
      val action = actions(
      action match {
        case Accept =>
        case shift: Shift =>
          stack push new ParseTree(sym)
          state push

I wouldn’t be surprised if there was some way to shave ten more lines off that loop.
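
For a self-contained picture of what a table-driven LR(0) loop of this shape does, here is a rough Python 3 sketch. It is not the generator's code. The tables are written by hand for a toy grammar, E -> E "+" "n" | "n", and every name in it is an assumption made for illustration.

```python
# Hand-written tables for the toy grammar  E -> E "+" "n" | "n"
# (a generator would compute these from the grammar).
RULES = {1: ("E", 3), 2: ("E", 1)}   # rule number -> (lhs, length of rhs)
REDUCE = {2: 2, 4: 1}                 # state -> rule to reduce by
SHIFT = {(0, "n"): 2, (1, "+"): 3, (3, "n"): 4}  # (state, token) -> state
GOTO = {(0, "E"): 1}                  # (state, nonterminal) -> state
ACCEPT = (1, "$")                     # accept on end of input in state 1


def parse(tokens):
    """Parse a list of tokens and return the parse tree as nested tuples."""
    tokens = list(tokens) + ["$"]     # "$" marks the end of input
    states = [0]                      # stack of parser states
    trees = []                        # stack of parse trees built so far
    pos = 0
    while True:
        # LR(0): a state holding a completed rule reduces without lookahead.
        while states[-1] in REDUCE:
            lhs, size = RULES[REDUCE[states[-1]]]
            children = trees[-size:]
            del trees[-size:]
            del states[-size:]
            trees.append((lhs, children))
            states.append(GOTO[(states[-1], lhs)])
        sym = tokens[pos]
        pos += 1
        if (states[-1], sym) == ACCEPT:
            return trees[0]
        if (states[-1], sym) not in SHIFT:
            raise SyntaxError(f"unexpected {sym!r} in state {states[-1]}")
        trees.append(sym)
        states.append(SHIFT[(states[-1], sym)])


print(parse(["n", "+", "n"]))  # ('E', [('E', ['n']), '+', 'n'])
```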

Categories: Programming Scala

Rendering Bugs 2014-04-14

It’s not supposed to look like this:

Rendering bug

Categories: Programming