How I Wrote my Thesis 2019-01-10

Next week I will defend my thesis, “Contributions to Declarative Implementation of Static Program Analysis”. A PDF of the thesis can be downloaded here. In this post I will describe the writing process and the tools I used to write my thesis.

What is it about?

My research contributions are all related to the development of static program analysis with reference attribute grammars. Among the main contributions are new algorithms for concurrent evaluation of circular reference attributes.

With reference attribute grammars it is possible to construct highly extensible compilers. I have been working for the past five years on a Java compiler named ExtendJ, and several of my research contributions have been implemented as extensions to ExtendJ.

How long did it take?

I spent about half of last year working on my thesis. However, I probably only spent about three months of effective writing time on the thesis. The rest of my time was spent on various related tasks like searching and reading related work, reviewing drafts, drawing graphs, administrative tasks, and procrastination. Writing ramped up toward the end of the writing period. The last couple of weeks leading up to the end, I slept poorly and did nearly no other activities than just writing and reading through my drafts multiple times.

My thesis is a collection of previously published papers. The thesis starts with an introductory chapter which gives a background to several topics discussed in the thesis as well as describing my research contributions. The rest of the thesis consists of the included papers and finally a short popular science summary in Swedish. The majority of my writing work went into the introductory chapter. Although I did not edit the text of the included papers I had to adjust tables, figures, and paragraph layout to make them fit in the book format of the thesis. For one of the papers I replaces four raster images with TikZ graphics.

LaTeX Makefile and automation

I wrote my thesis in LaTeX on Ubuntu. I used Chris Monson’s amazing Latex Makefile to compile my LaTeX code. This makefile was indispensable because it filters out all irrelevant output from LaTeX and only shows warnings and errors in the console, for example:

I had to make a small edit in the makefile so that it could work with Biber, but otherwise it worked very smoothly. To further make my LaTeX editing as smooth as possible I used entr to automatically recompile when I saved the LaTeX code.

The default document viewer on Ubuntu (Evince) automatically reloads a PDF if the file is changed on disk. When I made any change to the LaTeX code and saved the file, entr would automatically run the makefile and after a short delay the document viewer reloaded the PDF so that I could see the effect of my changes. I had the document viewer on one display and my LaTeX code on the other. My setup worked similarly to Overleaf, except that I edited the code in Vim and had full control of my own files and didn’t rely on a network connection.

TikZ and Forest

I spent a considerable amount of time designing figures and diagrams for the thesis. I wanted to mainly use TikZ to draw my diagrams. TikZ is a LaTeX package which can be used to draw figures directly in a TeX document as vector graphics. There are several advantages of using TikZ which motivated me to use it for the figures:

  • TikZ figures are scalable to any size without pixelation.
  • TikZ figures typically use relative positioning for specifying the locations of objects. When something changes size, the rest of the figure will be adjusted to maintain the relative distances. This is very useful when editing the text in the blocks in a flowchart, for example.
  • Text in TikZ figures is written with normal TeX code. This both makes the fonts in figures automatically consistent with the rest of the document and also allows the use of mathematical equations or any other kind of TeX formatting.

The major disadvantage of using TikZ is that figures are designed within the TeX language using an embedded DSL for describing diagrams. It can take a very long time to make a nice figure with TikZ. For example, it took three days for me to replace four raster images in one of the included papers by TikZ figures.

Illustrating trees is especially onerous in TikZ. However, I found the supremely useful Forest package which adds a DSL on top of TikZ for describing trees. With the Forest package it was very easy to draw nice trees for my thesis. The code for the trees was fairly minimal in most cases. For example, here is a small tree in Forest code:

\begin{forest}
[Program
  [Function
    [Type]
    [ID]
    [{Parameter$\ast$}]
    [Block
      [IfStmt
        [Expr]
        [Call]
        [Call]
      ]
    ]
  ]
]
\end{forest}

Here is the resulting figure from the above code:

The trees generated by Forest are TikZ objects and can be integrated into regular TikZ drawings with little effort. I used this when drawing arrows between nodes in a tree, among other things.

If you want to look at the TikZ code for the figures, the TeX code for my thesis is on my GitHub page. Some figures in the included papers are EPS or SVG instead of TikZ.

Conclusions

I am very happy to finally have finished my thesis. For the past half year or so I have largely neglected friends and family to focus on the thesis. I also constantly had a bad conscience about not progressing fast enough with the writing.

At the start of my PhD studies I felt like I would not have enough research results to write a full thesis. Doing good research is hard, and there is much more to it than just coming up with a new idea and implementing it. Implementing a new algorithm is the easy part, the hard part is to describe and evaluate it so that it becomes useful for other researchers.

Although I feel that there are some parts of my thesis which could have been improved, especially in the earlier papers, I am very pleased with it as is. I am especially pleased with the background sections about attribute grammars.

No Comments on How I Wrote my Thesis
Categories: Uncategorized

Is Email Dying? 2019-01-03

Is email dying? I have been thinking idly about this question for a while and reflecting on how much my use of my personal email inbox has decreased. I used to check my email inbox regularly, I even had it set up on my phone for a while so that I got notifications for new messages. Nowadays, however, I rarely check my personal email. Sometimes I don’t look at my personal inbox for over a week. When I do look at my inbox there are usually only messages that I’m not interested in like advertisements from stores I bought one item from a year ago, or a notification from Instagram that my aunt posted a picture of her dog.

One of the things email is really needed for on the internet is to sign up for accounts on websites. However, most popular websites now let you log in with your Facebook or Google account (or other OpenID providers). You don’t need an email address to get a Facebook or Google account, meaning that you can effectively use many account-based services online without having an email inbox.

Email is of course primarily a communication tool, but this purpose has, at least in part, been taken over by other services or apps such as Twitter, Facebook, Snapchat, Whatsapp, Discord, and Slack. Each of these offer better communication options than email for certain types of communication. Discord wins over email for informal conversations. Facebook and Twitter win for informal conversations to many people at a time. Email really wins for semi-formal conversations and for sending unsolicited messages, but eventually another app or service will replace even these types of needs.

I still have a lot of use for email in my work, but this too seems to be decreasing. Most important work-related stuff is handled by purpose-built websites. For example: booking rooms for meetings, applying for vacation leave, sharing documents, and deciding times for meetings. At work I mostly use email for irregular activities and communication with others outside my workplace.

I don’t believe that email will entirely disappear within the next thirty years, but I do think there may be a point when most new internet users will have no email accounts. I wonder how soon this will happen.

It will be sad when email dies. Email is really the only communication service on the internet where the users own their inbox. If you have email set up right, with an email client on your home computer, you can still read your email when your internet connection is interrupted. I think there is a lot of value in having the possibility to back up your own personal communication that is not reliant on a website to work correctly.

No Comments on Is Email Dying?
Categories: Uncategorized

Git Metrics Tool 2018-12-19

I wrote a small Python program to extract Git commit metrics for a table in my thesis. There were some alternative tools available, but none of them gave exactly the output I was looking for. Instead of converting the output from another tool it seemed easy enough to roll my own solution, so here it is: git-commit-metrics.

My program generates TeX tables like the following one:

The above table was generated for the ExtendJ compiler repository.

No Comments on Git Metrics Tool
Categories: Programming

Python Pitfall: Mutable Default Arguments 2018-09-04

I’ve been programming in Python lately, and I found an interesting problem with one small program I wrote. If I had less experience in programming languages I might have been stuck on this for a long time. The issue is well known in the Python community, but to a relative beginner like me it is easy to get caught off guard by it. The problem has to do with default function arguments being mutable and reused between function calls. It’s easiest to explain this with an example.

This is a very condensed version of my problematic program:

def add(M, key, value=[]):
    if key in M:
        M[key] += value
    else:
        M[key] = value

M={}
add(M, 1)
add(M, 2)
add(M, 3, ['X'])
add(M, 2, ['Y'])
print M

My goal was to build a map M of keys to lists of values. Sometimes I wanted to just insert a key without a value, so the function for adding key/value pairs has an empty list as default value. If the key already exists, I add all elements of the value to that key.

Running the code above, I expected it to print

{1: [], 2: ['Y'], 3: ['X']}

but instead I got this

{1: ['Y'], 2: ['Y'], 3: ['X']}

The reason the value lists for key 1 and 2 are the same is that they are in fact the same object. When calling the function without specifying an argument for the value parameter, the same default list value is reused. The problem is that this object is mutable and next time I call the function to update the value for key 2, it modifies the value of the default parameter which is already used by both key 1 and 2.

Here is another program demonstrating the same problem:

def cons(head, tail=[]):
    tail.insert(0, head)
    return tail

print cons(1, cons(2, cons(3)))
print cons('hmm')

Running this gives the output

[1, 2, 3]
['hmm', 1, 2, 3]

This is a well-known feature of Python, and there are lots of blog posts on the net warning about it about it, for example this one.

No Comments on Python Pitfall: Mutable Default Arguments
Categories: Uncategorized

Multiplying Polynomials with Fast Fourier Transform 2018-07-28

I recently learned a very strange way of multiplying polynomials.

The straightforward way of multiplying two polynomials of degree n takes O(n^2) time: multiply each term from one polynomial with each term from the other.

There is a clever algebraic optimization that has running time O(n^lg3). Explanations of that algorithm can be found here, here, and here.

Surprisingly, there is a much faster O(n log n) algorithm. Even more surprisingly, the algorithm solves the problem via a completely different mathematical domain by using the Discrete Fourier Transform (DFT). The algorithm works as follows: take the DFT of the input polynomials, point-wise multiply them together, then take the inverse DFT to get the result. Plain DFT is not fast enough, taking O(n^2) time, but the Fast Fourier Transform (FFT) can be used instead and it takes only time O(n log n). Point-wise multiplication takes time O(n), so the total running time is O(n log n)!

A detailed explanation of the FFT-based algorithm can be found in Introduction to Algoithms, Second Edition, by Cormen, Leiserson, Rivest, and Stein (Chapter 30). The best and most complete description I’ve read is the one from the book, but there are a few resources online: like these lecture notes which summarize/copy material from the book.

Fourier Transforms are normally used for digital signal processing applications (image, audio processing). The Fourier Transform breaks up a signal into its individual frequencies. Polynomial multiplication, on the other hand, is a form of combinatorial problem. Learning that we can use FFT to multiply polynomials makes me wonder if FFT can be used to speed up other problems in combinatorics.

No Comments on Multiplying Polynomials with Fast Fourier Transform
Categories: Uncategorized

Iterative Tarjan Strongly Connected Components in Python 2018-06-09

I recently needed to compute strongly connected components in graphs with Python, so I implemented Tarjan’s algorithm. The algorithm worked fine for small graph instances, but I needed to use it on graphs with up to 50000 vertices. Tarjan’s algorithm is recursive, and large graphs quickly caused “recursion depth exceeded” errors with Python. In the worst case, the algorithm needs 50000 stack activations, but Python has a fairly low maximum stack depth. When I tried to increase the recursion limit, as suggested in this SO post, this happened:

$ python b.py <chain.in
Segmentation fault: 11

The segmentation fault occurs because the Python process smashes its stack space. The process stack space is controlled by Linux and I did not want to mess with it.

The solution to these problems is to instead use an iterative algorithm. Fortunately, it was not too difficult to translate Tarjan’s algorithm into an iterative form. I just needed to add an extra stack and keep track of the next successor edge to visit for each vertex on the stack.

My implementations use adjacency lists to represent the graph (in the adj global). Here are my implementations. First, the standard recursive version:

next = 0 # Next index.
index = [None] * N
lowlink = [None] * N
onstack = [False] * N
stack = []
nextgroup = 0 # Next SCC ID.
groups = [] # SCCs: list of vertices.
groupid = {} # Map from vertex to SCC ID.

# Tarjan's algorithm.
def sconnect(v):
    global next, nextgroup
    index[v] = next
    lowlink[v] = next
    next += 1
    stack.append(v)
    onstack[v] = True
    for w in adj[v]:
        if index[w] == None:
            sconnect(w)
            lowlink[v] = min(lowlink[v], lowlink[w])
        elif onstack[w]:
            lowlink[v] = min(lowlink[v], index[w])
    if index[v] == lowlink[v]:
        com = []
        while True:
            w = stack[-1]
            del stack[-1]
            onstack[w] = False
            com.append(w)
            groupid[w] = nextgroup
            if w == v: break
        groups.append(com)
        nextgroup += 1

for v in xrange(N):
    if index[v] == None:
        sconnect(v)

The iterative version differs only in the sconnect subroutine.

# Tarjan's algorithm, iterative version.
def sconnect(v):
    global next, nextgroup
    work = [(v, 0)] # NEW: Recursion stack.
    while work:
        v, i = work[-1] # i is next successor to process.
        del work[-1]
        if i == 0: # When first visiting a vertex:
            index[v] = next
            lowlink[v] = next
            next += 1
            stack.append(v)
            onstack[v] = True
        recurse = False
        for j in range(i, len(adj[v])):
            w = adj[v][j]
            if index[w] == None:
                # CHANGED: Add w to recursion stack.
                work.append((v, j+1))
                work.append((w, 0))
                recurse = True
                break
            elif onstack[w]:
                lowlink[v] = min(lowlink[v], index[w])
        if recurse: continue # NEW
        if index[v] == lowlink[v]:
            com = []
            while True:
                w = stack[-1]
                del stack[-1]
                onstack[w] = False
                com.append(w)
                groupid[w] = nextgroup
                if w == v: break
            groups.append(com)
            nextgroup += 1
        if work: # NEW: v was recursively visited.
            w = v
            v, _ = work[-1]
            lowlink[v] = min(lowlink[v], lowlink[w])
No Comments on Iterative Tarjan Strongly Connected Components in Python
Categories: Uncategorized