Python Pitfall: Mutable Default Arguments 2018-09-04

I’ve been programming in Python lately, and I ran into an interesting problem with one small program I wrote. If I had less experience with programming languages I might have been stuck on this for a long time. The issue is very well-known in the Python community, but a newbie like me can easily get caught off-guard by it. The problem has to do with default function arguments being mutable and reused between function calls. It’s easiest to explain this with an example.

This is a very condensed version of my problematic program:

def add(M, key, value=[]):
    if key in M:
        M[key] += value
    else:
        M[key] = value

M={}
add(M, 1)
add(M, 2)
add(M, 3, ['X'])
add(M, 2, ['Y'])
print M

My goal was to build a map M from keys to lists of values. Sometimes I wanted to insert just a key without a value, so the function for adding key/value pairs has an empty list as its default value. If the key already exists, I append all elements of the new value to the list stored under that key.

Running the code above, I expected it to print

{1: [], 2: ['Y'], 3: ['X']}

but instead I got this

{1: ['Y'], 2: ['Y'], 3: ['X']}

The reason the value lists for keys 1 and 2 are the same is that they are in fact the same object. When the function is called without an argument for the value parameter, the same default list is reused. The problem is that this object is mutable, so the next time I call the function to update the value for key 2, it modifies the default list, which is already used by both keys 1 and 2.
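The usual fix is to use an immutable sentinel such as None as the default and create a fresh list inside the function. Here is a sketch of the add function rewritten that way:

def add(M, key, value=None):
    # None is immutable and safe to share between calls.
    if value is None:
        value = []  # A new list is created on every call.
    if key in M:
        M[key] += value
    else:
        M[key] = value

With this version, the program above prints {1: [], 2: ['Y'], 3: ['X']} as expected.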

Here is another program demonstrating the same thing:

def cons(head, tail=[]):
    tail.insert(0, head)
    return tail

print cons(1, cons(2, cons(3)))
print cons('hmm')

Running this gives the output

[1, 2, 3]
['hmm', 1, 2, 3]
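The reason is that the default value is evaluated only once, when the def statement runs, and is then stored on the function object itself. After running the program above you can see the shared list by inspecting the function:

print(cons.__defaults__)  # (['hmm', 1, 2, 3],) -- the same list object every call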

This is a well-known feature of Python, and there are lots of blog posts on the net warning about it, for example this one.


Multiplying Polynomials with Fast Fourier Transform 2018-07-28

I recently learned a very strange way of multiplying polynomials.

The straightforward way of multiplying two polynomials of degree n takes O(n^2) time: multiply each term from one polynomial with each term from the other.
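For reference, here is a minimal sketch of that schoolbook method, with a polynomial represented as a list of coefficients (a[i] is the coefficient of x^i):

def mul_naive(a, b):
    # Schoolbook polynomial multiplication: every coefficient pair, O(n^2).
    c = [0] * (len(a) + len(b) - 1)
    for i in range(len(a)):
        for j in range(len(b)):
            c[i + j] += a[i] * b[j]
    return c

print(mul_naive([1, 2], [3, 4, 5]))  # (1 + 2x)(3 + 4x + 5x^2) = [3, 10, 13, 10]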

There is a clever algebraic optimization, Karatsuba's algorithm, with running time O(n^lg 3) ≈ O(n^1.585). Explanations of that algorithm can be found here, here, and here.

Surprisingly, there is a much faster O(n log n) algorithm. Even more surprisingly, the algorithm solves the problem via a completely different mathematical domain by using the Discrete Fourier Transform (DFT). The algorithm works as follows: take the DFT of the input polynomials, multiply them together point-wise, then take the inverse DFT to get the result. A plain DFT is not fast enough, since it takes O(n^2) time, but the Fast Fourier Transform (FFT) computes the same transform in only O(n log n) time. Point-wise multiplication takes O(n) time, so the total running time is O(n log n)!
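Here is a small sketch of that pipeline using numpy's FFT routines (numpy.fft.fft/ifft) rather than a hand-written recursive FFT; the structure is transform, point-wise multiply, inverse transform:

import numpy as np

def mul_fft(a, b):
    # The product has len(a) + len(b) - 1 coefficients; round up to a power of two.
    n = 1
    while n < len(a) + len(b) - 1:
        n *= 2
    fa = np.fft.fft(a, n)      # evaluate both polynomials at the n-th roots of unity
    fb = np.fft.fft(b, n)
    fc = np.fft.ifft(fa * fb)  # point-wise multiply, then interpolate back
    coeffs = np.rint(fc.real).astype(int)  # integer inputs give (numerically) integer outputs
    return list(coeffs[:len(a) + len(b) - 1])

print(mul_fft([1, 2], [3, 4, 5]))  # [3, 10, 13, 10], same as the naive method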

A detailed explanation of the FFT-based algorithm can be found in Introduction to Algorithms, Second Edition, by Cormen, Leiserson, Rivest, and Stein (Chapter 30). The best and most complete description I’ve read is the one in the book, but there are a few resources online, like these lecture notes, which summarize/copy material from the book.

Fourier Transforms are normally used in digital signal processing applications (image and audio processing). The Fourier Transform breaks a signal up into its individual frequencies. Polynomial multiplication, on the other hand, is a combinatorial problem. Learning that we can use FFT to multiply polynomials makes me wonder whether FFT can be used to speed up other problems in combinatorics.


Iterative Tarjan Strongly Connected Components in Python 2018-06-09

I recently needed to compute strongly connected components in graphs with Python, so I implemented Tarjan’s algorithm. The algorithm worked fine for small graph instances, but I needed to use it on graphs with up to 50000 vertices. Tarjan’s algorithm is recursive, and large graphs quickly caused “recursion depth exceeded” errors in Python. In the worst case, the algorithm needs 50000 stack activations, but Python’s default recursion limit is only 1000. When I tried to increase the recursion limit, as suggested in this SO post, this happened:

$ python b.py <chain.in
Segmentation fault: 11

The segmentation fault occurs because the Python process runs out of stack space. The process stack size is controlled by Linux, and I did not want to mess with it.
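For reference, raising the limit looks something like this; it only lifts Python's internal guard, so a deep enough recursion still exhausts the native stack and crashes:

import sys
sys.setrecursionlimit(100000)  # allow deeper Python recursion
# Each Python frame still consumes native stack space, so with the default
# process stack size the interpreter can segfault instead of raising an error.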

The solution to these problems is to instead use an iterative algorithm. Fortunately, it was not too difficult to translate Tarjan’s algorithm into an iterative form. I just needed to add an extra stack and keep track of the next successor edge to visit for each vertex on the stack.

My implementations use adjacency lists to represent the graph (in the adj global). First, here is the standard recursive version:

next = 0 # Next index.
index = [None] * N
lowlink = [None] * N
onstack = [False] * N
stack = []
nextgroup = 0 # Next SCC ID.
groups = [] # SCCs: list of vertices.
groupid = {} # Map from vertex to SCC ID.

# Tarjan's algorithm.
def sconnect(v):
    global next, nextgroup
    index[v] = next
    lowlink[v] = next
    next += 1
    stack.append(v)
    onstack[v] = True
    for w in adj[v]:
        if index[w] == None:
            sconnect(w)
            lowlink[v] = min(lowlink[v], lowlink[w])
        elif onstack[w]:
            lowlink[v] = min(lowlink[v], index[w])
    if index[v] == lowlink[v]:
        com = []
        while True:
            w = stack[-1]
            del stack[-1]
            onstack[w] = False
            com.append(w)
            groupid[w] = nextgroup
            if w == v: break
        groups.append(com)
        nextgroup += 1

for v in xrange(N):
    if index[v] == None:
        sconnect(v)

The iterative version differs only in the sconnect subroutine.

# Tarjan's algorithm, iterative version.
def sconnect(v):
    global next, nextgroup
    work = [(v, 0)] # NEW: Recursion stack.
    while work:
        v, i = work[-1] # i is next successor to process.
        del work[-1]
        if i == 0: # When first visiting a vertex:
            index[v] = next
            lowlink[v] = next
            next += 1
            stack.append(v)
            onstack[v] = True
        recurse = False
        for j in range(i, len(adj[v])):
            w = adj[v][j]
            if index[w] == None:
                # CHANGED: Add w to recursion stack.
                work.append((v, j+1))
                work.append((w, 0))
                recurse = True
                break
            elif onstack[w]:
                lowlink[v] = min(lowlink[v], index[w])
        if recurse: continue # NEW
        if index[v] == lowlink[v]:
            com = []
            while True:
                w = stack[-1]
                del stack[-1]
                onstack[w] = False
                com.append(w)
                groupid[w] = nextgroup
                if w == v: break
            groups.append(com)
            nextgroup += 1
        if work: # NEW: v was recursively visited.
            w = v
            v, _ = work[-1]
            lowlink[v] = min(lowlink[v], lowlink[w])
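
For completeness, here is a hypothetical example of the N and adj globals that the code above assumes (they must be defined before the arrays at the top): a three-vertex cycle plus one extra vertex, which gives two SCCs.

N = 4
adj = [[1], [2], [0, 3], []]  # 0 -> 1 -> 2 -> 0, plus the edge 2 -> 3

# After running the "for v in xrange(N): ..." loop with either version of sconnect:
# groups  == [[3], [2, 1, 0]]   (the single vertex {3} and the cycle {0, 1, 2})
# groupid == {3: 0, 2: 1, 1: 1, 0: 1}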

Competitive Coding 2018-03-27

The past few weeks I have been solving problems on the online competitive coding judge Kattis. Kattis is a site where you can create an account and solve problems for points. You submit your code to Kattis, then Kattis compiles/runs it to check it against hidden test cases.

The problems vary in difficulty from trivial to very hard. I have done mostly hard ones so far, though that is not effective for gaining points. I tend to get stuck on the hard problems, and then I can’t let the problem go until I have solved it. Sometimes it takes me a few days before finally finding a solution to a hard problem. During this time it is tempting to constantly think about the problem even when I’m not actively working on it. This process often makes me feel stupid and frustrated, though the feeling of relief when I finally solve a difficult problem is incredible and addictive.

I am not really interested in the competitive part of competitive programming. I mostly think it is fun to solve new and interesting problems. Fundamentally, I think that problem solving is what makes programming fun. By solving these problems I might also become better at solving other problems that arise in my normal programming work.

An example of a problem that I was stuck on for a long time is Around the Track, in which you have to find the shortest looping path between two polygons.

To aid in debugging my solution, I made a program that renders an image for the computed shortest path. Below is one such image (the red path is the shortest path, the inside of the blue area is the outer polygon, and the outside of the green area is the inner polygon).


LuxColorPicker: a better JavaFX color picker 2018-02-19

LuxColorPicker is a simple HSV color picker for JavaFX with a highly usable interface. The main features of LuxColorPicker are:

  • Only one color selection mode: Hue/Saturation/Value.
  • Large components improve usability.
  • Minimalistic design: no unnecessary components or text to distract users.

LuxColorPicker is easy to integrate into new JavaFX projects. The code is on GitHub.

This is a screenshot of the color picker in action:

Why LuxColorPicker?

JavaFX already has a built-in color picker component, so you might wonder why LuxColorPicker is needed. In my opinion the standard JavaFX color picker is functional but flawed: I don’t like its default color selection mode, and it has unnecessarily small components which make it fiddly to use.

If you agree with these points, then LuxColorPicker might be the right color picker for you.

Only HSV

Unlike many other color pickers, LuxColorPicker has only one color selection mode: Hue/Saturation/Value (HSV). In my opinion the HSV mode is the best selection mode, making all others useless.

Large Components

Small interface components are fiddly and less visible than large components. The smaller a component is, the longer it takes to accurately hit with the mouse (unless the component touches the edge of the screen). This is not just my opinion; it is well-established UX research (Fitts’s law).

In LuxColorPicker, all components are oversized compared to most other color pickers. This not only makes the components easier to use, but also increases the visibility of the colors on display. In particular, the hue selection bar is much larger than in other programs I have seen. I made the hue bar this large because it makes the available hues clearly visible.

The preview color swatch is large for the same reason: it is important to clearly see the currently selected color.

Usability

There is a “Save” button to confirm the current color selection, but the user does not need to click it. LuxColorPicker is non-modal and focus-shy: simply clicking outside the color picker closes it and confirms the selection. Pressing escape cancels the current selection and closes the color picker. The buttons are there just for users who have not yet learned the optimal interactions.

History and Random Swatches

The only nonessential embellishments in LuxColorPicker are the color swatches below the preview swatch. These are the history and random swatches. The random swatches show random colors close to the currently selected color, and the history swatches contain previously selected colors.

The random colors are useful when you don’t know exactly what color you want to pick – maybe one of the random colors will look nice.


The Flattening of Minecraft 2018-02-10

Minecraft is undergoing a major internal change called The Flattening. I decided to write this overview and explanation of The Flattening because I think it is interesting from a programming and system design perspective. I hope you find it interesting, too!

Block IDs

To explain what The Flattening means, I first have to describe how the first versions of Minecraft represented blocks in the world. As you probably know, the game world in Minecraft is made up of blocks like stone, grass, and dirt. Even air is represented as a type of block.

In the beginning of Minecraft development, Notch decided to represent each type of block by integer IDs. Air was given ID zero, stone was one, grass two, dirt three, and so on. This worked great, because there were not that many different kinds of blocks. Notch decided to use a one-byte encoding for blocks, meaning that 256 different block types (including air) could exist in the game.

Chunks

A chunk, in Minecraft, is a vertical slice of the game world. Chunks contain 16x16x256 blocks each in current Minecraft versions. Previously, chunks were only 128 blocks tall.

Blocks are stored in a chunk as an array of IDs. In Minecraft Alpha 1.1, chunks contained 32k blocks, so the total block data in a chunk took 32kB of memory. However, when stored on disk, chunks were compressed to about 5kB or less using the DEFLATE algorithm.
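The numbers are easy to check: 16x16x128 = 32768 blocks at one byte each is 32 kB, and DEFLATE compresses this kind of repetitive data very well. Here is a rough sketch with a synthetic chunk (mostly air plus some stone); real chunks are less regular and compress somewhat worse:

import zlib

blocks = bytearray(16 * 16 * 128)  # 32768 block IDs, all air (ID 0)
for i in range(16 * 16 * 8):       # set the first 2048 entries to stone (ID 1)
    blocks[i] = 1

compressed = zlib.compress(bytes(blocks))
print(len(blocks))      # 32768
print(len(compressed))  # far below the ~5 kB quoted above for real chunks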

Textures

To draw different blocks, different textures are needed. In Minecraft Alpha 1.1, textures were stored in a texture atlas. Several textures are packed together into one large texture, and then parts of the large texture are used to draw different types of game objects.
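The lookup works by giving each texture an index and mapping the index to a sub-rectangle of the atlas. A minimal sketch, assuming a 256x256 atlas of 16x16-pixel tiles (the layout of terrain.png):

TILE = 16           # tile size in pixels
TILES_PER_ROW = 16  # 256 / 16

def atlas_rect(index):
    # Convert a texture index into the pixel rectangle of its tile.
    col = index % TILES_PER_ROW
    row = index // TILES_PER_ROW
    x, y = col * TILE, row * TILE
    return (x, y, x + TILE, y + TILE)

print(atlas_rect(0))   # (0, 0, 16, 16) -- the top-left tile
print(atlas_rect(17))  # (16, 16, 32, 32)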

By looking at the texture atlas, we can estimate the number of block types in Minecraft Alpha 1.1 to be somewhere around 60. Notice that some blocks, like gold blocks and chests, take up multiple texture squares.

There are several advantages of using texture atlases. An incidental advantage is that you can make a texture pack for the game by just replacing one file. That’s exactly what the first Minecraft modders did to change the look of the game.

Of course, the texture atlas was not going to last. It only had room for 256 textures, and to make matters worse, many blocks needed multiple textures. In Minecraft 1.5, the terrain.png file was completely replaced by individual texture files. By that point, Minecraft already had built-in support for loading custom texture packs (later known as resource packs).

Block Data and TileEntities

The first blocks in Minecraft only had one state each: stone was always just stone, dirt could be covered by grass but that was treated as a separate kind of block. However, Notch eventually added blocks that could have different states. For example, the “Wheat” block is a growing wheat field, with seven different states depending on how ripe the wheat is.

The wheat block could have been represented as seven different blocks: one for each growth stage. That would have exhausted many block IDs, so instead Notch added a separate array of data for the blocks in a chunk. Notch figured that 4 bits of data would be enough. It doesn’t seem like much, but that essentially adds 50% more data to the basic chunk format.
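The 50% figure follows from 4 bits being half a byte: packed two blocks per byte, the data array adds 16384 bytes on top of the 32768 bytes of block IDs in an old chunk. A small sketch of such nibble packing (putting even indexes in the low nibble is my assumption, not necessarily the real chunk layout):

def pack_nibbles(values):
    # Pack 4-bit values two to a byte (even index in the low nibble).
    packed = bytearray((len(values) + 1) // 2)
    for i, v in enumerate(values):
        if i % 2 == 0:
            packed[i // 2] |= v & 0x0F
        else:
            packed[i // 2] |= (v & 0x0F) << 4
    return packed

data = pack_nibbles([7] * 32768)  # a dummy array of 4-bit state values
print(len(data))                  # 16384 -- half the size of the 32768-byte block ID array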

The block ID range was at some point extended with an optional extra 4 bits, giving 4096 possible block types in total. Still, all the default Minecraft blocks were represented as single bytes. I’m not sure what the extended ID range was for. It may have been for mod blocks.

Some kinds of data were not at all well suited to store in a fixed-width format. For example: signs can have custom text, armor stands can hold many combinations of items, etc. This new data was stored in something called the TileEntities structure in the chunk. TileEntities did not have a size limit: it could store as much extra information as needed.

To summarize, here is all data used to represent the blocks of a chunk in the current Minecraft release (1.12):

  • Blocks – block IDs (4096 bytes)
  • Add (optional) – additional block data (4 bits per block: 2048 bytes)
  • Data – block state information (4 bits per block: 2048 bytes)
  • TileEntities – additional block data

Together, the block ID and the block data value uniquely identify a single kind of block, mapping into the fixed set of block IDs.

The Flattening

With all that background information I can now explain The Flattening. The main change is that the fixed block IDs are being removed. Instead, blocks are defined by identifiers like "minecraft:stone", "minecraft:dirt", etc. The "minecraft:" prefix identifies the default Minecraft blocks. With these free-form identifiers there is an unlimited number of possible block types, and this opens the door for mods to add as many block types as they wish.

The new block identifiers are augmented with state information in a format like the TileEntities structure discussed above. This state information does not have a size limit.

The chunk format was changed so that each chunk has its own block palette (technically, each 16x16x16 section has a palette). I tried to illustrate it with this drawing:

The idea behind the palette is similar to that of palettes in image compression. The palette contains a limited number of entries, and each block is stored as a small fixed-size index into the palette.

If there are multiple variants of the same block type with different state, then each different set of block states is represented with a separate palette entry.

The new format is clever, because now there can exist a completely ludicrous number of different block types and state combinations in the game, but only the blocks that actually occur in a chunk need to be stored in its palette. As far as I have seen, untouched generated chunks often need fewer than 16 different palette entries. Chunks that have player-made structures generally use more combinations of blocks and block states.

World Size

An additional detail about the new format is that chunks with 16 or fewer kinds of blocks (including state variations) need 16 or fewer palette indexes, which can be stored in 4 bits per block. Even if the palette is larger, the palette indexes often take less than 8 bits per block (the number of bits per block is the 2-logarithm of the number of palette entries, rounded up). This in itself should save a fair bit of space. In addition, the 4 bits of block state per block were removed, and the optional 4 bits of extra block ID data are no longer needed.
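To make the index-width rule concrete, here is a small sketch (the function name and the 4-bit minimum are my assumptions based on the description above):

def bits_per_block(palette_size):
    # ceil(log2(n)) bits are needed to index n palette entries;
    # assume at least 4 bits per block, as described above.
    return max(4, (palette_size - 1).bit_length())

for n in (4, 16, 17, 200):
    print((n, bits_per_block(n)))  # (4, 4) (16, 4) (17, 5) (200, 8)

# A 16x16x16 section with a 16-entry palette needs 4096 * 4 bits = 2048 bytes of indexes.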

I don’t know what the total change in world size is for typical worlds, but the palette system might save some space overall.
