Minecraft is undergoing a major internal change called The Flattening. I decided to write this overview and explanation of The Flattening because I think it is interesting from a programming and system design perspective. I hope you find it interesting, too!
To explain what The Flattening means, I first have to describe how the first versions of Minecraft represented blocks in the world. As you probably know, the game world in Minecraft is made up of blocks like stone, grass, and dirt. Even air is represented as a type of block.
Early in Minecraft's development, Notch decided to represent each type of block by an integer ID. Air was given ID zero, stone was one, grass two, dirt three, and so on. This worked great, because there were not that many different kinds of blocks. Notch decided to use a one-byte encoding for blocks, meaning that 256 different block types (including air) could exist in the game.
A chunk, in Minecraft, is a vertical slice of the game world. Chunks contain 16x16x256 blocks each in current Minecraft versions. Previously, chunks were only 128 blocks tall.
Blocks are stored in a chunk as an array of IDs. In Minecraft Alpha 1.1, chunks contained 32k blocks (16x16x128), so the total block data in a chunk took 32 kB of memory. However, when stored on disk, chunks were compressed to about 5 kB or less using the DEFLATE algorithm.
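To make those numbers concrete, here is a small sketch of an Alpha-sized chunk stored as one byte per block and then compressed. The index order, the flat terrain, and the function names are my own assumptions for illustration, and Python's zlib (which uses DEFLATE) stands in for whatever the game does on disk.

```python
import zlib

WIDTH, DEPTH, HEIGHT = 16, 16, 128      # Alpha-era chunk dimensions
AIR, STONE, GRASS, DIRT = 0, 1, 2, 3    # block IDs as listed in the text


def block_index(x, y, z):
    # Assumed layout for this sketch: y varies fastest, then z, then x.
    return y + z * HEIGHT + x * HEIGHT * DEPTH


def make_flat_chunk(ground_level=64):
    """Build a toy chunk: stone, three layers of dirt, grass on top, air above."""
    blocks = bytearray(WIDTH * DEPTH * HEIGHT)  # all zeros, i.e. all air
    for x in range(WIDTH):
        for z in range(DEPTH):
            for y in range(ground_level + 1):
                if y == ground_level:
                    block = GRASS
                elif y >= ground_level - 3:
                    block = DIRT
                else:
                    block = STONE
                blocks[block_index(x, y, z)] = block
    return blocks


chunk = make_flat_chunk()
compressed = zlib.compress(bytes(chunk))
print(len(chunk), "bytes raw,", len(compressed), "bytes compressed")
```

A perfectly flat chunk like this one compresses far better than real terrain would, so the compressed size printed here says nothing about the 5 kB figure for actual worlds.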
To draw different blocks, different textures are needed. In Minecraft Alpha 1.1, textures were stored in a texture atlas. Several textures are packed together into one large texture, and then parts of the large texture are used to draw different types of game objects.
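Here is a rough sketch of how a renderer can pick one texture out of an atlas. I am assuming a 256x256 pixel atlas split into a 16x16 grid of 16-pixel tiles, which gives room for 256 textures; the constants and function names are hypothetical and not taken from the game's code.

```python
ATLAS_SIZE = 256                         # assumed atlas width and height in pixels
TILE_SIZE = 16                           # assumed tile width and height in pixels
TILES_PER_ROW = ATLAS_SIZE // TILE_SIZE  # 16 tiles per row, 256 tiles in total


def tile_rect(tile_index):
    """Return (left, top, right, bottom) pixel bounds of one tile in the atlas."""
    if not 0 <= tile_index < TILES_PER_ROW * TILES_PER_ROW:
        raise ValueError("tile index out of range")
    col = tile_index % TILES_PER_ROW
    row = tile_index // TILES_PER_ROW
    left, top = col * TILE_SIZE, row * TILE_SIZE
    return (left, top, left + TILE_SIZE, top + TILE_SIZE)


def tile_uv(tile_index):
    """Return the same bounds as normalized texture coordinates in [0, 1]."""
    return tuple(v / ATLAS_SIZE for v in tile_rect(tile_index))
```

With this layout, a block that needs several textures (a different top and side, for example) simply uses several tile indexes.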
By looking at the texture atlas, we can estimate the number of block types in Minecraft Alpha 1.1 at somewhere around 60. Notice that some blocks like gold blocks and chests take up multiple texture squares.
There are several advantages of using texture atlases. An incidental advantage is that you can make a texture pack for the game by just replacing one file. That’s exactly what the first Minecraft modders did to change the look of the game.
Of course, the texture atlas was not going to last. It only had room for 256 textures, and to make matters worse, many blocks needed multiple textures. In Minecraft 1.5, the atlas file (terrain.png) was completely replaced by individual texture files. By that point, Minecraft already had built-in support for loading custom texture packs (later known as resource packs).
Block Data and TileEntities
The first blocks in Minecraft only had one state each: stone was always just stone, and although dirt could be covered by grass, that was treated as a separate kind of block. However, Notch eventually added blocks that could have different states. For example, the “Wheat” block is a growing wheat field, with eight different states (growth stages 0 through 7) depending on how ripe the wheat is.
The wheat block could have been represented as eight different blocks: one for each growth stage. That would have used up many block IDs, so instead Notch added a separate array of data for the blocks in a chunk. Notch figured that 4 bits of data per block would be enough. It doesn’t seem like much, but that essentially adds 50% more data to the basic chunk format.
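Storing 4 bits per block means packing two blocks into each byte. Here is a minimal sketch of such a "nibble array". Which half of the byte belongs to which block is an assumption on my part, and the helper names are hypothetical.

```python
def get_nibble(data, index):
    """Read the 4-bit value stored for the block at `index`."""
    byte = data[index // 2]
    # Assumed packing: even indexes use the low 4 bits, odd indexes the high 4 bits.
    return byte & 0x0F if index % 2 == 0 else (byte >> 4) & 0x0F


def set_nibble(data, index, value):
    """Store a 4-bit value (0..15) for the block at `index`."""
    if not 0 <= value <= 15:
        raise ValueError("value does not fit in 4 bits")
    i = index // 2
    if index % 2 == 0:
        data[i] = (data[i] & 0xF0) | value
    else:
        data[i] = (data[i] & 0x0F) | (value << 4)


BLOCKS_PER_CHUNK = 16 * 16 * 128
data = bytearray(BLOCKS_PER_CHUNK // 2)  # 16 kB on top of the 32 kB of block IDs

set_nibble(data, 10, 5)  # e.g. a wheat block at growth stage 5
set_nibble(data, 11, 7)  # its neighbor, fully grown
```

The data array is half the size of the block ID array, which is where the 50% figure comes from.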
The block ID range was at some point extended with an optional 4 bits, giving 4096 total possible block types. Still, all the default Minecraft blocks were represented as single bytes. I’m not sure what the extended ID range was for. It may have been for mod blocks.
Some kinds of data were not at all well suited to a fixed-width format. For example: signs can have custom text, armor stands can hold many combinations of items, etc. This new data was stored in something called the TileEntities structure in the chunk. TileEntities did not have a size limit: it could store as much extra information as needed.
To summarize, here is all the data used to represent the blocks of a chunk in the current Minecraft release (1.12). The byte sizes are per 16x16x16 section of 4096 blocks:
- Blocks – block IDs (4096 bytes)
- Add (optional) – additional block data (4 bits per block: 2048 bytes)
- Data – block state information (4 bits per block: 2048 bytes)
- TileEntities – additional block data
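As a sketch of how the fixed-size arrays in this list fit together, the snippet below combines a byte from Blocks with the optional 4 bits from Add into one 12-bit block ID. Treating the Add bits as the high bits of the ID, and the nibble packing order, are assumptions for illustration; the function names are hypothetical.

```python
BLOCKS_PER_SECTION = 16 * 16 * 16  # 4096 blocks in a 16x16x16 section


def nibble_at(array, index):
    # Assumed packing: even indexes in the low 4 bits, odd indexes in the high 4 bits.
    byte = array[index // 2]
    return byte & 0x0F if index % 2 == 0 else (byte >> 4) & 0x0F


def full_block_id(blocks, add, index):
    """Combine the one-byte ID with the optional 4 extra bits (0..4095)."""
    low = blocks[index]
    high = nibble_at(add, index) if add is not None else 0
    return (high << 8) | low


blocks = bytearray(BLOCKS_PER_SECTION)     # Blocks: 4096 bytes
add = bytearray(BLOCKS_PER_SECTION // 2)   # Add:    2048 bytes (optional)
data = bytearray(BLOCKS_PER_SECTION // 2)  # Data:   2048 bytes

blocks[0] = 0x2A  # low 8 bits of the ID for block 0
add[0] = 0x01     # extra 4 bits for block 0 (low nibble of byte 0)

print(full_block_id(blocks, add, 0))   # ID using the extended range
print(full_block_id(blocks, None, 0))  # same block without the Add array
print(len(blocks) + len(add) + len(data), "bytes of fixed-size data per section")
```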
The block data uniquely identifies a single block, mapping to a fixed set of block IDs:
With all that background information I can now explain The Flattening. The main change is that the fixed block IDs are being removed. Instead, blocks will be defined by identifiers like “minecraft:stone”, “minecraft:dirt”, etc. The “minecraft:” prefix identifies the default Minecraft blocks. With these free-form identifiers there is no limit on the number of possible block types, which allows mods to add as many block types as they wish.
The new block identifiers are augmented with state information in a format like the TileEntities structure discussed above. This state information does not have a size limit.
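To give a feel for what an identifier plus free-form state might look like in code, here is a small sketch. The class, the default-prefix rule, the bracketed text form, and the "age" property name are all illustrative assumptions of mine, not the game's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockState:
    """A namespaced block name plus any number of state properties."""
    name: str
    properties: tuple = ()  # sorted (key, value) pairs, so instances are hashable

    @classmethod
    def of(cls, name, **properties):
        # Assumption for this sketch: names without a prefix are default blocks.
        if ":" not in name:
            name = "minecraft:" + name
        return cls(name, tuple(sorted(properties.items())))

    def __str__(self):
        if not self.properties:
            return self.name
        props = ",".join(f"{key}={value}" for key, value in self.properties)
        return f"{self.name}[{props}]"


stone = BlockState.of("stone")
ripe_wheat = BlockState.of("wheat", age="7")
mod_block = BlockState.of("examplemod:ruby_ore")  # hypothetical mod block

print(stone, ripe_wheat, mod_block)
```

Because the properties are just key/value pairs, a block can carry as many of them as it needs, and two states of the same block compare as different values.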
The chunk format was changed so that each chunk has its own block palette (technically, each 16x16x16 section has a palette). I tried to illustrate it with this drawing:
The idea with the palette is similar to that of palettes in image compression. The palette contains a fixed number of entries, which can be indexed by a fixed-size number.
If there are multiple variants of the same block type with different state, then each different set of block states is represented with a separate palette entry.
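The palette idea can be sketched in a few lines: walk over the blocks, give each distinct block-and-state combination the next free palette slot, and store only the slot numbers. The string form of the block states and the function name are assumptions for illustration.

```python
def build_palette(blocks):
    """Return (palette, indexes) so that palette[indexes[i]] == blocks[i]."""
    palette = []   # distinct block states, in order of first appearance
    index_of = {}  # block state -> position in the palette
    indexes = []   # one palette index per block
    for block in blocks:
        if block not in index_of:
            index_of[block] = len(palette)
            palette.append(block)
        indexes.append(index_of[block])
    return palette, indexes


# A tiny stand-in for the 4096 blocks of a section.
section = [
    "minecraft:air",
    "minecraft:stone",
    "minecraft:wheat[age=0]",
    "minecraft:wheat[age=7]",  # same block type, different state: separate entry
    "minecraft:stone",
    "minecraft:air",
]

palette, indexes = build_palette(section)
print(palette)
print(indexes)
```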
The new format is clever, because now there can exist a completely ludicrous number of different types of blocks and state combinations in the game, but only the blocks that exist in a chunk will be stored in the palette. As far as I have seen, untouched generated chunks often need fewer than 16 different palette entries. Chunks that have player-made structures will generally use more combinations of blocks and block states.
An additional detail about the new format is that chunks with 16 or fewer kinds of blocks (including state variations) need 16 or fewer palette indexes, which can be stored in 4 bits per block. Even if the palette is larger, the palette indexes are often less than 8 bits per block (the number of bits per block is the base-2 logarithm of the number of palette entries, rounded up). This in itself should save a fair bit of space. In addition, the 4 bits of block state per block were removed and the optional 4 bits of block ID data are no longer needed.
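The arithmetic here is easy to check with a short sketch. It only follows the rule stated above (base-2 logarithm, rounded up) for one 16x16x16 section; the real format may add padding or a minimum index width, and the palette itself also takes some space, so treat the numbers as rough. The function names are hypothetical.

```python
BLOCKS_PER_SECTION = 16 * 16 * 16  # 4096


def bits_per_block(palette_size):
    """Smallest number of bits that can index `palette_size` entries."""
    if palette_size < 1:
        raise ValueError("palette must have at least one entry")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without using floats.
    return (palette_size - 1).bit_length()


def index_bytes(palette_size, blocks=BLOCKS_PER_SECTION):
    """Bytes needed for tightly packed palette indexes (palette not included)."""
    total_bits = bits_per_block(palette_size) * blocks
    return (total_bits + 7) // 8


OLD_FORMAT_BYTES = 4096 + 2048  # Blocks + Data per section, without Add

for size in (2, 16, 17, 256):
    print(size, "entries:", bits_per_block(size), "bits per block,",
          index_bytes(size), "bytes vs", OLD_FORMAT_BYTES, "in the old format")
```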
I don’t know what the total change in world size is for typical worlds, but the palette system might save some space overall.