Minecraft Source Code is Interesting!
Minecraft’s Old Console Source code got leaked recently(1st March 26), while lot of people started to make it run, mods and made lot of videos, I was just going through it’s source code and found some interesting tricks/algo and things which i thought are cool to share.
In the beginning of time, there was Mojang, and there was 4J. Together they shaped blocky worlds and quiet legends, one update at a time.
Then everything changed when the Fire Nation attacked: Microsoft arrived, and the realm entered a new age...dogshit.
In the legacy that all that there was, here is the source code to minecraft on PS3.
Its shit too, but our kind of shit.
If you don’t know the original Minecraft by Notch was written in Java. Consoles at that time couldn’t run Java, so someone had to rewrite the whole thing in C++. So they hired 4J. 4J Studios is a small Scottish game studio. Mojang hired them to port the original Java Minecraft to consoles.
And what a port it is. ~3,300 C++ source files split into two projects: Minecraft.Client (rendering, UI, platform stuff) and Minecraft.World (gameplay, world generation, networking). Everything from terrain generation to creeper explosions to redstone, rewritten from scratch in C++ targeting hardware with 256MB of RAM and the PS3’s weird Cell processor.
Let’s get into the interesting bits.
§ Packing a Pointer and a Counter Into One Number
In SparseLightStorage.cpp, they pack a memory pointer and a plane count into a single 64-bit integer:
dataAndCount = 0x007F000000000000L | ((__int64)planeIndices & 0x0000ffffffffffffL);
What’s going on here? On x86-64, virtual addresses only use 48 bits out of 64, leaving the upper 16 bits constrained to zero on this architecture. So 4J used those upper bits to store the count of allocated light planes.
To read it back:
int count = (sourceDataAndCount >> 48) & 0xffff;
unsigned char *ptr = (unsigned char *)(sourceDataAndCount & 0x0000ffffffffffffL);
Lower 48 bits = pointer. Upper 16 bits = count. One value, two pieces of information, and they can atomically swap the whole thing with a single compare-and-exchange:
__int64 last = InterlockedCompareExchangeRelease64(&dataAndCount, newDataAndCount, lastDataAndCount);
This means reading light data requires zero locks. No mutex, no spinlock, nothing. Reads are completely free. Only writes pay the cost of copying and swapping, and writes are rare since light doesn’t change that often. The compiler even throws a warning about truncating pointers, and they just… suppress it:
#pragma warning ( disable : 4826 )
You wouldn’t get away with this in most codebases, but here it saves millions of lock acquisitions per second.
§ Garbage Collector in 20 Lines
The pointer trick creates a problem though: when you update the light data, the old pointer is still being read by other threads. You can’t free it immediately. Their solution? A rotating set of 3 delete queues:
void SparseLightStorage::queueForDelete(unsigned char *data)
{
deleteQueue[deleteQueueIndex].Push(data);
}
void SparseLightStorage::tick()
{
int freeIndex = (deleteQueueIndex + 1) % 3;
unsigned char *toFree = NULL;
do
{
toFree = deleteQueue[freeIndex].Pop();
if (toFree) free(toFree);
} while (toFree);
deleteQueueIndex = (deleteQueueIndex + 1) % 3;
}
Every tick, the write index advances by one. But they always free from the queue that’s two positions ahead in the rotation. This guarantees every pointer sits in the queue for at least 2 full game ticks before being freed, giving all readers time to finish. It’s a hand-rolled epoch-based memory reclamation scheme, similar to what you’d find in Linux kernel data structures, but in about 20 lines of code.
§ Light Compression: Most of Your World is Either Dark or Bright
Lighting data in Minecraft is 4 bits per block (values 0-15). A naive chunk stores 128 x 16 x 16 x 0.5 = 16,384 bytes of light data. But here’s the insight: most horizontal planes of light data are either all zeros (block light underground) or all fifteens (sky light above ground). Why store 128 bytes for a plane that’s just “15 everywhere”?
static const int ALL_0_INDEX = 128; // magic: "this plane is all dark"
static const int ALL_15_INDEX = 129; // magic: "this plane is all bright"
Each of the 128 Y-levels gets a one-byte index. If it’s 128 or 129, no data is allocated, the value is implicit. Only planes with mixed lighting actually get 128 bytes of storage. In a typical chunk, maybe 20-30 planes out of 128 actually need real data. That’s an 80% reduction in memory just from noticing that most of the world is boring.
Think about what this means in practice. You’re running around in a Minecraft world and every chunk around you needs lighting data loaded in memory. Without this, each chunk eats 16KB just for light. With it, most chunks use around 3KB. On a console with 256MB of total RAM shared between the game, the OS, and everything else, that’s the difference between rendering 9 chunks and rendering 40.
§ Bit-Scattered Coordinates (a.k.a. “Why is the Index Function So Weird?”)
The compressed tile storage splits each 16x128x16 chunk into 512 blocks of 4x4x4 tiles. To convert an (x, y, z) coordinate into a block and tile index, you’d expect simple division and modulo. Instead:
inline void CompressedTileStorage::getBlockAndTile(int *block, int *tile, int x, int y, int z)
{
*block = ((x & 0x0c) << 5) | ((z & 0x0c) << 3) | (y >> 2);
*tile = ((x & 0x03) << 4) | ((z & 0x03) << 2) | (y & 0x03);
}
And to convert a (block, tile) pair into a memory offset:
inline int CompressedTileStorage::getIndex(int block, int tile)
{
// bits for index into data is: xxxxzzzzyyyyyyy
// we want block(b) & tile(t) spread out as:
// from: ______bbbbbbbbb
// to: bb__bb__bbbbb__
int index = ((block & 0x180) << 6) | ((block & 0x060) << 4) | ((block & 0x01f) << 2);
index |= ((tile & 0x30) << 7) | ((tile & 0x0c) << 5) | (tile & 0x03);
return index;
}
This is a hand-rolled Z-order curve (Morton code). By interleaving the bits of X, Y, and Z coordinates, blocks that are spatially close in 3D end up close in memory. When you’re iterating over a region of blocks (which happens constantly: lighting updates, rendering, collision), you get way better cache hit rates than with a naive row-major layout.
Why does this matter? Every time a torch is placed or removed, the lighting engine walks through nearby blocks to propagate light. With row-major layout, moving one step in the Z direction might jump thousands of bytes in memory, blowing the cache. With Z-order, your neighbors in 3D are your neighbors in memory. On a PS3 where an L1 cache miss could cost 50+ cycles, this turns a lighting update from “noticeable stutter” into “instant.”
§ Self-Describing Compression Tags
Each 4x4x4 block’s index is a 16-bit value. The lowest bits tell you how the data is compressed:
oooooooooooooo00 → 1 bit per tile (2 unique block types)
oooooooooooooo01 → 2 bits per tile (4 unique block types)
oooooooooooooo10 → 4 bits per tile (16 unique block types)
ooooooooooooo011 → 8 bits per tile (full, uncompressed)
tttttttt-----111 → 0 bits per tile (uniform block, tile ID is IN the index)
That last one is the best part. If an entire 4x4x4 block is the same tile (which is extremely common, think solid stone, air, water), the tile ID is stored directly in the upper 8 bits of the index itself. No data pointer, no indirection. The “index” IS the data.
And there’s a clever encoding trick: the 8-bit-per-tile case requires 4-byte alignment, so its offset’s lowest bit is always 0. They repurpose that bit as a flag to distinguish between “8 bits per tile” and “0 bits per tile”. Five compression modes encoded in 2-3 bits.
The result? From the comments:
“on balance this still was found to give much better compression - around 12.5% vs 19% by doing things per plane.”
They compress a 32KB chunk down to about 4KB. That matters because a player exploring a world is constantly loading and unloading chunks. Every KB saved per chunk is a KB you can spend on having more chunks loaded at once, which means a longer draw distance, which means the world feels bigger.
§ Bypassing the Heap Because the Heap is Broken
Speaking of that compression, here’s how they allocate memory for it:
indicesAndData = (unsigned char *)XPhysicalAlloc(32768 + 4096, MAXULONG_PTR, 4096, PAGE_READWRITE);
That’s XPhysicalAlloc, direct physical memory allocation on Xbox 360, bypassing the heap entirely. Why? The comment explains:
“Another compromise is being made on how the memory is allocated. This is all currently done with physical allocations to bypass the general heap manager, in particular to allow the freeing of memory to actually free whole memory pages cleanly rather than leaving them as managed by the heap manager.”
When you free() a small allocation through the standard heap, the heap manager keeps the page around for reuse. Over time, this fragments badly and you end up with 256MB of RAM full of holes. By allocating in raw 4KB physical pages, every free genuinely returns the memory. The tradeoff is waste: a chunk that compresses to 4.1KB needs an 8KB allocation. But they measured it:
“in testing absolutely no chunks were seen that got close to going over 8K compressed”
So the waste ceiling is known and acceptable. They chose predictable waste over unpredictable fragmentation. This is the kind of decision that matters after hour three of a play session, when a normal heap allocator would have fragmented your memory into tiny unusable gaps and the game starts hitching every time a new chunk loads.
§ Cloning Java, Down to the Standard Library
This is a C++ port of a Java game. But it’s not just the game logic that got ported. They had to clone chunks of Java’s standard library too, because the same world seed has to produce the same world on PS3 as on Java Edition. One wrong bit and the terrain diverges.
§ The Random Number Generator
They reproduce java.util.Random exactly. Same algorithm, same constants, same quirks:
void Random::setSeed(__int64 s)
{
this->seed = (s ^ 0x5DEECE66DLL) & ((1LL << 48) - 1);
haveNextNextGaussian = false;
}
int Random::next(int bits)
{
seed = (seed * 0x5DEECE66DLL + 0xBLL) & ((1LL << 48) - 1);
return (int)(seed >> (48 - bits));
}
0x5DEECE66D and 0xB are Java’s magic constants for its linear congruential generator. The 48-bit mask is Java’s internal state width. If even one bit differs, world generation diverges and your PS3 world doesn’t match your friend’s Java world.
The nextDouble() is interesting too:
double Random::nextDouble()
{
return (((__int64)next(26) << 27) + next(27)) / (double)(1LL << 53);
}
A double has 53 bits of mantissa precision, but next() can only give 31 bits at a time. So they call it twice (once for 26 bits, once for 27 bits), shift and add to get 53 bits, then divide by 2^53 to normalize to [0.0, 1.0). This exactly matches what Java does internally.
There’s also a power-of-2 fast path:
int Random::nextInt(int n)
{
if ((n & -n) == n) // i.e., n is a power of 2
return (int)(((__int64)next(31) * n) >> 31);
// ... rejection sampling for non-power-of-2
}
(n & -n) == n is a bit trick to detect powers of two. In two’s complement, -n flips all bits and adds 1, so n & -n isolates the lowest set bit. If that equals n, there’s only one set bit, so it’s a power of two. In that case, you can use multiplication and shifting instead of modulo, which avoids the bias that % n introduces for non-power-of-2 values.
§ Java’s Hash Functions
Same deal with hash maps. Java’s HashMap applies a “supplemental hash” to spread keys more evenly. If the C++ port uses a different hash, the same data produces different bucket layouts, and anything that depends on iteration order breaks. So:
// JavaIntHash.h
// This code implements the supplemental hashing that happens in java
// so we can match what their maps are doing with ints.
typedef struct
{
int operator() (const int &k) const
{
int h = k;
h += ~(h << 9);
h ^= (((unsigned int)h) >> 14);
h += (h << 4);
h ^= (((unsigned int)h) >> 10);
return h;
}
} IntKeyHash;
And for 64-bit keys (used for chunk coordinates encoded as longs):
int operator() (const __int64 &k) const
{
return hash((int)(k ^ (((__uint64)k) >> 32)));
}
XOR the upper and lower 32 bits together, then run it through the supplemental hash. This is directly from Java’s LongHashMap. They’re not just porting the game logic, they’re porting the standard library behavior.
§ Java’s Streams, File Classes, and Everything Else
It goes deeper than Random and HashMap. They reimplemented InputStream, OutputStream, DataInputStream, DataOutputStream, File, FileInputStream, FileOutputStream, ByteArrayInputStream, and FileFilter. All with comments like:
// 4J Stu - Represents Java standard library class
They also ported Java’s Comparable interface for tick scheduling:
// 4J Stu - In Java TickNextTickData implements Comparable<TickNextTickData>
And Java’s List.hashCode() for potion effects:
// 4J Stu - Based on implementation of Java List.hashCode()
// at http://docs.oracle.com/javase/6/docs/api/java/util/List.html#hashCode()
Even Java’s Collections.sort got replaced:
// 4J Stu Added so we can use std::sort instead of the java Collections::sort
The whole codebase is full of these little adaptations. Every time Java had some built-in behavior they relied on, 4J had to figure out what that behavior actually was and reproduce it in C++. Not just the documented behavior either. Things like how HashMap iteration order works, or how Java’s Random seeds itself. These are implementation details that aren’t guaranteed by the Java spec but that Minecraft’s code depends on anyway.
§ Explosions: 1,352 Rays Instead of Physics
Explosions in Minecraft don’t use a physics engine. They fire rays. Specifically, they iterate over the surface cells of a 16x16x16 grid (skipping the interior), creating ~1,352 rays pointing outward from the explosion center:
for (int xx = 0; xx < 16; xx++)
for (int yy = 0; yy < 16; yy++)
for (int zz = 0; zz < 16; zz++)
{
if ((xx != 0 && xx != 15) && (yy != 0 && yy != 15) && (zz != 0 && zz != 15))
continue; // skip interior
// Normalize direction to sphere surface
double xd = xx / 15.0f * 2 - 1;
double yd = yy / 15.0f * 2 - 1;
double zd = zz / 15.0f * 2 - 1;
double d = sqrt(xd * xd + yd * yd + zd * zd);
xd /= d; yd /= d; zd /= d;
float remainingPower = r * (0.7f + level->random->nextFloat() * 0.6f);
float stepSize = 0.3f;
while (remainingPower > 0)
{
int t = level->getTile(xt, yt, zt);
if (t > 0)
remainingPower -= (Tile::tiles[t]->getExplosionResistance(source) + 0.3f) * stepSize;
if (remainingPower > 0)
toBlow.insert(TilePos(xt, yt, zt));
xp += xd * stepSize;
yp += yd * stepSize;
zp += zd * stepSize;
remainingPower -= stepSize * 0.75f;
}
}
Each ray starts with some randomized power (0.7 + random * 0.6), then steps forward in 0.3-block increments. Each block it passes through subtracts its explosion resistance from the remaining power. When power hits zero, the ray stops. This is basically a voxelized pressure wave simulation. No spatial queries, no physics broadphase, just marching through a grid subtracting numbers. It’s why obsidian walls stop TNT and why dirt doesn’t.
There’s also a nice bug fix comment in the same file:
// 4J Stu - Fix for #46606 - TU5: Content: Gameplay: The player can be damaged
// and killed by explosions behind obsidian walls
Turns out the ray check only determined which blocks to destroy, but damage to entities was calculated separately using distance alone. So you could stand behind obsidian and still die. The fix checks if the player’s bounding box actually intersects with any block that would be destroyed.
§ Entity IDs: A Bitfield Allocator
For context: in multiplayer Minecraft, the server and all clients need to agree on which entity is which. When a zombie spawns on the server, it gets an ID, and that ID is sent to every connected client so they can all track the same zombie. If IDs get mixed up, you get invisible mobs, ghost entities, or hits that don’t register.
Every mob, arrow, and minecart needs one of these network IDs. Instead of a simple incrementing counter, they use a bitfield allocator with 2048 slots:
unsigned int Entity::entityIdUsedFlags[2048/32] = {0}; // 64 uint32s = 2048 bits
int Entity::getSmallId()
{
for (int i = 0; i < (2048 / 32); i++)
{
unsigned int uiFlags = *puiUsedFlags;
if (uiFlags != 0xffffffff) // any free bit in this word?
{
unsigned int uiMask = 0x80000000;
for (int j = 0; j < 32; j++)
{
if ((uiFlags & uiMask) == 0) // found free slot
{
uiFlags |= uiMask;
*puiUsedFlags = uiFlags;
return i * 32 + j;
}
uiMask >>= 1;
}
}
puiUsedFlags++;
}
}
Network-synced entities get IDs 0-2047 (fits in 11 bits, saving bandwidth). Local-only entities get IDs starting at 2048+. They maintain three parallel bitfield arrays (used, wander, and removing) to prevent reusing an ID before the client has been told the old entity was removed. Otherwise you get ghost entities or invisible mobs.
The failure case:
app.DebugPrintf("Out of small entity Ids... possible leak?\n");
__debugbreak();
§ How Caves Are Carved
Cave generation uses a “worm” algorithm. A point moves through space with a gradually changing direction, and at each step it carves out an ellipsoidal bubble:
double r = (Mth::sin(d / 16.0f * PI) * radius + 1) * ss + 1;
The sin(d/16 * PI) creates a bell curve. Caves widen in the middle and pinch at both ends, which is why cave entrances are narrow and open up inside. At each step along the worm, blocks within the ellipsoid are removed if they pass a randomized distance check:
if (xd * xd + yd * yd + zd * zd < random->nextDouble() * fuss + (1 - fuss))
The fuss factor adds roughness to the cave walls. Without it, caves would be smooth ellipsoids. With it, they get that jagged, natural look.
§ Water Knows What Color It Should Be
Water at biome boundaries doesn’t just pick one biome’s color. It averages all 9 neighboring biome colors:
for (int oz = -1; oz <= 1; oz++)
for (int ox = -1; ox <= 1; ox++)
{
int waterColor = level->getBiome(x + ox, z + oz)->getWaterColor();
totalRed += (waterColor & 0xff0000) >> 16;
totalGreen += (waterColor & 0xff00) >> 8;
totalBlue += (waterColor & 0xff);
}
return (((totalRed / 9) & 0xFF) << 16) |
(((totalGreen / 9) & 0xFF) << 8) |
(((totalBlue / 9) & 0xFF));
Simple average of R, G, B channels across a 3x3 grid. It’s why water transitions smoothly between swamps and oceans instead of having a hard edge.
And in the same file, there’s a disabled waterfall foam feature:
// 4J-PB - this loop won't run!
for (int i = 0; i < 0; i++)
{ // This was an attempt to add foam to
// the bottoms of waterfalls. It
// didn't went ok.
for (int i = 0; i < 0; i++). The most honest way to disable code. Not a comment, not an #ifdef. Just a loop that runs zero times with a note that says “didn’t went ok.” Respect.
§ Trees Grow Top-Down (For a Good Reason)
Tree generation places leaves before the trunk, and works from the highest Y coordinate downward:
// 4J Stu - Generate leaves from the top down to stop having to recalc heightmaps
for (int yy = y + treeHeight; yy >= y - grassHeight + treeHeight; yy--)
{
int yo = yy - (y + treeHeight);
int offs = extraWidth + 1 - yo / 2;
for (int xx = x - offs; xx <= x + offs; xx++)
for (int zz = z - offs; zz <= z + offs; zz++)
{
if (abs(xo) == offs && abs(zo) == offs && (random->nextInt(2) == 0 || yo == 0))
continue; // randomly skip corners for organic shape
if (!Tile::solid[level->getTile(xx, yy, zz)])
placeBlock(level, xx, yy, zz, Tile::leaves_Id, leafType);
}
}
Why top-down? Every time you place a block, the heightmap (the “tallest solid block at each X,Z”) might need updating. If you build a tree bottom-up, every single leaf placement potentially triggers a heightmap recalculation. Top-down means the first block placed at each column is already the highest one. The heightmap only updates once per column instead of potentially many times. Small thing, but across thousands of trees during world generation, it adds up.
The corner-skipping logic (abs(xo) == offs && abs(zo) == offs) gives tree canopies their rounded shape instead of being perfect squares.
§ Zombie Sunlight: Probability Scales With Brightness
Zombies catching fire in sunlight isn’t a binary check. It’s probabilistic:
if (br > 0.5f && random->nextFloat() * 30 < (br - 0.4f) * 2
&& level->canSeeSky(Mth::floor(x), Mth::floor(y), Mth::floor(z)))
{
setOnFire(8);
}
random * 30 < (brightness - 0.4) * 2 means at brightness 0.5, there’s about a 0.67% chance per tick. At full brightness (1.0), it’s about 4% per tick. Zombies near trees or under clouds last slightly longer. It’s a small detail but it’s what makes the behavior feel natural rather than robotic.
§ Charged Creepers
void Creeper::thunderHit(const LightningBolt *lightningBolt)
{
Monster::thunderHit(lightningBolt);
entityData->set(DATA_IS_POWERED, (byte)1);
}
And at detonation:
if (isPowered()) level->explode(shared_from_this(), x, y, z, 6, destroyBlocks);
else level->explode(shared_from_this(), x, y, z, 3, destroyBlocks);
One boolean flag. Blast radius goes from 3 to 6. Since the explosion algorithm fires rays with power proportional to radius, and the area affected grows quadratically, a charged creeper is roughly 4x more destructive. All from a single byte of entity data set by a lightning strike.
§ How Terrain Gets Its Shape
Terrain generation uses Ken Perlin’s improved noise function. The gradient computation is compact bit manipulation:
double ImprovedNoise::grad(int hash, double x, double y, double z)
{
int h = hash & 15;
double u = h < 8 ? x : y,
v = h < 4 ? y : h == 12 || h == 14 ? x : z;
return ((h & 1) == 0 ? u : -u) + ((h & 2) == 0 ? v : -v);
}
4 bits of hash select from 12 gradient directions. No arrays, no lookups, no branches beyond two ternaries. The h == 12 || h == 14 case wraps the gradient table back around so 16 hash values map to 12 directions evenly.
The smoothing polynomial that blends between gradients:
double u = x * x * x * (x * (x * 6 - 15) + 10); // 6x^5 - 15x^4 + 10x^3
This is the “improved” in Improved Perlin Noise. The original used 3x^2 - 2x^3, which has continuous first derivatives but discontinuous second derivatives, causing visible artifacts. This quintic has continuous first AND second derivatives, making terrain smoother. It’s computed in Horner form for efficiency.
Multi-octave noise stacks multiple frequencies:
double value = 0;
double pow = 1;
for (int i = 0; i < levels; i++)
{
value += noiseLevels[i]->getValue(x * pow, y * pow, z * pow) / pow;
pow /= 2;
}
Each octave doubles the frequency and halves the amplitude. Low octaves give broad hills, high octaves add fine detail. This is what gives Minecraft terrain its fractal-like quality where you see similar shapes at different scales.
The noise wraps at 2^24 (16.7 million blocks):
xb %= 16777216;
zb %= 16777216;
Past 16.7 million blocks from spawn, the terrain repeats. You’ll never notice.
§ AABB Pool: 1024 Objects, Zero Allocations
Collision detection creates temporary axis-aligned bounding boxes constantly. Allocating them on the heap would kill performance. So they use a circular buffer per thread:
AABB *AABB::newTemp(double x0, double y0, double z0, double x1, double y1, double z1)
{
ThreadStorage *tls = (ThreadStorage *)TlsGetValue(tlsIdx);
AABB *thisAABB = &tls->pool[tls->poolPointer];
thisAABB->set(x0, y0, z0, x1, y1, z1);
tls->poolPointer = (tls->poolPointer + 1) % ThreadStorage::POOL_SIZE;
return thisAABB;
}
1024 pre-allocated AABBs per thread. No new, no delete, no lock contention. The pointer wraps around and overwrites old ones. This only works because temp AABBs are short-lived, used within a single computation and discarded. If you hold a reference too long, it silently gets overwritten. Dangerous if you hold a reference too long, but it works for their use case.
§ XOR Tree for Fast Memory Comparison
When checking if a chunk has changed (to decide whether to re-compress it), they compare 64-byte blocks using manual SIMD:
__int64 d0 = pOld[0] ^ pNew[0];
__int64 d1 = pOld[1] ^ pNew[1];
__int64 d2 = pOld[2] ^ pNew[2];
__int64 d3 = pOld[3] ^ pNew[3];
__int64 d4 = pOld[4] ^ pNew[4];
__int64 d5 = pOld[5] ^ pNew[5];
__int64 d6 = pOld[6] ^ pNew[6];
__int64 d7 = pOld[7] ^ pNew[7];
d0 |= d1; d2 |= d3; d4 |= d5; d6 |= d7;
d0 |= d2; d4 |= d6;
if (d0 | d4) return false;
XOR finds differences, OR propagates them upward in a binary tree. Eight 64-bit compares, four ORs, two more ORs, one final check. 512 bits compared in one branch. memcmp would work, but this is branchless until the very end. No early exits means no branch mispredictions. Chunks are compared against their last-saved state every few ticks to decide if they need re-compressing. Over a busy session this runs millions of times, so every unnecessary branch miss adds up.
§ Sign Extension for Coordinate Recovery
When coordinates are packed into a cache entry and you need them back, you’ve lost the upper bits. If the coordinate was negative, those lost bits should all be 1s (sign extension). Their trick:
xx = (xx << 22) >> 22;
Left-shift pushes the sign bit to position 31. Arithmetic right-shift copies it back down through all the vacated positions. One line, no branches, works for both positive and negative coordinates. The light cache is hit on nearly every block lookup, so shaving even a single branch here matters across a whole frame.
§ Developer Comments Worth Reading
Scattered through the code are comments from the 4J developers that give you a window into what porting this game was actually like. A few favorites:
// 4J Stu - No idea why this is protected in Java
// 4J Stu - I don't know why this is protected in Java
They kept running into access modifiers in the Java code that didn’t make sense but couldn’t be changed without risking something breaking.
// 4J Stu - Changed this a little from Java to be less funn
// 4J Stu - Changed this a little from the Java so it's less funny
These are in Inventory.cpp. The original Java inventory code apparently did something “funny” with slot indexing that 4J straightened out. Two separate comments in the same file about making code “less funny.”
// 4J-HEG - Removed this to prevent weird tooltips (place steak on stairs!?)
At some point, stairs had a “use” interaction that caused tooltip text to suggest you could place steak on them.
// 4J - don't know what this special y collision is specifically for,
// but its definitely bad news
Sometimes you port code you don’t fully understand and just leave a note for whoever comes after you.
// 4J-JEV: I have no idea what was going on here
// (it gets changed in a later java version).
// 4J-PB - Not sure why this was disabled for creative mode, so commenting out
And the whole codebase is peppered with // 4J - TODO was synchronized, marking every place where the Java version used synchronized blocks that they had to handle differently in C++. There are dozens of these.
This code was never meant to be read by anyone outside 4J Studios. There’s no documentation site, no architecture overview, no public API. Just source files with honest comments like “I have no idea what was going on here” and “didn’t went ok.” The kind of things you write when you know only your teammates will ever see it.
And that’s exactly what makes it interesting. Most code you can study publicly is written with an audience in mind. Open source projects, textbooks, blog tutorials, they’re all performing a little. This isn’t. This is what real production code looks like when nobody’s watching. Pointers packed with metadata. Heap allocators bypassed because they’re broken. Three rotating delete queues because someone needed lock-free memory reclamation and had an afternoon to write it.
None of it is pretty. Most of it wouldn’t pass a modern code review. But it ran at 60fps on a console with 256MB of RAM, and millions of people played it without ever knowing any of this existed.
Subscribe for future posts