smallpt in Rust

With Rust alpha 2, Rust has finally reached a level of stability where I feel comfortable exploring it and not worrying about spending more time updating my code due to Rust changes than I will spend time actually coding and learning Rust.

For my first Rust experiment I ported Kevin Beason’s smallpt to Rust. The result can be found here. The code hasn’t been squished into 99 lines, but is instead slightly more readable than the original. The reason for picking smallpt as a first project was to get a feel for Rust’s math library and try some simple threading with dataparallel loops. And of course render a nice image in the most inefficient way possible! :)

Cornell Spheres

So far I’m quite happy with Rust: The functional inspired syntax is nice and simple. Pattern matching and struct destructuring is quite powerful and makes for some very safe code patterns in combination with Option and Result. The scoped thread I used for dataparallelism was quite simple to use and it would be easy to wrap the whole thing in a dataparallel for_each method, which might be my next little Rust project. However, the borrow checker, which is Rust best and most unique selling point, is also the hardest part about Rust. Even in my tiny project with only a tiny amount of parallelism, I had to fight the borrow checker and try to interpret compiler errors. I am sure that if I spend enough time with Rust, me and the borrow checker could become best of friends, but there is a steep learning curve to it. Hopefully it will be worth it and I can transfer the ownership idioms and patterns that I learn in Rust back to my other projects in other languages.

Morton Codes

I’ve recently had to use 3D and 5D Morton codes for spatial datastructure construction. Unfortunately I found that the material on the net actually describing how to encode, decode and interprete Morton codes is non-existent to shallow at best. The best sources I’ve found are wikipedia and an excellent blog post by Fabian “ryg” Giesen. This spurred me to create this blog post about my own experience with Morton codes and link to resources with more information.


Morton codes provides an ordering along a space-filling curve while preserving data locality. Lets break that down a bit shall we?

A space-filling curve is a curve whose range contains the entire N-dimensional hypercube enclosing your data domain mapped to discrete cells. When working with 3-dimensional data the hypercube is usually just an axis aligned bounding box. What this all means is that the data is enclosed by the hypercube and that hypercube is then subdivided along each dimension, creating an N-dimensional grid, lets call that a hypergrid. A space-filling curve that contains the entire hypercube is guaranteed to pass through each cell of the grid.

A space-filling curve preserves data locality if similar data gets mapped within the same range on the curve. This means that a cell in the hypergrid gets mapped to a range on the curve and therefore all the data inside that cell gets mapped to a position inside that range. A good space filling curve will also map neighbouring cells close each other.

Morton codes provides such a spatially local mapping. Below are three iterations of the Morton order in 2- and 3-dimensions, where each iteration adds a new subdivision to the cells of the previous iteration.

MortonCurve 2x2

MortonCurve 2×2

MortonCurve 4x4

MortonCurve 4×4

MortonCurve 8x8

MortonCurve 8×8

MortonCurve 2x2x2

MortonCurve 2x2x2

MortonCurve 4x4x4

MortonCurve 4x4x4

MortonCurve 8x8x8

MortonCurve 8x8x8

Encoding Morton Codes

Creating the Morton code for some data amounts to determining coordinates (x, y) of the data inside the hypergrid and interleaving those coordinates. The figure below shows the Morton codes and curve in the 2-dimensional case with a 8×8 grid. Interleaving the binary representation of a cell’s coordinates yields the Morton code for the respective cell and connecting the cells in order of increasing Morton code connects the entire Morton curve.

Binary Morton curve 8x8

Binary Morton curve creation

Implementing a function that can create Morton codes boils down to two problems: First seperate the least significant bits in c_x and c_y by N-1 and secondly combine them into a Morton code.

The implementation that separates the bits via a classical Divide-and-Conquer approach looks like this. In the comments after each line you can see the position of the significant bits.

unsigned int SeparateBy1(unsigned int x) {
    x &= 0x0000ffff;                  // x = ---- ---- ---- ---- fedc ba98 7654 3210
    x = (x ^ (x <<  8)) & 0x00ff00ff; // x = ---- ---- fedc ba98 ---- ---- 7654 3210
    x = (x ^ (x <<  4)) & 0x0f0f0f0f; // x = ---- fedc ---- ba98 ---- 7654 ---- 3210
    x = (x ^ (x <<  2)) & 0x33333333; // x = --fe --dc --ba --98 --76 --54 --32 --10
    x = (x ^ (x <<  1)) & 0x55555555; // x = -f-e -d-c -b-a -9-8 -7-6 -5-4 -3-2 -1-0
    return x;

Adapting the code to separate by two, three or four is pretty straightforward. Since I needed separation by 4 for a 5D ray representation I'll include that here.

unsigned int SeparateBy4(unsigned int x) {
    x &= 0x0000007f;                  // x = ---- ---- ---- ---- ---- ---- -654 3210
    x = (x ^ (x << 16)) & 0x0070000F; // x = ---- ---- -654 ---- ---- ---- ---- 3210
    x = (x ^ (x <<  8)) & 0x40300C03; // x = -6-- ---- --54 ---- ---- 32-- ---- --10
    x = (x ^ (x <<  4)) & 0x42108421; // x = -6-- --5- ---4 ---- 3--- -2-- --1- ---0
    return x;

Once we are able to separate the bits in the coordinate's binary representation, interleaving the result to obtain the Morton code is a simple matter of bit-shifting and or'ing.

MortonCode MortonCode2(unsigned int x, unsigned int y) {
    return SeparateBy1(x) | (SeparateBy1(y) << 1);

MortonCode MortonCode5(unsigned int x, unsigned int y, unsigned int z, unsigned int u, unsigned int v) {
    return SeparateBy4(x) | (SeparateBy4(y) << 1) | (SeparateBy4(z) << 2) | (SeparateBy4(u) << 3) | (SeparateBy4(v) << 4);
Decoding Morton Codes

Given a Morton code it can be useful to know which coordinates were used to generate it by deconstructing or decoding it. Doing this amounts to reversing the encoding, which is done by inverting the shifts and reversing the order of the masks, yielding the following functions.

unsigned int CompactBy1(unsigned int x) {
    x &= 0x55555555;                  // x = -f-e -d-c -b-a -9-8 -7-6 -5-4 -3-2 -1-0
    x = (x ^ (x >>  1)) & 0x33333333; // x = --fe --dc --ba --98 --76 --54 --32 --10
    x = (x ^ (x >>  2)) & 0x0f0f0f0f; // x = ---- fedc ---- ba98 ---- 7654 ---- 3210
    x = (x ^ (x >>  4)) & 0x00ff00ff; // x = ---- ---- fedc ba98 ---- ---- 7654 3210
    x = (x ^ (x >>  8)) & 0x0000ffff; // x = ---- ---- ---- ---- fedc ba98 7654 3210
    return x;

unsigned int CompactBy4(unsigned int x) {
    x &= 0x42108421;                  // x = -6-- --5- ---4 ---- 3--- -2-- --1- ---0
    x = (x ^ (x >>  4)) & 0x40300C03; // x = -6-- ---- --54 ---- ---- 32-- ---- --10
    x = (x ^ (x >>  8)) & 0x0070000F; // x = ---- ---- -654 ---- ---- ---- ---- 3210
    x = (x ^ (x >> 16)) & 0x0000007F; // x = ---- ---- ---- ---- ---- ---- -654 3210
    return x;

Using the CompactByN functions, we can construct a MortonDecodeN function similar to MortonCodeN seen above.

void MortonDecode2(MortonCode c, unsigned int x, unsigned int y) {
    x = CompactBy1(c);
    y = CompactBy1(c >> 1);

void MortonDecode5(MortonCode c, unsigned int x, unsigned int y, unsigned int z, unsigned int u, unsigned int v) {
    x = CompactBy4(c);
    y = CompactBy4(c >> 1);
    z = CompactBy4(c >> 2);
    u = CompactBy4(c >> 3);
    v = CompactBy4(c >> 4);
Constructing Acceleration Datastructures

Recently a lot of methods for creating acceleration datastructures for ray tracing have focused on using space-filling curves to speed up construction time, fx LBVH and HLBVH. The idea is simple: Map your multi-dimensional data onto the one-dimensional space-filling curve, sort the data according to its position on the curve and then create an acceleration structure on top of the cells. Due to the recursive nature of Morton codes, creating the acceleration structure is very simple.

An interesting observation about Morton codes and acceleration structures is that the binary representation of a Morton code, m, can be interpreted as a traversal of an implicit binary tree to reach m's cell. The traversal is performed by looking at each bit in turn, starting with the most significant bit. Each bit represents a splitting of the domain's data along the axis that the bit derived its value from. The value of the bit will then tell us which side of that splitting plane our traversal should proceed to, with 0 meaning we traverse the 'lower' child and 1 meaning the 'upper' child.

This is best clarified by example so again we will look at the figure with binary Morton codes.

Binary Morton curve 8x8

Binary Morton code representation

Using cell (2, 3) as our example cell we are working with the Morton code 001110. Starting from the tree's root node, which spans the entire domain, (0,0) -> (7,7), the first bit 001110 tells us that our bounds should be split along the spatial y-median, between y=3 and y=4, and that the cell we are looking for is found among the lower numbered cells. Splitting (0,0) -> (7,7) along y's spatial median we get the new bounds (0,0) -> (7,3).

Next the second bit 001110 tells us that we should split the bounds along the spatial x-median, between x=3 and x=4, and again visit the lower child. This yields the new bounds (0,0) -> (3,3)

Repeating this process for all bit in the morton code we get the bounds
(0,0) -> (3,3) => (0,2) -> (3,3) => (2,2) -> (3,3) => (2,3) -> (3,3) => (2,3) -> (2,3)
where the last bounds enclose our starting cell (2,3) perfectly.

Advanced Operations

Usually, knowing how to encode and decode Morton codes is enough to sort data and create acceleration structures around that data. However, recently I've had to recursively assign geometry to a sorted list of rays, starting with assigning the geometry seen by all the rays to the root node and then recursively refine that for each child. In order to do that I found it helpful to do several things.

  • Given two Morton codes, m0 and m1, determine the Morton code that encodes the minimum coordinates of both m0 and m1.
  • Given two Morton codes determine the encoding of their maximum coordinates.
Minimum and maximum

Finding the Morton code that encodes the minimum coordinates of two Morton codes is trivial.

MortonCode Minimum2(MortonCode lhs, MortonCode lhs) {
   unsigned int lhsX, lhsY, rhsX, rhsY;

   // Decode the left hand side coordinates.
   MortonDecode2(lhs, lhsX, lhsY);

   // Decode the right hand side coordinates.
   MortonDecode2(rhs, rhsX, rhsY);

   // Find the minimum of the coordinates and re-encode.
   return MortonCode2(Min(lhsX, rhsX), Min(lhsY, rhsY));

But we can do better! Decoding and re-encoding means doing a lot of bit-shifting instructions that we'd like to avoid. First, realising that we're not interested in the actual values of x and y, but rather the minimum of their encoded values is a tremendous help. Secondly, SeperateBy1 preserves ordering, so if X < Y then SeperateBy1(X) < SeperateBy1(Y), as long as X and Y are valid coordinates. Using this we can construct a Minimum function that avoids a lot of instructions.

MortonCode Minimum2(MortonCode lhs, MortonCode lhs) {
   // Isolate the encoded coordinates.
   unsigned int lhsX = lhs & 0x55555555; // lhsX = -f-e -d-c -b-a -9-8 -7-6 -5-4 -3-2 -1-0
   unsigned int rhsX = rhs & 0x55555555; // rhsX = -f-e -d-c -b-a -9-8 -7-6 -5-4 -3-2 -1-0
   unsigned int lhsY = lhs & 0xAAAAAAAA; // lhsY = f-e- d-c- b-a- 9-8- 7-6- 5-4- 3-2- 1-0-
   unsigned int rhsY = rhs & 0xAAAAAAAA; // rhsY = f-e- d-c- b-a- 9-8- 7-6- 5-4- 3-2- 1-0-

   // Find the minimum of the encoded coordinates and combine them into a single word.
   return Min(lhsX, rhsX) + Min(lhsY, rhsY);

Maximum can be constructed in a similar manner.