Divide and Conquer Ray Tracing

Recent research has sparked a lot of interest in Divide and Conquer Ray Tracing (DACRT), in which the usual acceleration structure is abandoned in favor of constructing a partial structure for each ray bounce. The first papers on the subject continued to use standard axis-aligned splitting planes and based their implementations on kd-trees and BVHs. Realising that an iteration can consist of millions of rays and only hundreds of thousands of triangles, it makes sense to look at subdivisions that increase the coherence of rays instead of geometry. During this spare-time project I have successfully combined the flexibility of DACRT with 5D ray hierarchies.

The general DACRT algorithm

An introduction to the general DACRT algorithm can be found in Benjamin Mora's paper, see References below. For convenience I have also briefly summarized it here.

DACRT(Bound b, Ray[] rs, Primitive[] ps) {
    if (Termination(rs, ps))
        ExhaustiveRayTracing(rs, ps);
    else {
        Bound[] bs = Subdivide(b);
        foreach (Bound b' in bs) {
            Ray[] rs' = Intersecting(b', rs);
            Primitive[] ps' = Intersecting(b', ps);
            DACRT(b', rs', ps');
        }
    }
}

As with any divide and conquer scheme, DACRT works by recursively subdividing the ray tracing problem into smaller problems of a more manageable size. DACRT takes as input a set of rays, a set of primitives, and a volume that bounds the current subset of the scene/problem.

If the problem has been reduced to an appropriate size, then recursion is terminated and exhaustive ray tracing is performed.

Otherwise the bound, b, is further subdivided into smaller spaces, the rays and primitives are classified according to the new bounds, b', and a recursive call to DACRT is made.
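The classification step can be sketched in C++. The types below are my own minimal illustration, not code from any of the papers: a slab test decides whether a ray pierces a child bound, and intersecting rays are pivoted to the front of the array in place (a common DACRT optimization that avoids allocating per-node ray lists).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration.
struct Ray  { float ox, oy, oz, dx, dy, dz; };
struct AABB { float min[3], max[3]; };

// Slab test: does the ray hit the axis-aligned bound?
// Relies on IEEE infinities when a direction component is zero.
bool Intersects(const AABB& b, const Ray& r) {
    float t0 = 0.0f, t1 = 1e30f;
    const float o[3] = { r.ox, r.oy, r.oz };
    const float d[3] = { r.dx, r.dy, r.dz };
    for (int a = 0; a < 3; ++a) {
        float inv = 1.0f / d[a];
        float tn = (b.min[a] - o[a]) * inv;
        float tf = (b.max[a] - o[a]) * inv;
        if (tn > tf) std::swap(tn, tf);
        t0 = std::max(t0, tn);
        t1 = std::min(t1, tf);
    }
    return t0 <= t1;
}

// In-place classification: move intersecting rays in [begin, end) to the
// front and return the pivot, so the recursive call gets a subrange
// instead of a freshly allocated list.
size_t Classify(const AABB& b, std::vector<Ray>& rays,
                size_t begin, size_t end) {
    size_t pivot = begin;
    for (size_t i = begin; i < end; ++i)
        if (Intersects(b, rays[i]))
            std::swap(rays[i], rays[pivot++]);
    return pivot;
}
```

The same pivoting scheme works for the primitive array, so a whole DACRT recursion can run over two arrays and a stack of index ranges.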

Bounding Volumes

Any space-partitioning bounding volume can be used in DACRT, but so far only axis-aligned bounding volumes have been used, as they are the most common when constructing spatial scene hierarchies.

In Arvo and Kirk a 5D hypercube scheme was used for the bounding volumes. A ray was described by an origin, <x, y, z>, a major axis, a, and u and v, which express the direction of the ray along the major axis linearly, from -1 to 1. A bounding volume over a set of rays was then given by the hypercube found by taking the minimum and maximum of x, y, z, u and v over the rays sharing a major axis. To determine intersection between the hypercube and geometry, Arvo and Kirk converted the hypercube to a cone. While they also experimented with classification using 7 planes and linear programming, they found that cones were the fastest.
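Mapping a ray into this 5D space is cheap. The sketch below is my own illustration of the Arvo and Kirk parameterization, using six major axes (±X, ±Y, ±Z) and the ratios of the two minor direction components to the dominant one, which land in [-1, 1] by construction:

```cpp
#include <cmath>

// Hypothetical 5D point: ray origin plus a direction encoded as a
// dominant axis index and two ratios u, v in [-1, 1].
struct Ray5D { float x, y, z; int axis; float u, v; };

Ray5D ToRay5D(float x, float y, float z, float dx, float dy, float dz) {
    float ax = std::fabs(dx), ay = std::fabs(dy), az = std::fabs(dz);
    Ray5D r{ x, y, z, 0, 0.0f, 0.0f };
    if (ax >= ay && ax >= az) {        // dominant X: axis 0 (+X) or 1 (-X)
        r.axis = dx >= 0.0f ? 0 : 1;
        r.u = dy / ax; r.v = dz / ax;
    } else if (ay >= az) {             // dominant Y: axis 2 (+Y) or 3 (-Y)
        r.axis = dy >= 0.0f ? 2 : 3;
        r.u = dx / ay; r.v = dz / ay;
    } else {                           // dominant Z: axis 4 (+Z) or 5 (-Z)
        r.axis = dz >= 0.0f ? 4 : 5;
        r.u = dx / az; r.v = dy / az;
    }
    return r;
}
```

Rays are first bucketed by the axis index; the 5D bounding hypercube for a bucket is then just the componentwise min and max of <x, y, z, u, v>.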

As part of the project I have been experimenting with bounding volumes defined by splitting planes instead of a cone. However, as opposed to the original 5D ray hierarchy article, I did not use all 7 splitting planes given by the 5D hypercube, but instead approximated it using only the 4 planes spanned by u and v and the plane truncating the cone, effectively creating an open-ended frustum. The result was a faster ray tracer, but more importantly the frustum is a lot tighter than the original cone, meaning more geometry gets culled each frame. My hope is that the fewer random-access memory lookups will yield quite a bit of extra performance when implementing the idea on GPUs. A downside is of course that 5 planes require more memory than a cone. Storing the planes compressed would alleviate this issue: the a, b and c components are all between -1 and 1 and could be stored using half floats, with a full float for the distance component d to be able to span the entire scene. Other, more complex, normal encoding schemes are listed by Aras here.
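To make the memory saving concrete, here is a sketch of such a compressed plane. Since portable half floats need either C++23 or bit twiddling, I substitute 16-bit fixed-point quantization for the normal components, which exploits the same [-1, 1] range and gives the same layout and footprint; the struct and constants are mine, not from the post's implementation:

```cpp
#include <cmath>
#include <cstdint>

// Compressed plane: normal components in [-1, 1] quantized to 16 bits,
// full-precision distance d so the plane can span the whole scene.
// (Fixed-point quantization stands in for the half floats suggested
// above; the storage cost is identical.)
struct PackedPlane {
    int16_t a, b, c;   // quantized normal components
    float d;           // plane distance, full float range
};

int16_t Quantize(float v) {            // v assumed in [-1, 1]
    return static_cast<int16_t>(std::lround(v * 32767.0f));
}

float Dequantize(int16_t q) {
    return q / 32767.0f;
}

PackedPlane Pack(float a, float b, float c, float d) {
    return { Quantize(a), Quantize(b), Quantize(c), d };
}
```

With padding this is typically 12 bytes per plane versus 16 for four full floats; the quantization error in the normal is below 1/32767, which is ample for conservative culling planes.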

Termination Criteria

When performing DACRT we are partitioning both the scene's geometry and the rays traversing that scene, so using the SAH as a termination criterion is no longer sufficient.
Instead, since we know the number of rays traversing the geometric primitives at a given recursive step, the cost of partitioning a DACRT node n can be expressed as

C_{dacrt}(n \rightarrow \{l, r\}) = C_{ray} \|R_n\| + C_{geom} \|G_n\| + C_{dacrt}(l) + C_{dacrt}(r)

where C_{dacrt} is the cost of performing a DACRT iteration on the node n. \|R_n\| is the number of rays in n and thus C_{ray} \|R_n\| is the cost of traversing all rays and calculating their splitting info (assuming that this is done linearly). Similarly C_{geom} \|G_n\| is the cost of partitioning all geometry. C_{dacrt}(l) and C_{dacrt}(r) are then the costs of applying DACRT to the two newly created nodes.

When recursion terminates and a leaf node is found, the cost of intersection between rays and geometry is given by

C_{leaf}(n) = C_i \|R_n\| \|G_n\|

and the optimal termination criterion is

C_{leaf}(n) < C_{dacrt}(n)

i.e. when the cost of performing intersection is lower than the cost of partitioning.

Calculating a globally optimal solution is of course infeasible, so instead we compute a local greedy approximation, where it is assumed that all created nodes are leaf nodes.

\begin{array}{rl}
& C_{dacrt}(n \rightarrow \{l, r\}) = C_{ray} \|R_n\| + C_{geom} \|G_n\| + C_{leaf}(l) + C_{leaf}(r)
\\
\Updownarrow
\\
& C_{dacrt}(n \rightarrow \{l, r\}) = C_{ray} \|R_n\| + C_{geom} \|G_n\| + C_i \|R_l\| \|G_l\| + C_i \|R_r\| \|G_r\|
\end{array}

In order to avoid the cost of attempting all splits over n, it is assumed that at each step the node is split in half, meaning that \|R_l\| = \|R_r\| = \|R_n\| / 2 and likewise for geometry, yielding the constant time expression

C_{dacrt}(n \rightarrow \{l, r\}) = C_{ray} \|R_n\| + C_{geom} \|G_n\| + C'_i \|R_n\| \|G_n\|

where C'_i = C_i / 2.

Combining C_{dacrt} and C_{leaf} and rearranging the constants, the final termination criterion becomes

\begin{array}{rl}
& \|R_n\| \|G_n\| < \frac{C_{ray}}{C_i - C'_i} \|R_n\| + \frac{C_{geom}}{C_i - C'_i} \|G_n\| \\ \Updownarrow \\ & \|R_n\| \|G_n\| < C_r \|R_n\| + C_g \|G_n\| \end{array}

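In code the criterion is a couple of multiply-adds per node. The sketch below is mine, and the constants C_r and C_g are illustrative placeholders; in practice they fold the measured traversal and partitioning costs into two tuned numbers:

```cpp
#include <cstddef>

// Greedy termination test: make a leaf when exhaustive intersection
// (rays x geometry) is cheaper than one more partitioning pass.
// Cr and Cg are assumed, untuned constants, not measured costs.
bool ShouldMakeLeaf(std::size_t rayCount, std::size_t geomCount) {
    const double Cr = 16.0;  // stands in for C_ray / (C_i - C'_i)
    const double Cg = 64.0;  // stands in for C_geom / (C_i - C'_i)
    double leafCost  = static_cast<double>(rayCount) * geomCount;
    double splitCost = Cr * rayCount + Cg * geomCount;
    return leafCost < splitCost;
}
```

Note the asymmetry the criterion captures: a node with millions of rays but only a handful of triangles terminates early, since exhaustive testing is then cheaper than another round of classification.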
Source Code

The project's source can be found on GitHub.

References

Naive-Ray-Tracing: A divide-and-conquer approach by Benjamin Mora. [2011]
Fast Ray Tracing by Ray Classification by Arvo and Kirk. [1987]