There is a very simple algorithm to determine which object in the scene a ray intersects first:
algorithm to determine closest object intersected by a ray
    for each object in the scene
        if the ray intersects with the object then
            if the object is closest to the ray source then
                the object is the closest intersected by the ray
            endif
        endif
    endfor
endalgorithm
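The exhaustive search above can be sketched in Python as follows. This is only an illustration under stated assumptions: the scene is taken to contain spheres only, the ray direction is assumed to be of unit length, and the names `Sphere`, `ray_sphere_distance` and `closest_hit` are hypothetical and do not come from the Raytrace System.

```python
import math
from dataclasses import dataclass

@dataclass
class Sphere:
    center: tuple
    radius: float

def ray_sphere_distance(origin, direction, sphere):
    """Distance along a unit-length ray to the nearest hit, or None for a miss."""
    oc = tuple(o - c for o, c in zip(origin, sphere.center))
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - sphere.radius ** 2
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-9:          # ignore hits behind (or at) the ray source
            return t
    return None

def closest_hit(origin, direction, objects):
    """Test the ray against every object and keep the nearest intersection."""
    best, best_t = None, math.inf
    for obj in objects:
        t = ray_sphere_distance(origin, direction, obj)
        if t is not None and t < best_t:
            best, best_t = obj, t
    return best, best_t
```

Every ray costs one intersection test per object, which is the cost the voxel scheme sets out to reduce.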
The new algorithm, which searches the scene voxel by voxel along the ray, is basically as follows:
algorithm to determine closest object intersected by a ray
    start in the voxel in which the ray originates
    while the ray intersects no objects and is still within the voxel grid DO
        for each object in the voxel
            if the ray intersects with the object then
                if intersection point is in the current voxel then
                    if the object is closest to the ray source then
                        object is the closest intersected by the ray
                    endif
                endif
            endif
        endfor
        move to the next voxel along the ray using 3DDDA
    endwhile
endalgorithm
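A minimal Python sketch of this loop is given below. It is not the Raytrace System code: the function name and all of its parameters are hypothetical, and the voxel sequence is simply passed in rather than generated by a 3DDDA. Its purpose is to show why the "intersection point is in the current voxel" test matters for objects that span several voxels.

```python
import math

def closest_hit_in_grid(origin, direction, voxels_along_ray,
                        objects_in_voxel, intersect, voxel_of):
    """Return (object, distance) for the first object hit, voxel by voxel.

    All parameters are hypothetical stand-ins:
      voxels_along_ray  voxel indices in the order the ray pierces them
      objects_in_voxel  dict mapping a voxel index to the objects recorded in it
      intersect         function(origin, direction, obj) -> distance or None
      voxel_of          function(point) -> index of the voxel containing point
    """
    for voxel in voxels_along_ray:
        best, best_t = None, math.inf
        for obj in objects_in_voxel.get(voxel, ()):
            t = intersect(origin, direction, obj)
            if t is None:
                continue
            point = tuple(o + t * d for o, d in zip(origin, direction))
            if voxel_of(point) != voxel:
                # The object is hit, but in a later voxel. Accepting it now
                # could hide a nearer object stored in a voxel in between.
                continue
            if t < best_t:
                best, best_t = obj, t
        if best is not None:
            return best, best_t   # nothing in later voxels can be closer
    return None, math.inf         # the ray left the grid without a hit
```

An empty voxel costs only the dictionary lookup, so a ray that crosses nothing but empty voxels performs no intersection tests at all.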
A ray that passes only through voxels that contain no objects will not be tested against any objects at all. In contrast, the same ray under the old algorithm would have been tested against every object in the scene. The only time penalties are the cost of moving from voxel to voxel and the cost of preprocessing the scene to record which objects have bits of surface in each voxel. There is also a space penalty, in that the information on which objects have surface in each voxel has to be stored.
Critical to the success of the above scheme is a fast way of moving from voxel to voxel. The 3DDDA first proposed by Fujimoto and Iwata [FUJI85] provides a perfect way of achieving this.
Fujimoto and Iwata used two different ways of dividing a scene up into voxels. One of these involved the use of octrees, and the other divided the scene up into voxels of constant size. The second approach was chosen for the present implementation because Fujimoto and Iwata's results indicated that it was superior to the first.
One aspect of implementing 3DDDA involves imposing a three dimensional grid system on the objects comprising the scene. The grid system must enclose all the objects in the scene and be composed of a number of adjacent cubes of space, all of which have to be the same size. These cubes are usually referred to as "voxels", and the grid system is constructed so that the edges of voxels are parallel to the x, y and z axes.
The Curtin Raytrace System implementation involved first calculating the maximum and minimum extent of all objects in the scene (including the view point and any light sources) then tailoring the unit voxel size so as to fit all the objects inside the voxel grid.
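The grid set-up described above might look like the following Python sketch. It assumes that every object's extent has already been reduced to a list of extreme points (with the view point and light sources included), that the grid has the same number of voxels along each axis, and that the largest extent determines the voxel size. The names `build_grid` and `voxel_index` are hypothetical.

```python
import math

def build_grid(points, resolution):
    """Return (grid_min, voxel_size) for a cubic grid enclosing all points.

    points      list of (x, y, z) extreme points of the scene
    resolution  number of voxels along each axis
    """
    lo = [min(p[i] for p in points) for i in range(3)]
    hi = [max(p[i] for p in points) for i in range(3)]
    # Voxels must be cubes of equal size, so the largest extent sets the size.
    voxel_size = max(h - l for l, h in zip(lo, hi)) / resolution
    return lo, voxel_size

def voxel_index(point, grid_min, voxel_size, resolution):
    """Index of the voxel containing point, clamped to the grid."""
    index = []
    for p, m in zip(point, grid_min):
        i = int(math.floor((p - m) / voxel_size))
        index.append(min(max(i, 0), resolution - 1))  # far face belongs to last voxel
    return tuple(index)
```

For example, a scene spanning (0, 0, 0) to (10, 4, 2) at a resolution of 10 gives unit voxels, and the point (9.5, 3.2, 0.1) falls in voxel (9, 3, 0).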
Spatially enumerating the scene means determining which objects have bits of surface in each voxel comprising the voxel grid. This information is determined once at the beginning of raytracing and then used millions of times during image generation.
Each class of object has a "voxelisation" function especially designed to determine which voxels contain bits of its surface. The concept of an object manager, as described in David Eddy's write-up [EDDY85], is used to provide a clean interface to each of these functions.
An ideal "voxelisation" function determines precisely those voxels containing bits of an object's surface. Only the "voxelate sphere" function, in the present implementation, is an ideal "voxelisation" function. The functions for all other objects perform "crude" voxelisation - the object is added to the list of objects in a voxel if there is any chance at all that the voxel contains some of its surface.
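The difference between crude and ideal voxelisation can be illustrated for a sphere with the Python sketch below. It is one possible approach, not necessarily the one used by the "voxelate sphere" function: the crude version takes every voxel overlapping the sphere's bounding box, while the exact version keeps a voxel only if the radius lies between the nearest and farthest distances from the sphere's centre to that voxel. The grid is assumed to have its minimum corner at the origin, and all function names are hypothetical.

```python
import math
from itertools import product

def crude_voxels(center, radius, voxel_size, resolution):
    """All voxels overlapping the sphere's axis-aligned bounding box."""
    ranges = []
    for c in center:
        lo = max(int(math.floor((c - radius) / voxel_size)), 0)
        hi = min(int(math.floor((c + radius) / voxel_size)), resolution - 1)
        ranges.append(range(lo, hi + 1))
    return set(product(*ranges))

def surface_in_voxel(center, radius, voxel, voxel_size):
    """True if some point of the voxel lies exactly on the sphere's surface."""
    near2 = far2 = 0.0
    for c, i in zip(center, voxel):
        lo, hi = i * voxel_size, (i + 1) * voxel_size
        near = max(lo - c, 0.0, c - hi)        # distance to the nearest point
        far = max(abs(c - lo), abs(c - hi))    # distance to the farthest corner
        near2 += near * near
        far2 += far * far
    return near2 <= radius * radius <= far2

def exact_voxels(center, radius, voxel_size, resolution):
    """Only those voxels that actually contain part of the surface."""
    return {v for v in crude_voxels(center, radius, voxel_size, resolution)
            if surface_in_voxel(center, radius, v, voxel_size)}
```

The exact version discards both the voxels buried inside the sphere and the bounding-box corners the surface never reaches, so fewer rays are tested against the sphere needlessly.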
A 3DDDA, as first proposed by Fujimoto and Iwata [FUJI85], was developed for efficient traversal from voxel to voxel along the path of a ray. Fujimoto and Iwata provided the original ideas required for this development; however, they did not provide enough details to make implementation of the algorithm trivial.
The 3DDDA is a three dimensional extension of the two dimensional DDA used to identify those pixels that should be illuminated to display a line on a raster display. Where the DDA identifies pixels closest to a two dimensional line, the 3DDDA identifies voxels pierced by a three dimensional line.
Traditional DDA algorithms have the concept of a "driving axis" and a "passive axis". The "driving axis" is incremented unconditionally while the "passive axis" is only incremented when an error term, usually measured at right angles to the "driving axis", falls outside some prescribed limit. The error term represents the distance, measured at right angles to the "driving axis", between the real line and the algorithm's estimate of the real line, which is the most recently illuminated pixel.
There are three main differences between traditional DDA algorithms and the 3DDDA used to identify those voxels pierced by a ray. The most obvious difference is the extra axis. Another difference is that the error terms, for each "passive axis", are measured parallel, rather than at right angles to the driving axis. The other is that the "driving axis" is no longer unconditionally incremented.
The extra axis is dealt with by retaining the concept of a driving axis, but having two "passive axes" rather than one. The error terms for the two "passive axes" are measured parallel to the driving axis so that they can be compared with one another. Measured in this way, the error term represents the distance, measured parallel to the driving axis, between the current position and the point where the line crosses into the next cell in the passive axis.
DDA algorithms traditionally have an inner loop as part of their structure. The "driving axis" is unconditionally incremented each time around this loop, while the "passive axis" may or may not be incremented. This scheme does not work when extended to three dimensions, with one "driving axis" and two "passive axes". The problem is that both passive axes may require incrementing within one iteration of the loop. Unconditionally incrementing the "driving axis" under this circumstance would result in incorrectly identifying the voxels pierced by a ray. An added advantage of not unconditionally incrementing the "driving axis" is that the algorithm works for the case where a "passive axis" increases faster than the driving one.
With the 3DDDA developed for the Raytrace System, the axis of greatest slope is designated the "driving axis" and the other two axes are designated "passive axes". An error term is maintained for each of the passive axes. This term represents the distance, measured in cell size units parallel to the driving axis, from the present position to the point where the ray passes into the next voxel in that axis. The present position is always a point on the ray which coincides with a boundary between voxels in the driving axis. By correctly maintaining the error terms, while moving step by step from voxel boundary to voxel boundary along the driving axis, the voxels pierced by a ray can be correctly identified.
The following pseudo-code shows the 3DDDA algorithm at a high level:
algorithm 3DDDA
    Find out which voxel the ray starts in.
    Calculate appropriate increments for each error term.
    Calculate initial values for each error term.
    while no intersection and still within voxel grid DO
        if either passive axis has a negative error term then
            if the first is more negative then
                Move to the next voxel along that axis.
                Increment the error term by appropriate factor.
            else the second is more negative
                Move to the next voxel along that axis.
                Increment the error term by appropriate factor.
            endif
        else
            Move to the next voxel along the driving axis.
            Decrement the error term for each passive axis by 1.
        endif
    endwhile
endalgorithm
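The pseudo-code leaves the increments and initial error terms unspecified. The Python sketch below is one possible reading, not the original implementation. It assumes that the "present position" is the next voxel boundary ahead on the driving axis, that the increment for a passive axis is the ratio of the driving slope to that axis's slope, and that the grid has its minimum corner at the origin. The function name and parameters are hypothetical, and the caller is expected to stop consuming voxels once an intersection is found.

```python
import math

def traverse_3ddda(origin, direction, resolution, voxel_size=1.0):
    """Yield the voxels pierced by a ray, in order, until it leaves the grid.

    Assumes the grid's minimum corner is at (0, 0, 0), that it has
    `resolution` voxels along each axis, and that the origin lies inside it.
    """
    voxel = [int(math.floor(o / voxel_size)) for o in origin]
    step = [1 if d > 0 else -1 for d in direction]
    drive = max(range(3), key=lambda a: abs(direction[a]))  # greatest slope
    passive = [a for a in range(3) if a != drive]

    def cells_to_boundary(a):
        # Distance along axis a, in cell units, from the origin to the next
        # voxel boundary in the direction of travel.
        frac = origin[a] / voxel_size - voxel[a]
        return 1.0 - frac if step[a] > 0 else frac

    # inc[a]: driving-axis cells travelled between successive crossings of axis a.
    # err[a]: driving-axis cells from the next driving boundary to the next
    #         crossing of axis a (negative means axis a is crossed first).
    inc, err = {}, {}
    for a in passive:
        if direction[a] == 0:
            inc[a] = err[a] = math.inf  # never crosses a boundary on this axis
        else:
            inc[a] = abs(direction[drive] / direction[a])
            err[a] = cells_to_boundary(a) * inc[a] - cells_to_boundary(drive)

    p, q = passive
    while all(0 <= v < resolution for v in voxel):
        yield tuple(voxel)
        if err[p] < 0 or err[q] < 0:
            a = p if err[p] <= err[q] else q   # the more negative error term
            voxel[a] += step[a]
            err[a] += inc[a]
        else:
            voxel[drive] += step[drive]
            err[p] -= 1
            err[q] -= 1
```

When a ray passes exactly through a voxel edge or corner, this sketch steps along the driving axis first, so it visits one of the voxels touching that edge before moving on.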
The following figures show the results of timing experiments conducted to compare the speed of the raytracing system before and after the incorporation of 3DDDA. Also shown is the effect on speed of varying the number of voxels a scene is divided into. Each image was rendered a number of times, firstly without 3DDDA (labeled control), and then with 3DDDA, with the scene divided into various numbers of voxels (labeled 10x10x10 etc).
All the pictures were rendered at a resolution of 400x400 without any antialiasing. Times shown are "user time" on a VAX 11/785 minicomputer with floating point co-processor. "User time" is the amount of processor time spent executing the program itself. This differs from "elapsed time", which is the total time between the start and end of a program, and from "CPU time", which is the total amount of time actually spent by the CPU on a program, including time spent by the operating system on its behalf.
The "number of rays cast" is the total number of rays cast during image generation. Note that for a given image, this figure is constant, regardless of the number of voxels that the scene is divided up into. The constancy of this figure reflects the fact that the 3DDDA enhancements to the raytracing system improve the speed of image generation without affecting the quality of the final image.
The number of "intersection tests per ray cast" is the average number of "ray object intersection tests" per ray cast. Note that for the control renderings, this figure is more than the total number of objects in the scene. For the 3DDDA renderings, this figure is always less than the number of objects in the scene, and as the resolution of the voxel grid is increased, this figure goes down. The figure is included because it is in reducing the number of "intersection tests per ray cast" that the 3DDDA algorithm achieves its improvements in performance.
The most spectacular improvements in performance were obtained for "ddahat". This image took 35hr 29mins to generate using the old algorithm and only 43 minutes using 3DDDA with an 80x80x80 voxel grid. This represents a 50 fold increase in performance. In achieving these improvements, the number of intersection tests per ray cast was decreased from more than 1000, for the old algorithm, to less than 6, for 3DDDA with 80x80x80 voxel grid. Also impressive is the estimated 16 fold improvement for "ddatree". This improvement was achieved in spite of the fact that the voxelisation tests for cylinders and cones are relatively crude.
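The quoted improvement factors can be checked directly from the timing data. The short Python sketch below does the arithmetic; the helper names are hypothetical, and the "ddatree" control time is the 24 hour estimate, so its factor is only a lower bound.

```python
def minutes(hours, mins):
    """Convert a time given as hours and minutes into minutes."""
    return hours * 60 + mins

def speedup(before_min, after_min):
    """Ratio of rendering time without 3DDDA to rendering time with it."""
    return before_min / after_min

# "ddahat": 35hr 29mins without 3DDDA, 43mins with an 80x80x80 grid.
ddahat = speedup(minutes(35, 29), minutes(0, 43))    # roughly 49.5

# "ddatree": 24hr+ (estimated) without 3DDDA, 1hr 33mins with a 40x40x40 grid.
ddatree = speedup(minutes(24, 0), minutes(1, 33))    # roughly 15.5 or more
```

Both results agree with the rounded figures of 50 fold and 16 fold given above.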
The timing data for "dda8balls" and "dlamp" are interesting because computation time is actually increased by 3DDDA. This can be explained by the small number of objects in these two scenes. With so few objects, the overheads associated with 3DDDA cost more than the savings made by reducing the number of ray object intersection tests. One way of taking advantage of these results might be to set up the raytracing system so that 3DDDA is only performed if a scene has more than a certain number of objects.
Each image was rendered a number of times, dividing the scene up into different numbers of voxels, to determine the effect of the number of voxels on rendering time. With the exception of the two images with a very small number of objects, all images showed decreased rendering time as the number of voxels was increased. The relationship between the two was not linear (or even logarithmic), however: the improvements between 10x10x10 and 20x20x20 voxels were more marked than those between 40x40x40 and 80x80x80 voxels. The relationship between image generation time and memory required for 3DDDA is a classic case of a space/time tradeoff. Increasing the number of voxels requires more memory but results in faster image generation. Ultimately the number of voxels into which a scene should be divided will depend on available memory.
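The diminishing returns can be made concrete by setting the number of voxels against the rendering times. The Python sketch below uses the total times reported for "ddahat", converted to minutes; the helper names are hypothetical, and no figure for memory per voxel is assumed since none is given.

```python
# Total times in minutes for "ddahat", taken from its timing table.
ddahat_minutes = {10: 180, 20: 79, 30: 72, 40: 44, 80: 43}

def voxel_count(n):
    """Number of voxels in an n x n x n grid."""
    return n ** 3

def minutes_saved(n_from, n_to, times=ddahat_minutes):
    """Reduction in rendering time when moving from one resolution to another."""
    return times[n_from] - times[n_to]

for n in sorted(ddahat_minutes):
    print(f"{n}x{n}x{n}: {voxel_count(n):>7} voxels, {ddahat_minutes[n]:>3} mins")
```

Going from 10x10x10 to 20x20x20 multiplies the voxel count by 8 and saves 101 minutes, whereas going from 40x40x40 to 80x80x80 also multiplies it by 8 but saves only 1 minute.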
Timing data for "dda8balls". This scene consisted of 8 spheres and a single light source. Note that although the number of intersection tests per ray cast was decreased by 3DDDA, total computation time was increased. Because there are so few objects in the scene, the overheads of voxel traversal exceed the savings made in intersection tests.
Timing data for "dlamp" which consists of 11 objects of various types, including two light sources. 3DDDA again caused an increase in computation time in spite of reducing the number of intersection tests per ray cast. This is due to the small number of objects in the scene.
Fig. 6.5 : dda8balls (Tree depth=1)
Resolution of   Number of   Intersect Tests   Total
Voxel Grid      Rays Cast   Per Ray Cast      Time
Control         349 742     9.10              11mins
10x10x10        349 742     1.27              14mins
20x20x20        -           -                 -
30x30x30        349 742     1.25              14mins
40x40x40        349 742     1.15              15mins
80x80x80        349 742     1.08              20mins
Timing data for "dda125balls" which consists of 125 spheres and one light source. Total time was reduced by a factor of 16 for the 40x40x40 and 80x80x80 resolution 3DDDA.
Fig. 6.6 : dlamp (Tree depth=1)
Resolution of   Number of   Intersect Tests   Total
Voxel Grid      Rays Cast   Per Ray Cast      Time
Control         -           -                 -
10x10x10        229 096     4.79              17mins
20x20x20        -           -                 -
30x30x30        229 096     3.56              17mins
40x40x40        229 096     4.54              19mins
80x80x80        229 096     3.34              22mins
Timing data for "ddatree" which consists of 384 objects, which are a mixture of cylinders and cones, and one light source. Rays are traced to a depth of two. The improvements in computation time due to 3DDDA are impressive in spite of the fact that the voxelisation tests for cylinders and cones are relatively crude in the present implementation. The algorithm used to generate the geometry of this picture was adapted from Aono and Kunii [AONI84].
Fig. 6.7 : dda125balls (Tree depth=1)
Resolution of   Number of   Intersect Tests   Total
Voxel Grid      Rays Cast   Per Ray Cast      Time
Control         240 928     147.93            6hr 42mins
10x10x10        240 928     22.10             43mins
20x20x20        240 928     10.92             28mins
30x30x30        240 928     10.60             29mins
40x40x40        240 928     6.92              25mins
80x80x80        240 928     4.08              25mins
Timing data for "ddahat" which consists of 962 spheres and two light sources. Total time is reduced by a factor of 50 for the 80x80x80 resolution 3DDDA.
Fig. 6.8 : ddatree (Tree depth=2)
Resolution of   Number of   Intersect Tests   Total
Voxel Grid      Rays Cast   Per Ray Cast      Time
Control         188 180     384+ (est)        24hr+ (est)
10x10x10        188 180     107.08            8hr 30mins
20x20x20        -           -                 -
30x30x30        188 180     107.08            2hr 19mins
40x40x40        188 180     67.10             1hr 33mins
80x80x80        -           -                 -
Timing data for "dda1000balls" which consists of 1000 spheres and one light source. According to the estimates, total time is reduced by a factor of at least 45 for the 80x80x80 3DDDA.
Fig. 6.9 : ddahat (Tree depth=1)
Resolution of   Number of   Intersect Tests   Total
Voxel Grid      Rays Cast   Per Ray Cast      Time
Control         249 162     1156.36           35hr 29mins
10x10x10        249 162     102.27            3hr 0mins
20x20x20        249 162     28.14             1hr 19mins
30x30x30        249 162     14.94             1hr 12mins
40x40x40        249 162     11.27             44mins
80x80x80        249 162     5.69              43mins
The figure below illustrates the relationship between number of objects and image rendering time for raytracing with and without 3DDDA. Because of the small sample size, it is impossible to determine the exact relationships; however, it is clear from the table that the number of objects has much less influence on rendering time for raytracing with 3DDDA than for raytracing without. Compare, for example, the difference in rendering time between "dda125balls" and "ddahat" for raytracing with and without 3DDDA.
Fig. 6.10 : dda1000balls (Tree depth=1)
Resolution of   Number of   Intersect Tests   Total
Voxel Grid      Rays Cast   Per Ray Cast      Time
Control         295 006     1001+ (est)       40hr+ (est)
10x10x10        295 006     72.72             2hr 48mins
20x20x20        295 006     20.49             1hr 13mins
30x30x30        295 006     11.66             58mins
40x40x40        295 006     9.11              54mins
80x80x80        295 006     6.20              53mins
Fig. 6.11 : Timing Summary.
Name of         Number of   Total Time      Total Time
Picture         Objects     Without 3DDDA   With 3DDDA
dda8balls       9           11mins          20mins
dlamp           11          -               17mins
dda125balls     126         6hr 42mins      25mins
ddatree         384         24hr+ (est)     1hr 33mins
ddahat          962         35hr 29mins     43mins
dda1000balls    1002        40hr+ (est)     53mins