CG351-551 Raytracing: Algorithm for 3DDDA.

6.13. Algorithm for 3DDDA

There is a very simple algorithm to determine "which" objects in the scene a ray first intersects:

    algorithm to determine closest object intersected by a ray

      for each object in the scene

        if the ray intersects with the object then

           if the object is closest to the ray source then

             the object is the closest intersected by the ray

           endif

        endif

      endfor

    endalgorithm

A consequence of this algorithm is that each ray traced has to be tested against each object in the scene. Calculation time increases linearly (in fact even faster because of shadow testing) in proportion to the number of objects in the scene. To implement 3DDDA, the algorithm that determines "which object in the scene a ray intersects first" was changed to drastically reduce the number of ray intersection tests.

Basically the new algorithm is as follows:

    algorithm to determine closest object intersected by a ray

      start in the voxel in which the ray originates

      while the ray intersects no objects DO

        for each object in the voxel

          if the ray intersects with the object then

             if intersection point is in the current voxel then

               if the object is closest to the ray source then

                  object is the closest intersected by the ray

               endif

             endif

          endif

        endfor

        move to the next voxel along the ray using 3DDDA

      endwhile

    endalgorithm

With this algorithm, ray object intersection calculations are only performed where there is a good chance that the ray intersects with the object. What is more, once an object is found that intersects with the current ray, it is only necessary to perform intersection calculations for the remaining objects in the current voxel. Objects in more distant voxels, although they may intersect with the ray, could not possibly be closer to the origin of the ray than the object already intersected.

A ray that passes only through voxels that contain no objects will not be tested against any objects at all. In contrast the same ray with the old algorithm would have been tested against every object in the scene. The only time penalty is the cost of moving from voxel to voxel and the cost of preprocessing the objects in the scene to record which objects have bits of surface in each voxel. There is also a space penalty in that information on which objects have surface in each voxel has to be stored.

Critical to the success of the above scheme is a fast way of moving from voxel to voxel. The 3DDDA first proposed by Fujimoto and Iwata [FUJI85] provides a perfect way of achieving this.

Fujimoto and Iwata used two different ways of dividing a scene up into voxels. One of these involved the use of octrees, and the other divided the scene up into voxels of constant size. The second approach was chosen for the present implementation because Fujimoto and Iwata's results indicated that it was superior to the first.

6.13.1. Defining the voxel grid

One aspect of implementing 3DDDA involves imposing a three dimensional grid system on the objects comprising the scene. The grid system must enclose all the objects in the scene and be composed of a number of adjacent cubes of space all of which have to be the same size. These cubes are usually referred to as "voxels" and the grid system is constructed so that the edges of voxels are parallel to the x, y and z axis.

The Curtin Raytrace System implementation involved first calculating the maximum and minimum extent of all objects in the scene (including the view point and any light sources) then tailoring the unit voxel size so as to fit all the objects inside the voxel grid.

6.13.2. Spatially enumerating the scene

Spatially enumerating the scene means determining which objects have bits of surface in each voxel comprising the voxel grid. This information is determined once at the beginning of raytracing and then used millions of times during image generation.

Each class of object has a "voxelisation" function especially designed to determine which voxels contain bits of its surface. The concept of an object manager, as described in David Eddy's write-up [EDDY85], is used to provide a clean interface to each of these functions.

An ideal "voxelisation" function determines precisely those voxels containing bits of an objects surface. Only the "voxelate sphere" function, in the present implementation, is an ideal "voxelisation function". The functions for all other objects perform "crude" voxelisation - the object is added to the list of objects in a voxel if there is "any" chance at all that the voxel contains some of its surface.

6.13.3. Traversing from voxel to voxel

A 3DDDA, as first proposed by Fujimoto and Iwata [FUJI85], was developed for efficient traversal from voxel to voxel along the path of a ray. Fujimota and Iwata provided the original ideas required for this development, however, they did not provide enough details to make implementation of the algorithm trivial.

The 3DDDA is a three dimensional extension of the two dimensional DDA used to identify those pixels that should be illuminated to display a line on a raster display. Where the DDA identifies pixels closest to a two dimensional line, the 3DDDA identifies voxels pierced by a three dimensional line.

Traditional DDA algorithms, have the concept of a "driving axis" and a "passive axis". The "driving axis" is incremented unconditionally while the "passive axis" is only incremented when an error term, usually measured at right angles to the "driving axis", falls outside some prescribed limit. The error term represents the distance, measured at right angles to the "driving axis", between the real line and the algorithm's estimate of the real line, which is the most recently illuminated pixel.

There are three main differences between traditional DDA algorithms and the 3DDDA used to identify those voxels pierced by a ray. The most obvious difference is the extra axis. Another difference is that the error terms, for each "passive axis", are measured parallel, rather than at right angles to the driving axis. The other is that the "driving axis" is no longer unconditionally incremented.

The extra axis is dealt with by retaining the concept of driving axis, but having two "passive axis" rather than one. The error terms, for the two "passive axis", are measured parallel to the driving axis so that they can be compared with one another. Measured in this way, the error term represents the distance, measured parallel to the driving axis, between the current position, and the point where the line crosses into the next cell in the passive axis.

DDA algorithms traditionally have as part of there structure an inner loop. The "driving axis" is unconditionally incremented each time around this loop, while the "passive axis" may or may not be incremented. This scheme does not work when extended to three dimensions, with one "driving axis" and two "passive axis". The problem is that both passive axis may require incrementing within one iteration of the loop. Unconditionally incrementing the "driving axis", under this circumstance, would result in incorrectly identifying the voxels pierced by a ray. An added advantage of not unconditionally incrementing the "driving axis" is that the algorithm works for the case where the "passive axis" increases faster than the active one.

With the 3DDDA, developed for the Raytrace System, the axis of greatest slope is designated the "driving axis" and the other two axis are designated "passive axis". An error term is maintained for each of the passive axis. This term represents the distance, measured in cell size units parallel to the driving axis, from the present position to the point where the ray passes into the next voxel in that axis. The present position is always a point on the ray which coincides with a boundary between voxels in the driving axis. By correctly maintaining the error terms, while moving step by step from voxel boundary to voxel boundary along the driving axis, the voxels pierced by a ray can be correctly identified.

The following pseudo-code shows the 3DDDA algorithm at a high level:

    algorithm 3DDDA

      Find out which voxel the ray starts in.

      Calculate appropriate increments for each error term.

      Calculate initial values for each error term.

      while no intersection and still within voxel grid DO

        if either passive axis has a negative error term then

          if the first is more negative then

            Move to the next voxel along that axis.

            Increment the error term by appropriate factor.

          else the second is more negative

             Move to the next voxel along that axis.

             Increment the error term by appropriate factor.

          endif

        else

          Move to the next voxel along the driving axis.

          Decrement the error term for each passive axis by 1.

        endif

      endwhile

    endalgorithm

The appropriate factor by which to increment the error term for one of the passive axis is the amount the driving axis changes, while the passive axis changes one voxel width. More specifically, this factor is the reciprocal of the gradient of the ray expressed in terms of the passive axis over the driving axis. Initial values for each error term reflect the distance, measured in voxel size units parallel to the driving axis, between the point that the ray crosses into the next voxel in the driving axis and the point that the ray crosses into the next voxel in the passive axis.

6.13.4. Results and discussion

The following figures show the results of timing experiments conducted to compare the speed of the raytracing system before and after the incorporation of 3DDDA. Also shown is the effect on speed of varying the number of voxels a scene is divided into. Each image was rendered a number of times, firstly without 3DDDA (labeled control), and then with 3DDDA, with the scene divided into various numbers of voxels (labeled 10x10x10 etc).

All the pictures were rendered at a resolution of 400x400 without any antialiasing. Times shown are "user time" on a VAX 11/785 minicomputer with floating point co-processor. "User time" is the amount of time spent by the computer on the program. This differs from "elapsed time", which is the total time between the start and end of a program, and "CPU time" which is the total amount of time actually spent by the CPU on a program.

The "number of rays cast" is the total number of rays cast during image generation. Note that for a given image, this figure is constant, regardless of the number of voxels that the scene is divided up into. The constancy of this figure reflects the fact that the 3DDDA enhancements to the raytracing system, improve the speed of image generation, without effecting the quality of the final image.

The number of "intersection tests per ray cast" is the average number of "ray object intersection tests" per ray cast. Note that for the control renderings, this figure is more than the total number of objects in the scene. For the 3DDDA renderings, this figure is always less than the number of objects in the scene, and as the resolution of the voxel grid is increased, this figure goes down. The figure is included because it is in reducing the number of "intersection tests per ray cast" that the 3DDDA algorithm achieves its improvements in performance.

The most spectacular improvements in performance were obtained for "ddahat". This image took 35hr 29mins to generate using the old algorithm and only 43 minutes using 3DDDA with an 80x80x80 voxel grid. This represents a 50 fold increase in performance. In achieving these improvements, the number of intersection tests per ray cast was decreased from more than 1000, for the old algorithm, to less than 6, for 3DDDA with 80x80x80 voxel grid. Also impressive is the estimated 16 fold improvement for "ddatree". This improvement was achieved in spite of the fact that the voxelisation tests for cylinders and cones are relatively crude.

The timing data for "dda8balls" and "dlamp" are interesting because computation time is actually increased by 3DDDA. This can be explained by the small number of objects in these two scenes. With so few objects, the overheads associated with 3DDDA, must cost more than the savings made in reducing the number of ray object intersection tests. One way of taking advantage of these results might be to set up the raytracing system so that 3DDDA is only performed if a scene has more than a certain number of objects.

Each image was rendered a number of times, dividing the scene up into different numbers of voxels, to determine the effect of the number of voxels on rendering time. With the exception of the two images with a very small number of objects, all images showed decreased rendering time as the number of voxels was increased. The relationship between the two was not linear (or even logarithmic), however, with the improvements between 10x10x10 voxels and 20x20x20 being more marked than the improvements between 40x40x40 and 80x80x80 voxels. The relationship between image generation time and memory required for 3DDDA is a classic case of a space/time tradeoff. Increasing the number of voxels, requires the use of more memory, however results in faster image generation. Ultimately the number of voxels into which a scene should be divided will depend on available memory.

6.13.5. Timings.

Timing data for "dda8balls". This scene consisted of 8 spheres and a single light source. Note that although the number of intersection tests per ray cast was decreased by 3DDDA, total computation time was increased. Because there are so few objects in the scene, the overheads of voxel traversal exceed the savings made in intersection tests.


         

 
        Resolution of   Number of   Intersect Tests   Total

        Voxel Grid      Rays Cast   Per Ray Cast      Time

        Control         349 742     9.10              11mins

        10x10x10        349 742     1.27              14mins

        20x20x20        -           -                 -

        30x30x30        349 742     1.25              14mins

        40x40x40        349 742     1.15              15mins

        80x80x80        349 742     1.08              20mins

           Fig. 6.5 : dda8balls (Tree

                              depth=1)

Timing data for "dlamp" which consists of 11 objects of various types, including two light sources. 3DDDA again caused an increase in computation time in spite of reducing the number of intersection tests per ray cast. This is due to the small number of objects in the scene.


         

 
        Resolution of   Number of   Intersect Tests   Total

        Voxel Grid      Rays Cast   Per Ray Cast      Time

        Control         -           -                 -

        10x10x10        229 096     4.79              17mins

        20x20x20        -           -                 -

        30x30x30        229 096     3.56              17mins

        40x40x40        229 096     4.54              19mins

        80x80x80        229 096     3.34              22mins

           Fig. 6.6 : dlamp (Tree depth=1)

Timing data for "dda125balls" which consists of 125 spheres and one light source. Total time was reduced by a factor of 16 for the 40x40x40 and 80x80x80 resolution 3DDDA.


         

 
      Resolution of   Number of   Intersect Tests   Total

      Voxel Grid      Rays Cast   Per Ray Cast      Time

      Control         240 928     147.93            6hr 42mins

      10x10x10        240 928     22.10             43mins

      20x20x20        240 928     10.92             28mins

      30x30x30        240 928     10.60             29mins

      40x40x40        240 928     6.92              25mins

      80x80x80        240 928     4.08              25mins

               Fig. 6.7 : dda125balls

                           (Tree depth=1)

Timing data for "ddatree" which consists of 384 objects, which are a mixture of cylinders and cones, and one light source. Rays are traced to a depth of two. The improvements in computation time due to 3DDDA are impressive in spite of the fact that the voxelisation tests for cylinders and cones are relatively crude in the present implementation. The algorithm used to generate the geometry of this picture was adapted from Aono and Kunii [AONI84].


         

 
      Resolution of   Number of   Intersect Tests   Total

      Voxel Grid      Rays Cast   Per Ray Cast      Time

      Control         188 180     384+ (est)        24hr+ (est)

      10x10x10        188 180     107.08            8hr 30mins

      20x20x20        -           -                 -

      30x30x30        188 180     107.08            2hr 19mins

      40x40x40        188 180     67.10             1hr 33mins

      80x80x80        -           -                 -

           Fig. 6.8 : ddatree (Tree depth=2)

Timing data for "ddahat" which consists of 962 spheres and two light sources. Total time is reduced by a factor of 50 for the 80x80x80 resolution 3D- DDA.


         

 
      Resolution of   Number of   Intersect Tests   Total

      Voxel Grid      Rays Cast   Per Ray Cast      Time

      Control         249 162     1156.36           35hr 29mins

      10x10x10        249 162     102.27            3hr 0mins

      20x20x20        249 162     28.14             1hr 19mins

      30x30x30        249 162     14.94             1hr 12mins

      40x40x40        249 162     11.27             44mins

      80x80x80        249 162     5.69              43mins

           Fig. 6.9 : ddahat (Tree depth=1)

Timing data for "dda1000balls" which consists of 1000 spheres and one light source. According to the estimates, total time is reduced by a factor of at least 45 for the 80x80x80 3DDDA.


         

 
      Resolution of   Number of   Intersect Tests   Total

      Voxel Grid      Rays Cast   Per Ray Cast      Time

      Control         295 006     1001+ (est)       40hr+(est)

      10x10x10        295 006     72.72             2hr 48mins

      20x20x20        295 006     20.49             1hr 13mins

      30x30x30        295 006     11.66             58mins

      40x40x40        295 006     9.11              54mins

      80x80x80        295 006     6.20              53mins

             Fig. 6.10 : dda1000balls

                           (Tree depth=1)

The figure below illustrates the relationship between number of objects and image rendering time for raytracing with and without 3DDDA. Because of the small sample size, it is impossible to determine the exact relationships, however it is clear from the table that the number of objects has much less an influence on rendering time for raytracing with 3DDDA than raytracing without. Compare for example the difference in rendering time between "dda125balls" and "ddahat" for raytracing with and without 3DDDA.


         

 
        Name of        Number of   Total Time      Total Time

        Picture        Objects     Without 3DDDA   With 3DDDA

        dda8balls      9           11mins          20mins

        dlamp          11          -               17mins

        dda125balls    126         6hr 42mins      25mins

        ddatree        384         24hr+ (est)     1hr 33mins

        ddahat         962         35hr 29mins     43mins

        dda1000balls   1002        40hr+ (est)     53mins

           Fig. 6.11 : Timing Summary.