Project Proposal:
Parallel Ray Tracing with CUDA
Lingzhang Jiang

I am going to implement a parallel ray tracer using CUDA.


Ray tracing is a method used for rendering graphics.

The difference between ray-tracing and rasterization (the traditional rendering method) is that ray-tracing employs a per-pixel approach while rasterization employs a per-object approach. The advantage of ray-tracing is that some characteristics of a scene like the shadows and reflections can be computed more accurately. The disadvantage is that it is usually slower.

The basic idea (in pseudocode) is as follows:

for each pixel do
  compute viewing ray
  find first object hit by ray and its surface normal n
  set pixel color to value computed from hit point, light, and n
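
To make the pseudocode concrete, here is a minimal serial C++ sketch of the same loop. It is not the 15-462 ray tracer: the Vec3 and Sphere types, the single-sphere scene, the eye at the origin looking down -z, and the simple diffuse shading are all assumptions chosen to keep the example short.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical minimal types for illustration only.
struct Vec3 { double x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 normalize(Vec3 a) { return a * (1.0 / std::sqrt(dot(a, a))); }

struct Sphere { Vec3 center; double radius; };

// Finds the nearest hit in front of the ray origin. dir must be unit length.
// Rays starting inside the sphere are treated as misses to keep the sketch short.
bool intersect(const Sphere& s, Vec3 origin, Vec3 dir, double& t_hit) {
    Vec3 oc = origin - s.center;
    double b = dot(oc, dir);
    double c = dot(oc, oc) - s.radius * s.radius;
    double disc = b * b - c;
    if (disc < 0.0) return false;       // ray misses the sphere
    double t = -b - std::sqrt(disc);    // nearer of the two roots
    if (t <= 0.0) return false;         // hit is behind the origin
    t_hit = t;
    return true;
}

// Renders a grayscale image; to_light points from the surface toward the light.
std::vector<double> render(int width, int height, const Sphere& s, Vec3 to_light) {
    std::vector<double> image(width * height, 0.0);
    Vec3 eye{0, 0, 0};
    Vec3 l = normalize(to_light);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // compute viewing ray through the pixel center on the plane z = -1
            double u = (x + 0.5) / width * 2.0 - 1.0;
            double v = 1.0 - (y + 0.5) / height * 2.0;
            Vec3 dir = normalize({u, v, -1.0});
            // find first object hit by ray and its surface normal n
            double t;
            if (!intersect(s, eye, dir, t)) continue;
            Vec3 hit = eye + dir * t;
            Vec3 n = normalize(hit - s.center);
            // set pixel color from hit point, light, and n (diffuse term only)
            image[y * width + x] = std::max(0.0, dot(n, l));
        }
    }
    return image;
}
```

No iteration of the two loops reads a result written by another iteration, which is the property a CUDA port would rely on when assigning one pixel to each thread.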


The overall structure of a ray-tracer is inherently parallel as it performs computations per pixel. This means that in theory, it should not be difficult to parallelize a ray tracer program.

The first challenge is to port a serial ray tracer that runs on the CPU to CUDA and have it function correctly. The second challenge is to achieve a respectable speedup relative to the number of threads launched. Execution is highly divergent, since some pixels can require far more computation than others, so it will be difficult to obtain a fair workload distribution.
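
The workload problem can be illustrated with a toy C++ model. It is only a sketch under stated assumptions: the per-pixel costs are made-up numbers, the workers are abstract (not CUDA threads or warps), and the two static assignment schemes are illustrative rather than the approach this project will necessarily take.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Load of the busiest worker when pixels are split into contiguous blocks.
// Assumes workers >= 1.
long max_load_blocked(const std::vector<long>& cost, std::size_t workers) {
    std::vector<long> load(workers, 0);
    std::size_t chunk = (cost.size() + workers - 1) / workers;  // ceiling division
    for (std::size_t i = 0; i < cost.size(); ++i) load[i / chunk] += cost[i];
    return *std::max_element(load.begin(), load.end());
}

// Load of the busiest worker when pixel i is given to worker i % workers.
// Assumes workers >= 1.
long max_load_interleaved(const std::vector<long>& cost, std::size_t workers) {
    std::vector<long> load(workers, 0);
    for (std::size_t i = 0; i < cost.size(); ++i) load[i % workers] += cost[i];
    return *std::max_element(load.begin(), load.end());
}
```

With costs {1, 1, 1, 1, 9, 9, 9, 9} and two workers, the blocked split leaves one worker with 36 units of work out of 40, while the interleaved split gives each worker 20. The total run time is set by the busiest worker, so the way pixels are assigned matters as much as the number of workers.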

The constraints come from GPU programming itself. First, there is significant overhead because data has to be transferred to the GPU before any calculations can be performed. Second, unlike on multi-core CPUs, communication between threads is limited, which makes dynamic workload allocation difficult.
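
The effect of the transfer overhead on measured speedup can be sketched in C++ as follows. The helper names time_ms and end_to_end_speedup are hypothetical, and the model simply assumes that transfer time and kernel time add up with no overlap.

```cpp
#include <chrono>

// Wall-clock time of one call to f, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// Speedup over the CPU run when the GPU run pays for both the data
// transfers and the kernel itself (assumed not to overlap).
double end_to_end_speedup(double cpu_ms, double transfer_ms, double kernel_ms) {
    return cpu_ms / (transfer_ms + kernel_ms);
}
```

For example, a kernel that is 20 times faster than a 100 ms CPU run (5 ms) only yields a 10 times end-to-end speedup if the transfers take another 5 ms. When timing a CUDA kernel from the host with a helper like time_ms, the timed function would also need to wait for the GPU to finish, since kernel launches return before the work completes.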


I will be using the serial ray tracer (and possibly a CPU-parallel version using OpenMP) from 15-462. For hardware, I will be using the NVIDIA GTX 670 in the Gates cluster. I will also refer to NVIDIA's CUDA reference and the 15-462 notes on ray tracing.


Plan to Achieve

A functioning parallel ray tracer running on CUDA with a respectable speedup over a completely serial implementation, along with benchmarks on a number of scenes and an analysis of the possible reasons for the performance differences observed.

Hope to Achieve

The same as above, with better speedup, plus additional features in the basic ray tracer, such as anti-aliasing, that are also parallelized.
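
One common way to add anti-aliasing is supersampling, sketched below in C++. This is an assumption about how the feature might be implemented, not a committed design, and the shade callback is a hypothetical stand-in for tracing a ray through a continuous image position given in pixel units.

```cpp
#include <functional>

// Averages an n-by-n grid of samples taken inside pixel (x, y).
// shade(px, py) is a hypothetical callback that returns the brightness seen
// through the image position (px, py), measured in pixel units.
double supersample(int x, int y, int n,
                   const std::function<double(double, double)>& shade) {
    double sum = 0.0;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            // sample at the center of sub-cell (i, j) of the pixel
            sum += shade(x + (i + 0.5) / n, y + (j + 0.5) / n);
        }
    }
    return sum / (n * n);
}
```

Each pixel still depends only on its own samples, so this keeps the per-pixel independence of the basic ray tracer while multiplying the work per pixel by n * n.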


It makes sense to run a ray tracer on the GPU because the GPU has high bandwidth and computing power and can run a large number of threads in parallel. Also, compared with using Open MPI or Blacklight, it is more resource-efficient to speed up the ray tracer with a GPU that can be found on a single machine.

Proposed Schedule

Week            What We Plan To Do
Apr 1-7         Think of possible projects.
Apr 8-14        Complete the basic ray tracer.
Apr 15-21       Port the basic ray tracer to CUDA.
Apr 22-28       Run some benchmarks to compare the two versions and optimize the CUDA version.
Apr 29-May 5    Complete all optimizations and analysis.
May 6-11        Add in cool features to the ray tracer.