Deferred Shading Heuristic

Tommy Klein (twklein)

Final Writeup

Writeup PDF

Checkpoint Two

I finished the deferred shader, which rasterizes light spheres to cull unnecessary lights, and ran some scenes to measure performance. For the Sponza scene with 100 lights congregated on the same point, the performance of the two renderers varied significantly with the camera position. When close to and facing the wall where the 100 lights were, the forward renderer measured 16 ms per frame while the deferred shader was actually worse, at 20 ms. However, when positioned away from the wall and looking at the ceiling (where light shading would not occur, due to a distance check in the forward renderer and sphere rasterization in the deferred shader), the deferred shader measured about 6 ms while the forward renderer held steady at 18 ms. So while the deferred shader can offer a significant speedup, there was a position where it actually performed worse than the forward renderer.
I also ran the Suzanne model with 1000 lights, each with a radius large enough to completely cover the model. I noticed an interesting pattern: viewed from far away, the forward renderer is almost twice as slow as the deferred shader; at a moderate distance, it is only ~50% slower; and at very close range (where the model fills the screen), it is again twice as slow.
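The distance check mentioned for the forward renderer can be sketched roughly as follows. This is a CPU-side illustration only, not the project's shader code; the `PointLight` struct, its field names, and both function names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical light record; the field names are illustrative only.
struct PointLight {
    float x, y, z;   // world-space position
    float radius;    // distance beyond which the light contributes nothing
};

// Returns true when the surface point (px, py, pz) lies inside the light's
// sphere of influence. Squared distances are compared to avoid a square root.
inline bool lightReaches(const PointLight& light, float px, float py, float pz) {
    assert(light.radius >= 0.0f);
    const float dx = px - light.x;
    const float dy = py - light.y;
    const float dz = pz - light.z;
    return dx * dx + dy * dy + dz * dz <= light.radius * light.radius;
}

// Counts the lights that would actually need to be evaluated for one point.
inline std::size_t countActiveLights(const std::vector<PointLight>& lights,
                                     float px, float py, float pz) {
    std::size_t active = 0;
    for (const PointLight& light : lights) {
        if (lightReaches(light, px, py, pz)) {
            ++active;
        }
    }
    return active;
}
```

In a view such as the Sponza ceiling case, every light would fail this test, so no per-light shading work would remain; the loop over all lights still runs once per fragment, though.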
By Tuesday, I will have the benchmarking system set up; Wednesday will be configuring the policy settings and taking benchmarks, and Thursday will be final presentation prep.

Checkpoint One

I am mostly on track per my Weeks 1 and 2 schedule. I currently have an OpenGL renderer with conventionally implemented vertex/fragment shaders that compute diffuse and specular lighting from up to 1024 lights (this limit can probably be increased). While rendering, the user can also move and rotate the camera freely, a feature I was not expecting to complete until much later. I also have a .obj file loader, which will let me create scenes using free online or educational models; I've tested it on a few models, such as the "suzanne" monkey head model and a 130k-vertex Dodge Charger model. However, some of the core code I had hoped to finish in these two weeks is not polished enough; for instance, my renderer only supports one texture at the moment, and the Dodge Charger model's hood doesn't render, which I'm still debugging. While I don't have any results yet, my harness is good enough that they should be easy to collect (I'm able to render an image and time how long it takes, load arbitrary .obj files, and configure my light settings programmatically).
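As an illustration of the timing part of such a harness, here is a minimal sketch using std::chrono. The function name `averageFrameMs` and the callback-based design are assumptions, not the project's actual code. For GPU work, the callback would likely need to end with a call such as glFinish() so that the measured time includes the GPU finishing the frame.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs renderFrame `frames` times and returns the mean wall-clock time per
// frame in milliseconds. renderFrame is a hypothetical callback that draws
// one frame and blocks until the work is complete.
inline double averageFrameMs(const std::function<void()>& renderFrame, int frames) {
    assert(frames > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (int i = 0; i < frames; ++i) {
        renderFrame();
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count() / frames;
}
```

Averaging over several frames smooths out per-frame noise, at the cost of hiding individual spikes.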
Re project comments:
1. The benchmark suite will be a mixture of a few types of scenes: test models (like the suzanne monkey head and the Cornell box), free models such as the Dodge Charger to sample how my algorithm performs on "real" models, and programmatically generated scenes to measure how aspects of the scene affect conventional/deferred shading performance. I don't plan to perform lighting optimizations such as light culling, since the project is more about shader complexity vs. when to switch from conventional to deferred shading (and a shader that handles a varying number of lights provides a flexible way of tweaking the shader's complexity).
2. I will be implementing two versions of the shader by hand.
3. I'll be comparing policies by running a few frames of both deferred and conventional rendering before actually drawing anything, to determine what their performance would be (the policy wouldn't receive this information). I would then know the optimal choice (and framerate) for the current view, and once the policy makes its choice I can compare its measured performance against the true best. This lets me compare the true optimal framerates against what the user would see based on the policy's choice.
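The comparison described in item 3 could be scored along these lines. This is a sketch under assumptions: the `FrameSample` and `PolicyScore` types and the `scorePolicy` function are hypothetical names, and the per-view timings are taken to have been measured separately for both renderers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum class Renderer { Conventional, Deferred };

// One benchmarked view: the measured time of each renderer (hidden from the
// policy) plus the renderer the policy picked for that view.
struct FrameSample {
    double conventionalMs;
    double deferredMs;
    Renderer policyChoice;
};

struct PolicyScore {
    double optimalMs;    // total time if the faster renderer were always used
    double policyMs;     // total time under the policy's choices
    int correctChoices;  // views where the policy matched the faster renderer
};

inline PolicyScore scorePolicy(const std::vector<FrameSample>& samples) {
    PolicyScore score{0.0, 0.0, 0};
    for (const FrameSample& f : samples) {
        assert(f.conventionalMs >= 0.0 && f.deferredMs >= 0.0);
        const double best = std::min(f.conventionalMs, f.deferredMs);
        const double chosen = (f.policyChoice == Renderer::Conventional)
                                  ? f.conventionalMs
                                  : f.deferredMs;
        score.optimalMs += best;
        score.policyMs += chosen;
        if (chosen == best) {
            ++score.correctChoices;
        }
    }
    return score;
}
```

The gap between `policyMs` and `optimalMs` is the time lost to imperfect choices, which can be reported alongside the fraction of correct choices.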
My updated schedule:
Week 3: Debug rendering issues, create testing harness for running a set of scenes with variable amounts of lighting.
Week 4: Write deferred shader. Start work on recording and replaying user movements for perf profiling.
Checkpoint 2: Both the deferred and conventional shaders will be working. Test harness will be able to load and run scenes, take perf measurements, and output results to a file. However, a scene at this point would be a static snapshot; perf measurements would just compare how long it took to render a snapshot, and there would be no policies yet.
Week 5: Finish system for recording and replaying user movements for profiling. Test harness will now be able to use replayed user movements to benchmark policy performance.
Week 6: Policy design, implementation, performance comparisons.
Presentation: Performance graphs; what policies perform best on which scenes, and why? What scene parameters (such as how many samples end up occluded) affect performance of the two rendering systems, and to what extent?

Summary

The project goal is to dynamically switch between deferred and regular shading, using a yet-to-be-discovered heuristic to predict which would run faster for the current scene.

Background

Deferred shading is a popular method for rendering scenes. Instead of shading each sample regardless of whether it will actually be visible, deferred shading first determines what will be visible, and then shades. This is useful when shading is expensive, and when many primitives would have samples at the same location. However, it may not always be faster than regular shading; deferred shading requires at least one more pass (one for depth, the other for the actual shading) as well as writing the information needed for shading out to buffers instead of shading as fragments are generated.

Challenge

The project's challenge will be creating a fast heuristic for deciding which shading method would be faster. This heuristic will try to model how aspects of the scene affect the speeds of the rendering. For instance, the heuristic may take into account:
-the number of lights in the scene, which would make shading calls more expensive
-the "depth density" of the scene (on average, for a given sample point, how many primitives would cover it?)
Another challenge would be whether switching between rendering methods is better at all. For instance, the time taken to calculate the heuristic may outweigh the speedup benefits the optimal shading strategy offers.
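To make the idea concrete, one possible shape for such a heuristic is a simple cost model over the factors listed above. This is only a sketch, not the heuristic this project settles on: the `SceneStats` and `CostModel` types, the function names, and all cost constants are hypothetical and would need to be fit from benchmarks. It also ignores effects such as early depth testing, which can reduce the conventional renderer's shading work.

```cpp
#include <cassert>

// Hypothetical per-view statistics a heuristic might consume.
struct SceneStats {
    double coveredPixels;  // screen samples covered by at least one primitive
    double depthDensity;   // average primitives overlapping a covered sample
    int lightCount;        // lights evaluated per shaded sample
};

// Placeholder cost constants in arbitrary units; not measured values.
struct CostModel {
    double perLightShade;      // cost of shading one sample for one light
    double gbufferWrite;       // cost of writing one fragment's attributes
    double extraPassOverhead;  // fixed cost of the additional pass
};

// Conventional: every generated fragment is shaded, even if later overwritten.
inline double estimateConventional(const SceneStats& s, const CostModel& m) {
    return s.coveredPixels * s.depthDensity * s.lightCount * m.perLightShade;
}

// Deferred: every fragment is written out, but only visible samples are shaded.
inline double estimateDeferred(const SceneStats& s, const CostModel& m) {
    const double geometry = s.coveredPixels * s.depthDensity * m.gbufferWrite;
    const double shading = s.coveredPixels * s.lightCount * m.perLightShade;
    return m.extraPassOverhead + geometry + shading;
}

inline bool preferDeferred(const SceneStats& s, const CostModel& m) {
    assert(s.depthDensity >= 1.0 && s.lightCount >= 0);
    return estimateDeferred(s, m) < estimateConventional(s, m);
}
```

Under this model, deferred shading wins when both the light count and the depth density are high, and loses when the fixed overhead and buffer writes dominate, for example with a single light and no overdraw.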

Resources

I plan to use OpenGL for rendering scenes, as well as for benchmarking how aspects of scenes affect performance.

Goals/Deliverables

-An OpenGL renderer which can load scenes, and determine what shading model would be best for the current scene and view (or whether to perform the calculation at all).
-Graphs detailing how various aspects of the scene (number of lights, number of primitives, and other scene aspects such as the "depth density" mentioned earlier) affect the performance of both conventional and deferred shading
-Explanation of what heuristics were tried, what worked best, and why
-At least one shader, implemented both conventionally and as a deferred shader

Schedule

Week 1: Set up C++ codebase for an OpenGL renderer.
Week 2: Write benchmarking system, and begin benchmarking to determine how properties of a scene affect the performance of a conventionally implemented shader.
Checkpoint 1: At this point, I should have a working OpenGL renderer and the ability to generate benchmarking scenes. However, the camera would be static and I would not be able to load actual scenes/models from a file.
Week 3: Write a deferred shader
Week 4: Load actual scenes. Let user move camera around the scene.
Checkpoint 2: The project should be fully functional at this point, with the ability to render scenes both conventionally and using a deferred shader. However, the shaders themselves may be very simple.
Week 5: Work on heuristic using benchmark results from previous weeks.
Week 6: Optimization, experimenting with different heuristics, more sophisticated shaders.
Presentation: Performance graphs, possibly a video or a live demo if my laptop can handle it.