Deferred Shading Heuristic
Tommy Klein (twklein)
I finished the deferred shader, which includes rasterizing light spheres
to cull unnecessary lights, and ran some scenes to measure performance.
For the Sponza scene with 100 lights congregated at a single point,
I noticed that the performance of the two shaders varied significantly
with where the camera was positioned. When close to and
facing the wall where the 100 lights were, the forward renderer
measured 16 ms per frame while the deferred was actually worse, at 20 ms.
However, when positioned away from the wall and looking at the ceiling
(where light shading would not occur due to a distance check
in the forward renderer and sphere rasterization in the
deferred shader), the deferred shader measured about 6 ms while
the forward renderer hung steady at 18 ms. So while the deferred
shader can offer significant speedup, there was a position where
it actually performed worse than the forward renderer. I also
ran the Suzanne model with 1000 lights whose radii were large enough
to completely cover the model. I noticed an interesting pattern
where when viewed from far away, the forward renderer is almost twice
as slow as the deferred shader; at a moderate distance, the forward
renderer was only ~50% slower; and at a very close range (where the
model fills the screen) the forward renderer is again twice as slow
as the deferred.
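The light-sphere culling above depends on each point light having a finite effective radius. A minimal sketch of how such a radius can be derived from a standard quadratic attenuation model (the coefficient names and cutoff value here are illustrative assumptions, not values from my renderer):

```cpp
#include <cassert>
#include <cmath>

// Effective radius of a point light under quadratic attenuation
// intensity / (kc + kl*d + kq*d^2): the distance d at which the
// attenuated contribution falls below `cutoff`, so fragments beyond
// that distance can be culled. Assumes kq > 0.
double lightRadius(double kc, double kl, double kq,
                   double intensity, double cutoff) {
    // Solve kq*d^2 + kl*d + (kc - intensity/cutoff) = 0 for d.
    double c = kc - intensity / cutoff;
    double disc = kl * kl - 4.0 * kq * c;
    if (disc < 0.0) return 0.0;  // light never exceeds the cutoff
    return (-kl + std::sqrt(disc)) / (2.0 * kq);
}
```

The same radius would be used both to scale the rasterized light sphere in the deferred path and as the distance check in the forward path, so the two culling schemes agree.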
By Tuesday, I will have the benchmarking system set up; on Wednesday
I will configure the policy settings and take benchmarks; and Thursday
will be final presentation prep.
I am mostly on track per my Weeks 1 and 2 schedule. I currently
have an OpenGL renderer, and conventionally implemented
vertex/fragment shaders which can shade according to diffuse/specular
highlights from up to 1024 lights at the moment (however, this limit
can probably be increased). While rendering, the user is also able
to move and rotate the camera freely, a feature which I was not
expecting to complete until much later. Additionally, I have
a .obj file loader, which will enable me to create scenes using
free online or educational models; I've tested it on a few models, such as
the "suzanne" monkey head model as well as a 130k vertex
Dodge Charger model. However, some of the core code I had hoped
to finish in these two weeks is not yet polished; for instance,
my renderer can only support one texture at the moment, and the Dodge
Charger model's hood doesn't render, which I'm still debugging.
While I don't have any results yet, my harness is far enough along
that it should be easy to collect them (I'm able to render an image
and time how long it takes, load arbitrary .obj files, and
configure my light settings programmatically).
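For reference, the frame-timing core of the harness boils down to something like the following sketch (the function name is illustrative; in the real harness a glFinish() would precede the second timestamp so queued GPU work is included in the measurement):

```cpp
#include <chrono>
#include <functional>

// Times one invocation of a render callback, in milliseconds.
// The glFinish() call is commented out here so the sketch is
// self-contained; without it, only CPU-side time is measured.
double timeFrameMs(const std::function<void()>& renderFrame) {
    auto start = std::chrono::steady_clock::now();
    renderFrame();
    // glFinish();  // drain the GPU pipeline before stopping the clock
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```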
Re project comments:
1. The benchmark suite of scenes will be a mixture of a few types:
test models (like the Suzanne monkey head and Cornell box), random free
models such as the Dodge Charger to randomly sample how my algorithm
performs on "real" models, and programmatically generated scenes
to measure how aspects of the scene affect conventional/deferred
shading performance. I don't plan to perform light performance
optimizations such as light culling, since the project is more about
shader complexity vs. when to switch from conventional to deferred shading
(and a shader supporting varying numbers of lights provides a flexible way
of tweaking the shader's complexity).
2. I will be implementing two versions of the shader by hand.
3. I'll be comparing policies by running a few frames of both forward
and deferred rendering before actually drawing anything, to determine
what their performances would be (the policy wouldn't receive this
information). That way, I would know the optimal choice (and framerate)
for the current view; when the policy makes its choice and takes its
own performance measurements, I can compare them against the true best.
I could then compare the true optimal framerates against what the user
would see based on the policy's choice.
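The scoring step of that evaluation might look like the following sketch (the struct and function names are hypothetical, not my actual API):

```cpp
#include <cassert>

// One frame of the evaluation: both renderers are timed up front
// (the oracle), then the policy's blind pick is scored against the
// true best for that view.
struct FrameResult {
    double forwardMs;          // oracle measurement, forward path
    double deferredMs;         // oracle measurement, deferred path
    bool policyChoseDeferred;  // what the policy picked without oracle data
};

// Regret: how much slower the user's frame was than the optimum.
// Zero means the policy made the optimal choice for this view.
double regretMs(const FrameResult& f) {
    double best = f.forwardMs < f.deferredMs ? f.forwardMs : f.deferredMs;
    double chosen = f.policyChoseDeferred ? f.deferredMs : f.forwardMs;
    return chosen - best;
}
```

Summing or averaging regret over a replayed camera path would then give a single number per policy per scene.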
My updated schedule:
Week 3: Debug rendering issues, create testing harness for running a
set of scenes with variable amounts of lighting.
Week 4: Write deferred shader. Start work on recording/replaying
user movements for perf profiling.
Checkpoint 2: Both the deferred and conventional shaders will be working.
Test harness will be able to load, run scenes, take perf measurements,
and output results to a file. However, a scene at this point would
be a static snapshot; perf measurements would just be comparing
how long it took to render a snapshot, and there would be no policies yet.
Week 5: Finish system for recording, replaying user movements for profiling.
Test harness will now be able to use replayed user movements to
benchmark policy performance.
Week 6: Policy design, implementation, performance comparisons.
Presentation: Performance graphs; what policies perform best on which
scenes, and why? What scene parameters (such as how many samples end up
occluded) affect performance of the two rendering systems, and
to what extent?
The project goal is to dynamically switch between a deferred or regular
shading method, using a yet-to-be-discovered heuristic to predict which
would run faster for the current scene.
Deferred shading is a popular method for rendering scenes. Instead of
shading each sample regardless of whether it will actually be visible,
deferred shading first determines what will be visible, and then shades.
This is useful when shading is expensive, and when many primitives would
have samples at the same location. However, it may not always be
faster than regular shading: deferred shading requires at least one
extra pass (one for depth/geometry, another for the actual shading),
as well as writing the information needed for shading out to buffers
instead of shading fragments as they are generated.
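The trade-off can be illustrated with a toy CPU model of one pixel's worth of overlapping fragments (this is an illustrative sketch of the shading-call counts, not my renderer's code):

```cpp
#include <limits>
#include <vector>

// N overlapping fragments land on one pixel. In the worst case
// (back-to-front arrival order) forward shading shades every one,
// because each fragment passes the depth test as it is drawn.
struct Fragment { double depth; int material; };

int forwardShadeCalls(const std::vector<Fragment>& frags) {
    return static_cast<int>(frags.size());  // worst-case count
}

// Deferred shading resolves visibility first: the geometry pass keeps
// only the nearest fragment in the G-buffer, then the lighting pass
// shades each covered pixel exactly once.
int deferredShadeCalls(const std::vector<Fragment>& frags) {
    double nearest = std::numeric_limits<double>::infinity();
    bool covered = false;
    for (const auto& f : frags) {
        if (f.depth < nearest) { nearest = f.depth; covered = true; }
    }
    return covered ? 1 : 0;
}
```

The gap between the two counts grows with overdraw, which is why the "depth density" of the scene matters so much below.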
The project's challenge will be creating a fast heuristic for deciding which
shading method would be faster. This heuristic will try to model
how aspects of the scene affect the speeds of the rendering. For instance,
the heuristic may take into account:
-the number of lights in the scene, which makes each shading call more expensive
-the "depth density" of the scene (on average, how many primitives cover a given sample point?)
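As a concrete illustration, a cost model built from those two inputs might look like this sketch (the functional form and all coefficients are placeholder assumptions to be fit from benchmark data, not anything implemented yet):

```cpp
#include <cassert>

// Hypothetical cost model: forward shading pays per-light work on
// every shaded sample, overdraw included; deferred shading pays a
// fixed G-buffer write per pixel plus per-light work once per pixel.
bool preferDeferred(int pixels, double depthDensity, int numLights) {
    const double kShade = 1.0;    // relative cost of one light evaluation
    const double kGBuffer = 2.0;  // relative cost of one G-buffer write
    double forwardCost = pixels * depthDensity * numLights * kShade;
    double deferredCost = pixels * (kGBuffer + numLights * kShade);
    return deferredCost < forwardCost;
}
```

Under this model, deferred wins whenever depthDensity * numLights exceeds kGBuffer + numLights, which matches the intuition that high overdraw and many lights favor deferred shading.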
Another challenge is determining whether switching between rendering
methods is worthwhile at all. For instance, the time taken to evaluate
the heuristic may outweigh the speedup the optimal shading strategy offers.
I plan to use OpenGL for rendering scenes, as well as for benchmarking
how aspects of scenes affect performance.
My planned deliverables:
-An OpenGL renderer which can load scenes, and determine what shading
model would be best for the current scene and view (or whether to perform
the calculation at all).
-Graphs detailing how various aspects of the scene (number of lights,
number of primitives, and other aspects such as the "depth density"
mentioned earlier) affect the performance of both conventional and
deferred shading
-Explanation of what heuristics were tried, what worked best, and why
-At least one shader, implemented both conventionally and as a deferred shader
My original schedule:
Week 1: Set up C++ codebase for an OpenGL renderer.
Week 2: Write benchmarking system, and begin benchmarking to determine
how properties of a scene affect the performance of a conventionally
implemented shader.
Checkpoint 1: At this point, I should have a working OpenGL renderer
and the ability to generate benchmarking scenes. However, the camera
would be static and I would not be able to load actual scenes/models
from files.
Week 3: Write a deferred shader
Week 4: Load actual scenes. Let user move camera around the scene.
Checkpoint 2: The project should be fully functional at this point,
with the ability to render scenes both conventionally and using
a deferred shader. However, the shaders themselves may be very simple.
Week 5: Work on the heuristic using benchmark results from previous weeks.
Week 6: Optimization, experimenting with different heuristics, more sophisticated shaders.
Presentation: Performance graphs, possibly a video or a live demo
if my laptop can handle it.