Assignment 3: Neural Volume Rendering and Surface Rendering

Rohan Nagabhirava

CMU 16-889 Learning for 3D Vision

Fall 2025

Part 0: Transmittance Calculation (10 points)

Task: Compute the transmittance of a ray going through a non-homogeneous medium.
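
For reference, the quantity being computed: transmittance decays exponentially with the optical depth along the ray, and for a piecewise-constant medium the integral splits into per-segment factors that multiply:

```latex
T(s_0, s_1) = \exp\!\left(-\int_{s_0}^{s_1} \sigma(t)\, dt\right)
\quad\Rightarrow\quad
T = \prod_i \exp\!\left(-\sigma_i\, \Delta s_i\right)
```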

Transmittance Calculation Solution

Part 1: Differentiable Volume Rendering (30 points)

1.3 Ray Sampling (5 points)

XY Grid Visualization

Rays Visualization
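
A minimal sketch of how the xy grid can be turned into world-space rays, assuming a PyTorch3D-style camera that provides unproject_points and get_camera_center; function names are illustrative, not the starter code:

```python
import torch

def get_pixel_grid(image_size):
    # (H*W, 2) grid of pixel coordinates in NDC space [-1, 1]
    W, H = image_size
    x = torch.linspace(-1, 1, W)
    y = torch.linspace(-1, 1, H)
    grid_y, grid_x = torch.meshgrid(y, x, indexing="ij")
    return torch.stack([grid_x, grid_y], dim=-1).reshape(-1, 2)

def get_rays(xy_grid, camera):
    # Unproject each NDC pixel at depth 1 into world space, then form
    # unit-length ray directions from the camera center through it.
    n = xy_grid.shape[0]
    xy_depth = torch.cat([xy_grid, torch.ones(n, 1)], dim=-1)
    points = camera.unproject_points(xy_depth, world_coordinates=True)
    origins = camera.get_camera_center().expand(n, -1)
    directions = torch.nn.functional.normalize(points - origins, dim=-1)
    return origins, directions
```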

1.4 Point Sampling (5 points)

Sample Points from First Camera
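
A minimal sketch of the sampler, using uniformly spaced depths along each ray (the near/far bounds and sample count shown are illustrative):

```python
import torch

def sample_points_along_rays(origins, directions, near=0.1, far=8.0, n_pts=64):
    # Uniformly spaced depths shared across rays: (n_rays, n_pts)
    z = torch.linspace(near, far, n_pts).expand(origins.shape[0], -1)
    # x = o + z * d for every depth: (n_rays, n_pts, 3)
    points = origins[:, None, :] + z[..., None] * directions[:, None, :]
    return points, z
```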

1.5 Volume Rendering (20 points)

Volume Rendering

Depth Map (Camera 2)
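
A sketch of the compositing step under the standard emission-absorption model; tensor shapes are noted in comments and names are illustrative:

```python
import torch

def composite_rays(sigmas, colors, deltas, z, eps=1e-10):
    # sigmas: (n_rays, n_pts, 1), colors: (n_rays, n_pts, 3),
    # deltas, z: (n_rays, n_pts, 1) segment lengths / sample depths
    alphas = 1.0 - torch.exp(-sigmas * deltas)        # per-sample opacity
    # Exclusive cumulative product: T_i = prod_{j<i} (1 - alpha_j), T_0 = 1
    trans = torch.cumprod(1.0 - alphas + eps, dim=1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)
    weights = trans * alphas
    rgb = (weights * colors).sum(dim=1)               # rendered color (n_rays, 3)
    depth = (weights * z).sum(dim=1)                  # expected depth (n_rays, 1)
    return rgb, depth
```

The depth maps above follow the same recipe as the color render, with per-sample depths composited in place of colors.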

Part 2: Optimizing a Basic Implicit Volume (10 points)

Optimized Box Rendering

Box Center: (0.25, 0.25, -0.00)

Box Side Lengths: (2.01, 1.50, 1.50)

Part 3: Optimizing a Neural Radiance Field (NeRF) (20 points)

Trained NeRF on Lego Dataset

Part 4: NeRF Extras (10 points)

4.1 View Dependence

View-Dependent NeRF on Materials Scene

Trade-offs Discussion:

View dependence allows the model to capture effects that vary with the camera's position, such as specular reflections and highlights on glossy materials. The trade-off is that conditioning on view direction increases the input dimensionality, which makes the mapping harder to learn and can cause overfitting: view-dependent effects observed in the training views may be generalized incorrectly to novel viewpoints.
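
A sketch of one common way to wire this in (layer widths and embedding sizes are illustrative, not the exact configuration used): the density branch sees only position features, while the color branch is additionally conditioned on the embedded view direction.

```python
import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    # Density depends only on position features; color is additionally
    # conditioned on the harmonically embedded view direction.
    def __init__(self, feat_dim=256, dir_embed_dim=24):
        super().__init__()
        self.density = nn.Sequential(nn.Linear(feat_dim, 1), nn.ReLU())
        self.color = nn.Sequential(
            nn.Linear(feat_dim + dir_embed_dim, 128), nn.ReLU(),
            nn.Linear(128, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, feats, dir_embed):
        sigma = self.density(feats)
        rgb = self.color(torch.cat([feats, dir_embed], dim=-1))
        return sigma, rgb
```

Feeding the view direction only into the color head keeps the density, and hence the geometry, view-independent, which limits the overfitting described above.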

Part 5: Sphere Tracing (10 points)

Torus Rendered with Sphere Tracing

Implementation Description:

Sphere tracing iteratively marches along each ray by the signed distance value returned by the SDF. Starting from the ray origin, we query the SDF at the current position and step forward by that distance. This process repeats until either we get close enough to the surface (distance < threshold) or we exceed the maximum number of iterations. A mask tracks which rays successfully intersected the surface.
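
A minimal sketch of this loop in PyTorch (the threshold, iteration count, and far bound are illustrative):

```python
import torch

def sphere_trace(sdf, origins, directions, max_iters=64, eps=1e-5, far=100.0):
    # origins, directions: (n_rays, 3); sdf maps (n, 3) -> (n, 1)
    t = torch.zeros(origins.shape[0], 1)
    hit = torch.zeros(origins.shape[0], 1, dtype=torch.bool)
    for _ in range(max_iters):
        points = origins + t * directions
        dist = sdf(points)                 # safe step size along each ray
        hit = hit | (dist < eps)           # close enough: mark as intersected
        active = (~hit) & (t < far)        # still marching and inside bounds
        t = t + dist * active              # step only the active rays
    return origins + t * directions, hit
```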

Part 6: Optimizing a Neural SDF (15 points)

Input Point Cloud (Bunny)

Learned Neural SDF

MLP Architecture:

The neural SDF uses an MLP with 6 hidden layers of 128 units each. The input 3D coordinates are first passed through a harmonic (positional) embedding so the network can capture high-frequency detail, and the network outputs a single signed distance value per point.
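
A sketch matching this description; the embedding dimension assumes 6 harmonic frequencies plus the raw coordinates, and exact sizes may differ from the code actually used:

```python
import torch.nn as nn

class NeuralSDF(nn.Module):
    # embed_dim = 39 assumes 6 harmonic frequencies plus raw xyz
    # (3 + 2 * 3 * 6); adjust to match the embedding actually used.
    def __init__(self, embed_dim=39, hidden=128, n_layers=6):
        super().__init__()
        layers = [nn.Linear(embed_dim, hidden), nn.ReLU()]
        for _ in range(n_layers - 1):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        layers.append(nn.Linear(hidden, 1))  # signed distance, no activation
        self.mlp = nn.Sequential(*layers)

    def forward(self, x_embedded):
        return self.mlp(x_embedded)
```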

Eikonal Loss:

The eikonal loss enforces that the gradient of the SDF has unit magnitude (||∇SDF|| = 1), which is a defining property of a true signed distance field. It is implemented by adding to the loss the mean squared deviation of the gradient norm from one, with the gradient computed at sampled points via automatic differentiation.
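
A sketch of that term, assuming the model maps raw 3D points to a signed distance (with any embedding applied internally):

```python
import torch

def eikonal_loss(sdf_model, points):
    # Penalize deviation of ||grad SDF|| from 1 at sampled points.
    points = points.detach().requires_grad_(True)
    dist = sdf_model(points)
    (grad,) = torch.autograd.grad(
        outputs=dist, inputs=points,
        grad_outputs=torch.ones_like(dist),
        create_graph=True,  # keep graph so the loss itself is differentiable
    )
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```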

Part 7: VolSDF (15 points)

Bulldozer Geometry (SDF Surface)

Bulldozer with Color

SDF to Density Conversion Parameters

Alpha and Beta Intuition:

Alpha (α): Controls the overall density magnitude; it is the density value the medium approaches well inside the surface. Higher α produces a more opaque interior and a sharper opacity transition around the surface.

Beta (β): Controls the width of the density distribution around the zero-level set of the SDF; it determines how quickly density falls off as you move away from the surface.
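
For concreteness, a sketch of the conversion these two parameters enter, following the VolSDF formulation (sign convention: SDF positive outside; defaults match the hyperparameters reported below):

```python
import torch

def sdf_to_density(d, alpha=10.0, beta=0.05):
    # VolSDF: sigma(x) = alpha * Psi_beta(-d(x)), with Psi_beta the CDF
    # of a zero-mean Laplace distribution with scale beta.
    s = -d
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi
```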

Questions

Q1: How does high/low beta bias your learned SDF?

High β creates a wider, smoother density distribution around the surface, making the SDF less sharp. Low β creates a narrower, sharper density distribution, biasing the SDF toward more precise surface localization.

Q2: Would an SDF be easier to train with low or high beta? Why?

Higher β is easier to train because the wider density distribution provides stronger gradients over a larger region of space, giving the optimization more signal. Low β can lead to vanishing gradients far from the surface.

Q3: Would you learn a more accurate surface with high or low beta? Why?

A more accurate surface is learned with a lower beta: a lower β concentrates the density in a narrower band around the zero-level set of the SDF, localizing the surface more precisely.

Hyperparameters Used:

  • Alpha: 10.0
  • Beta: 0.05
  • Learning rate: 0.0005
  • Epochs: 250

Part 8: Neural Surface Extras (10 points)

8.2 Fewer Training Views

Comparison of NeRF vs VolSDF performance with limited training views.

10 Training Views

NeRF (10 views)

VolSDF (10 views)

20 Training Views

NeRF (20 views)

VolSDF (20 views)

Analysis:

VolSDF produces a better overall global surface with fewer artifacts than NeRF. In the NeRF 20-view results, artifacts appear that are not part of the actual Lego structure. NeRF renders very sharply in regions covered by the training views but turns blurry where it has no training data. VolSDF gets the general features right globally, though its details are slightly more blurred. Overall, VolSDF maintains smoother global surfaces with fewer artifacts, while NeRF excels at capturing fine detail near its input views.