Assignment 3: Neural Volume Rendering and Surface Rendering

Rohan Nagabhirava

CMU 16-889 Learning for 3D Vision

Fall 2025

Part 0: Transmittance Calculation (10 points)

Task: Compute the transmittance of a ray going through a non-homogeneous medium.
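
For reference, the quantity being computed: transmittance decays exponentially with the optical depth along the ray, and for a piecewise-constant medium the integral splits into per-segment factors that multiply:

```latex
T(s_0, s_1) = \exp\!\left(-\int_{s_0}^{s_1} \sigma(t)\, dt\right)
\quad\Rightarrow\quad
T = \prod_i \exp\!\left(-\sigma_i\, \Delta s_i\right)
```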

Transmittance Calculation Solution

Part 1: Differentiable Volume Rendering (30 points)

1.3 Ray Sampling (5 points)

XY Grid Visualization

Rays Visualization
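
A minimal sketch of how the xy grid can be turned into world-space rays, assuming a PyTorch3D-style camera that provides unproject_points and get_camera_center; function names are illustrative, not the starter code:

```python
import torch

def get_pixel_grid(image_size):
    # (H*W, 2) grid of pixel coordinates in NDC space [-1, 1]
    W, H = image_size
    x = torch.linspace(-1, 1, W)
    y = torch.linspace(-1, 1, H)
    grid_y, grid_x = torch.meshgrid(y, x, indexing="ij")
    return torch.stack([grid_x, grid_y], dim=-1).reshape(-1, 2)

def get_rays(xy_grid, camera):
    # Unproject each NDC pixel at depth 1 into world space, then form
    # unit-length ray directions from the camera center through it.
    n = xy_grid.shape[0]
    xy_depth = torch.cat([xy_grid, torch.ones(n, 1)], dim=-1)
    points = camera.unproject_points(xy_depth, world_coordinates=True)
    origins = camera.get_camera_center().expand(n, -1)
    directions = torch.nn.functional.normalize(points - origins, dim=-1)
    return origins, directions
```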

1.4 Point Sampling (5 points)

Sample Points from First Camera
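
A minimal sketch of the sampler, using uniformly spaced depths along each ray (the near/far bounds and sample count shown are illustrative):

```python
import torch

def sample_points_along_rays(origins, directions, near=0.1, far=8.0, n_pts=64):
    # Uniformly spaced depths shared across rays: (n_rays, n_pts)
    z = torch.linspace(near, far, n_pts).expand(origins.shape[0], -1)
    # x = o + z * d for every depth: (n_rays, n_pts, 3)
    points = origins[:, None, :] + z[..., None] * directions[:, None, :]
    return points, z
```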

1.5 Volume Rendering (20 points)

Volume Rendering

Depth Map (Camera 2)
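
A sketch of the compositing step under the standard emission-absorption model; tensor shapes are noted in comments and names are illustrative:

```python
import torch

def composite_rays(sigmas, colors, deltas, z, eps=1e-10):
    # sigmas: (n_rays, n_pts, 1), colors: (n_rays, n_pts, 3),
    # deltas, z: (n_rays, n_pts, 1) segment lengths / sample depths
    alphas = 1.0 - torch.exp(-sigmas * deltas)        # per-sample opacity
    # Exclusive cumulative product: T_i = prod_{j<i} (1 - alpha_j), T_0 = 1
    trans = torch.cumprod(1.0 - alphas + eps, dim=1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)
    weights = trans * alphas
    rgb = (weights * colors).sum(dim=1)               # rendered color (n_rays, 3)
    depth = (weights * z).sum(dim=1)                  # expected depth (n_rays, 1)
    return rgb, depth
```

The depth maps above follow the same recipe as the color render, with per-sample depths composited in place of colors.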

Part 2: Optimizing a Basic Implicit Volume (10 points)

Optimized Box Rendering

Box Center: (0.25, 0.25, -0.00)

Box Side Lengths: (2.01, 1.50, 1.50)

Part 3: Optimizing a Neural Radiance Field (NeRF) (20 points)

Trained NeRF on Lego Dataset

Part 4: NeRF Extras (10 points)

4.1 View Dependence

View-Dependent NeRF on Materials Scene

Trade-offs Discussion:

View dependence allows the model to capture effects that vary with the camera's position, such as specular reflections and highlights on glossy materials. The trade-off is that conditioning on view direction increases the input dimensionality, which makes the mapping harder to learn and can cause overfitting: view-dependent effects observed in the training views may be generalized incorrectly to novel viewpoints.
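
A sketch of one common way to wire this in (layer widths and embedding sizes are illustrative, not the exact configuration used): the density branch sees only position features, while the color branch is additionally conditioned on the embedded view direction.

```python
import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    # Density depends only on position features; color is additionally
    # conditioned on the harmonically embedded view direction.
    def __init__(self, feat_dim=256, dir_embed_dim=24):
        super().__init__()
        self.density = nn.Sequential(nn.Linear(feat_dim, 1), nn.ReLU())
        self.color = nn.Sequential(
            nn.Linear(feat_dim + dir_embed_dim, 128), nn.ReLU(),
            nn.Linear(128, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, feats, dir_embed):
        sigma = self.density(feats)
        rgb = self.color(torch.cat([feats, dir_embed], dim=-1))
        return sigma, rgb
```

Feeding the view direction only into the color head keeps the density, and hence the geometry, view-independent, which limits the overfitting described above.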

Part 5: Sphere Tracing (10 points)

Torus Rendered with Sphere Tracing

Implementation Description:

Sphere tracing iteratively marches along each ray by the signed distance value returned by the SDF. Starting from the ray origin, we query the SDF at the current position and step forward by that distance. This process repeats until either we get close enough to the surface (distance < threshold) or we exceed the maximum number of iterations. A mask tracks which rays successfully intersected the surface.
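
A minimal sketch of this loop in PyTorch (the threshold, iteration count, and far bound are illustrative):

```python
import torch

def sphere_trace(sdf, origins, directions, max_iters=64, eps=1e-5, far=100.0):
    # origins, directions: (n_rays, 3); sdf maps (n, 3) -> (n, 1)
    t = torch.zeros(origins.shape[0], 1)
    hit = torch.zeros(origins.shape[0], 1, dtype=torch.bool)
    for _ in range(max_iters):
        points = origins + t * directions
        dist = sdf(points)                 # safe step size along each ray
        hit = hit | (dist < eps)           # close enough: mark as intersected
        active = (~hit) & (t < far)        # still marching and inside bounds
        t = t + dist * active              # step only the active rays
    return origins + t * directions, hit
```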

Part 6: Optimizing a Neural SDF (15 points)

Input Point Cloud (Bunny)

Learned Neural SDF

MLP Architecture:

The neural SDF uses an MLP with 6 hidden layers of 128 units each. The input 3D coordinates are first passed through a harmonic (positional) embedding so the network can capture high-frequency detail, and the network outputs a single signed distance value per point.
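
A sketch matching this description; the embedding dimension assumes 6 harmonic frequencies plus the raw coordinates, and exact sizes may differ from the code actually used:

```python
import torch.nn as nn

class NeuralSDF(nn.Module):
    # embed_dim = 39 assumes 6 harmonic frequencies plus raw xyz
    # (3 + 2 * 3 * 6); adjust to match the embedding actually used.
    def __init__(self, embed_dim=39, hidden=128, n_layers=6):
        super().__init__()
        layers = [nn.Linear(embed_dim, hidden), nn.ReLU()]
        for _ in range(n_layers - 1):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        layers.append(nn.Linear(hidden, 1))  # signed distance, no activation
        self.mlp = nn.Sequential(*layers)

    def forward(self, x_embedded):
        return self.mlp(x_embedded)
```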

Eikonal Loss:

The eikonal loss enforces that the gradient of the SDF has unit magnitude (||∇SDF|| = 1), which is a defining property of a true signed distance field. It is implemented by adding to the loss the mean squared deviation of the gradient norm from one, with the gradient computed at sampled points via automatic differentiation.
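
A sketch of that term, assuming the model maps raw 3D points to a signed distance (with any embedding applied internally):

```python
import torch

def eikonal_loss(sdf_model, points):
    # Penalize deviation of ||grad SDF|| from 1 at sampled points.
    points = points.detach().requires_grad_(True)
    dist = sdf_model(points)
    (grad,) = torch.autograd.grad(
        outputs=dist, inputs=points,
        grad_outputs=torch.ones_like(dist),
        create_graph=True,  # keep graph so the loss itself is differentiable
    )
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```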

Part 7: VolSDF (15 points)

Bulldozer Geometry (SDF Surface)

Bulldozer with Color

SDF to Density Conversion Parameters

Alpha and Beta Intuition:

Alpha (α): Controls the overall density magnitude; it is the density value the medium approaches well inside the surface. Higher α produces a more opaque interior and a sharper opacity transition around the surface.

Beta (β): Controls the width of the density distribution around the zero-level set of the SDF; it determines how quickly density falls off as you move away from the surface.
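
For concreteness, a sketch of the conversion these two parameters enter, following the VolSDF formulation (sign convention: SDF positive outside; defaults match the hyperparameters reported below):

```python
import torch

def sdf_to_density(d, alpha=10.0, beta=0.05):
    # VolSDF: sigma(x) = alpha * Psi_beta(-d(x)), with Psi_beta the CDF
    # of a zero-mean Laplace distribution with scale beta.
    s = -d
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi
```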

Questions

Q1: How does high/low beta bias your learned SDF?

High β creates a wider, smoother density distribution around the surface, making the SDF less sharp. Low β creates a narrower, sharper density distribution, biasing the SDF toward more precise surface localization.

Q2: Would an SDF be easier to train with low or high beta? Why?

Higher β is easier to train because the wider density distribution provides stronger gradients over a larger region of space, giving the optimization more signal. Low β can lead to vanishing gradients far from the surface.

Q3: Would you learn a more accurate surface with high or low beta? Why?

A more accurate surface is learned with a lower beta: a lower β concentrates the density in a narrower band around the zero-level set of the SDF, localizing the surface more precisely.

Hyperparameters Used:

  • Alpha: 10.0
  • Beta: 0.05
  • Learning rate: 0.0005
  • Epochs: 250

Part 8: Neural Surface Extras (10 points)

8.2 Fewer Training Views

Comparison of NeRF vs VolSDF performance with limited training views.

10 Training Views

NeRF (10 views)

VolSDF (10 views)

20 Training Views

NeRF (20 views)

VolSDF (20 views)

Analysis:

VolSDF produces a better overall global surface with fewer artifacts than NeRF. In the NeRF 20-view results, artifacts appear that are not part of the actual Lego structure. NeRF renders very sharply in regions covered by the training views but turns blurry where it has no training data. VolSDF gets the general features right globally, though its details are slightly more blurred. Overall, VolSDF maintains smoother global surfaces with fewer artifacts, while NeRF excels at capturing fine detail near its input views.