Rohan Nagabhirava
CMU 16-889 Learning for 3D Vision
Fall 2025
Task: Compute the transmittance of a ray going through a non-homogeneous medium.
Transmittance Calculation Solution
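For a non-homogeneous medium the density σ varies along the ray, so the transmittance is T(t) = exp(-∫ σ(s) ds), approximated as piecewise constant over the sample segments: T_i = exp(-Σ_{j<i} σ_j δ_j). A minimal PyTorch sketch of this approximation (the function name and tensor shapes are illustrative, not the assignment's exact API):

```python
import torch

def transmittance(sigmas, deltas):
    # sigmas: (N,) densities sampled along one ray
    # deltas: (N,) lengths of the segments between samples
    # Accumulated optical depth up to each sample.
    optical_depth = torch.cumsum(sigmas * deltas, dim=0)
    # Shift by one so T_0 = 1 (nothing is absorbed before the first sample).
    shifted = torch.cat([torch.zeros_like(optical_depth[:1]), optical_depth[:-1]])
    return torch.exp(-shifted)  # T_i = exp(-sum_{j<i} sigma_j * delta_j)
```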
XY Grid Visualization
Rays Visualization
Sample Points from First Camera
Volume Rendering
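The renderer weights each sample by w_i = T_i (1 - exp(-σ_i δ_i)) and sums the weighted colors; the depth map below is the same composite applied to sample depths instead of colors. A per-ray sketch reusing the transmittance helper above (again with illustrative names):

```python
def volume_render(sigmas, colors, deltas, ts):
    # sigmas: (N,), colors: (N, 3), deltas: (N,), ts: (N,) sample depths
    T = transmittance(sigmas, deltas)             # prob. of reaching each sample
    alphas = 1.0 - torch.exp(-sigmas * deltas)    # per-segment absorption
    weights = T * alphas                          # per-sample contribution
    rgb = (weights[:, None] * colors).sum(dim=0)  # composited ray color
    depth = (weights * ts).sum(dim=0)             # expected depth along the ray
    return rgb, depth
```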
Depth Map (Camera 2)
Optimized Box Rendering
Box Center: (0.25, 0.25, -0.00)
Box Side Lengths: (2.01, 1.50, 1.50)
Trained NeRF on Lego Dataset
View-Dependent NeRF on Materials Scene
Trade-offs Discussion:
View dependence allows the model to capture effects that change with camera position, such as specular reflections and highlights on the glossy surfaces in the Materials scene. However, conditioning on view direction increases the input dimensionality, which makes the color function harder to learn and can cause overfitting if view-dependent effects memorized from training views are incorrectly generalized to novel viewpoints.
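As a concrete illustration (a sketch of the standard design, not necessarily the exact network used here), view dependence is usually added by predicting density from position features alone and conditioning only the color branch on the embedded view direction:

```python
import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    # Hypothetical head: density depends only on position features, so
    # geometry stays view-consistent; color also sees the view direction.
    def __init__(self, feat_dim=128, dir_embed_dim=27):
        super().__init__()
        self.density = nn.Sequential(nn.Linear(feat_dim, 1), nn.Softplus())
        self.color = nn.Sequential(
            nn.Linear(feat_dim + dir_embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 3),
            nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, feat, dir_embed):
        sigma = self.density(feat)
        rgb = self.color(torch.cat([feat, dir_embed], dim=-1))
        return sigma, rgb
```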
Torus Rendered with Sphere Tracing
Implementation Description:
Sphere tracing iteratively marches each ray forward by the signed distance value returned by the SDF. Starting from the ray origin, we query the SDF at the current position and step along the ray by that distance; since the SDF gives the distance to the nearest surface, this step can never overshoot it. The process repeats until the ray gets close enough to the surface (distance < threshold) or exceeds the maximum number of iterations. A boolean mask tracks which rays successfully intersected the surface.
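A minimal sketch of this loop, assuming an sdf callable mapping (R, 3) points to (R, 1) distances (parameter names and defaults are illustrative):

```python
import torch

def sphere_trace(sdf, origins, directions, max_iters=64, eps=1e-5):
    # origins, directions: (R, 3); directions assumed to be unit length.
    t = torch.zeros(origins.shape[0])
    hit = torch.zeros(origins.shape[0], dtype=torch.bool)
    for _ in range(max_iters):
        points = origins + t[:, None] * directions
        dist = sdf(points).squeeze(-1)  # signed distance at current positions
        hit = hit | (dist < eps)        # close enough to the surface: converged
        # Only rays that have not converged keep marching by their SDF value.
        t = t + torch.where(hit, torch.zeros_like(dist), dist)
    points = origins + t[:, None] * directions
    return points, hit
```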
Input Point Cloud (Bunny)
Learned Neural SDF
MLP Architecture:
The neural SDF is an MLP with 6 layers and 128 hidden units per layer. The input 3D coordinates are first passed through a harmonic embedding so the network can capture high-frequency detail. The network outputs a single signed distance value per point.
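A sketch of that architecture, assuming a standard sin/cos harmonic embedding (the layer count follows the description above; other details are illustrative):

```python
import torch
import torch.nn as nn

class NeuralSDF(nn.Module):
    def __init__(self, n_harmonics=4, hidden=128, n_layers=6):
        super().__init__()
        in_dim = 3 + 3 * 2 * n_harmonics  # xyz plus sin/cos per frequency
        layers = [nn.Linear(in_dim, hidden), nn.ReLU()]
        for _ in range(n_layers - 2):
            layers += [nn.Linear(hidden, hidden), nn.ReLU()]
        layers += [nn.Linear(hidden, 1)]  # single signed distance per point
        self.mlp = nn.Sequential(*layers)
        self.register_buffer("freqs", 2.0 ** torch.arange(n_harmonics))

    def forward(self, x):
        # Harmonic embedding: [x, sin(2^k x), cos(2^k x)] per coordinate.
        xb = x[..., None] * self.freqs  # (..., 3, K)
        emb = torch.cat(
            [x, torch.sin(xb).flatten(-2), torch.cos(xb).flatten(-2)], dim=-1
        )
        return self.mlp(emb)
```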
Eikonal Loss:
The eikonal loss enforces the defining property of a true signed distance field: its gradient has unit norm everywhere (||∇SDF|| = 1). It is implemented by adding to the loss the mean squared error between the norm of the SDF gradient at sampled points and 1.
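A minimal sketch of this penalty, using autograd to obtain the SDF gradient at sampled points (names are illustrative):

```python
import torch

def eikonal_loss(sdf, points):
    # Penalize deviation of the SDF gradient norm from 1 at sampled points.
    points = points.requires_grad_(True)
    dists = sdf(points)
    grads, = torch.autograd.grad(
        dists, points, grad_outputs=torch.ones_like(dists), create_graph=True
    )
    return ((grads.norm(dim=-1) - 1.0) ** 2).mean()
```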
Bulldozer Geometry (SDF Surface)
Bulldozer with Color
Alpha and Beta Intuition:
Alpha (α): Controls the overall density magnitude. Higher α creates sharper density transitions around the surface.
Beta (β): Controls the spread of the density around the zero-level set of the SDF. It determines how quickly density falls off as you move away from the surface (see the density sketch below).
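Concretely, VolSDF converts the signed distance d(x) to density with a scaled Laplace CDF, σ(x) = α Ψ_β(-d(x)). A minimal sketch of that conversion (names illustrative):

```python
import torch

def sdf_to_density(sdf_vals, alpha, beta):
    # sigma = alpha * LaplaceCDF(-d / beta): alpha scales the magnitude,
    # beta controls how fast density decays away from the zero-level set.
    s = -sdf_vals
    return alpha * torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
```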
Q1: How does high/low beta bias your learned SDF?
High β creates a wider, smoother density distribution around the surface, making the SDF less sharp. Low β creates a narrower, sharper density distribution, biasing the SDF toward more precise surface localization.
Q2: Would an SDF be easier to train with low or high beta? Why?
Higher β is easier to train because the wider density distribution provides stronger gradients over a larger region of space, giving the optimization more signal. Low β can lead to vanishing gradients far from the surface.
Q3: Would you learn a more accurate surface with high or low beta? Why?
A more accurate surface is learned with lower beta: concentrating the density in a narrower band around the zero-level set of the SDF localizes the surface more precisely.
Hyperparameters Used:
Comparison of NeRF vs VolSDF performance with limited training views.
NeRF (10 views)
VolSDF (10 views)
NeRF (20 views)
VolSDF (20 views)
Analysis:
VolSDF produces a better overall global surface with fewer artifacts than NeRF. In the NeRF 20-view results, artifacts appear that do not match the actual Lego structure. NeRF captures details very clearly in regions covered by training cameras, but looks blurry where training data is missing. VolSDF gets the general features right globally, though details are slightly more blurred. Overall, VolSDF maintains smoother global surfaces with fewer artifacts, while NeRF excels at capturing fine details near its input views.