A. Neural Volume Rendering (80 points)¶
0. Transmittance Calculation (10 points)¶
1.4. Point sampling (5 points)¶
1.5. Volume rendering (20 points)¶
| Color Visualization | Depth Visualization |
|---|---|
2. Optimizing a basic implicit volume¶
2.1. Random ray sampling (5 points)¶
```python
import torch

# Random subsampling of pixels from an image
def get_random_pixels_from_image(n_pixels, image_size, camera):
    xy_grid = get_pixels_from_image(image_size, camera)
    # TODO (Q2.1): Random subsampling of pixel coordinates
    xy_grid_sub = xy_grid.reshape(-1, 2)
    # Shuffle all pixel coordinates, then keep the first n_pixels
    perm = torch.randperm(xy_grid_sub.shape[0])
    return xy_grid_sub[perm[:n_pixels]]
```
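For training (Q2.2), each iteration can then sample a small batch of pixels, e.g. `xy = get_random_pixels_from_image(1024, image_size, camera)`, and cast rays only through those pixels; the batch size of 1024 here is illustrative, not a value fixed by the assignment.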
2.2. Loss and training (5 points) & 2.3 Visualization¶
3. Optimizing a Neural Radiance Field (NeRF) (20 points)¶
| Epochs=50 | Epochs=100 | Epochs=150 | Epochs=200 |
|---|---|---|---|
4. NeRF Extras (CHOOSE ONE! More than one is extra credit)¶
4.1 View Dependence (10 points)¶
LOW RESOLUTION
| Without view dependence (50 epochs) | Without view dependence (100 epochs) | Without view dependence (150 epochs) | Without view dependence (200 epochs) |
|---|---|---|---|
| With view dependence (50 epochs) | With view dependence (100 epochs) | With view dependence (150 epochs) | With view dependence (200 epochs) |
HIGH RESOLUTION
| Without view dependence (50 epochs) | Without view dependence (100 epochs) | Without view dependence (150 epochs) | Without view dependence (200 epochs) |
|---|---|---|---|
| With view dependence (50 epochs) | With view dependence (100 epochs) | With view dependence (150 epochs) | With view dependence (200 epochs) |
B. Neural Surface Rendering (50 points)¶
5. Sphere Tracing (10 points)¶
Sphere Tracing Algorithm:
Sphere tracing is an iterative ray marching algorithm used to find ray-surface intersections for implicit surfaces defined by Signed Distance Functions (SDFs). The key insight is that the SDF value at any point gives us a safe step size - we can move along the ray by exactly the SDF distance without overshooting the surface.
Algorithm Steps:
- Initialize: Start each ray at its origin point
- Iterate: For each active ray:
- Query the SDF at the current point
- If |SDF| < ε (threshold), mark as hit and deactivate
- If t > far distance, mark as miss and deactivate
- Otherwise, step forward by SDF distance: t += SDF
- Update point position: point = origin + t × direction
- Terminate: When all rays are inactive or max iterations reached
Mathematical Foundation: The SDF property guarantees that for any point x, the distance to the nearest surface is exactly |f(x)|. This means we can safely step along the ray by this distance without crossing the surface, ensuring convergence to the intersection point.
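Below is a minimal batched sketch of these steps; the `implicit_fn` interface, the (N, 3) ray layout, and the default values of `eps`, `far`, and `max_iters` are assumptions rather than the exact starter-code API.

```python
import torch

def sphere_trace(implicit_fn, origins, directions, far=10.0, eps=1e-5, max_iters=64):
    """Batched sphere tracing sketch: implicit_fn maps (N, 3) points to (N, 1) SDF values."""
    n = origins.shape[0]
    t = torch.zeros(n, 1, device=origins.device)            # distance along each ray
    active = torch.ones(n, dtype=torch.bool, device=origins.device)
    hits = torch.zeros_like(active)
    points = origins.clone()                                 # start each ray at its origin
    for _ in range(max_iters):
        if not active.any():                                 # all rays terminated
            break
        sdf = implicit_fn(points)                            # query the SDF at current points
        hit = active & (sdf.abs().squeeze(-1) < eps)         # converged to the surface
        miss = active & (t.squeeze(-1) > far)                # marched past the far plane
        hits |= hit
        active &= ~(hit | miss)                              # deactivate finished rays
        step = torch.where(active.unsqueeze(-1), sdf, torch.zeros_like(sdf))
        t = t + step                                         # safe step of size |SDF|
        points = origins + t * directions
    return points, hits
```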
6. Optimizing a Neural SDF (15 points)¶
| Input PointCloud | Epoch = 5000 | Epoch = 10000 | Epoch = 15000 | Epoch = 20000 | Epoch = 25000 |
|---|---|---|---|---|---|
MLP Architecture:
The neural SDF uses a 6-layer MLP with 128 hidden neurons per layer. Input 3D coordinates are encoded using harmonic embedding (4 frequencies, 27D output). The network outputs a single distance value through a final linear layer. No skip connections are used in the distance prediction branch.
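A sketch of a network matching that description follows; the class name and constructor arguments are illustrative, and the harmonic embedding (3 + 3 × 2 × 4 = 27 dimensions, counting the raw coordinates) is assumed to be applied before the forward pass.

```python
import torch.nn as nn

# Sketch of the distance branch described above: harmonic-embedded 3D points
# (27-D) through a 6-layer, 128-unit MLP to a single signed distance.
class NeuralSDF(nn.Module):
    def __init__(self, embed_dim=27, hidden_dim=128, n_layers=6):
        super().__init__()
        layers, in_dim = [], embed_dim
        for _ in range(n_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        self.mlp = nn.Sequential(*layers)
        self.distance_head = nn.Linear(hidden_dim, 1)  # final linear layer, no skip connections

    def forward(self, embedded_points):
        return self.distance_head(self.mlp(embedded_points))
```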
Eikonal Loss:
Implements the constraint $\|\nabla f(x)\| = 1$ using `autograd.grad()` to compute gradients and penalizing deviations from unit magnitude: $\mathbb{E}\big[(\|\nabla f(x)\| - 1)^2\big]$. The loss weight is set to 0.02, balanced with the main reconstruction loss.
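A minimal sketch of that penalty, assuming a `model` that maps (N, 3) points to (N, 1) SDF values:

```python
import torch

# Eikonal penalty: gradients of the SDF should have unit norm everywhere.
def eikonal_loss(model, points, weight=0.02):
    points = points.detach().requires_grad_(True)
    sdf = model(points)
    (grad,) = torch.autograd.grad(
        outputs=sdf,
        inputs=points,
        grad_outputs=torch.ones_like(sdf),
        create_graph=True,  # keep the graph so the penalty can be backpropagated
    )
    return weight * ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```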
7. VolSDF (15 points)¶
| Alpha | Beta | Geometry | Color |
|---|---|---|---|
| 10.0 (default) | 0.05 (default) | | |
| 1.0 | 0.05 | | |
| 100.0 | 0.05 | | |
| 10.0 | 0.1 | | |
| 10.0 | 0.5 | | |
Intuitive Explanation of Alpha and Beta Parameters:
In VolSDF, the SDF-to-density conversion relies on two critical parameters:
- Alpha (α): Acts as a scaling factor for density intensity. Think of it as controlling how "solid" the surface appears - higher values make surfaces more opaque and prominent.
- Beta (β): Determines the transition width around the surface. It's like controlling how "fuzzy" or "sharp" the boundary appears - smaller values create crisp edges, larger values create soft, gradual transitions.
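Concretely, the VolSDF paper defines the conversion as the scaled CDF of a zero-mean Laplace distribution, with signed distance $d(\mathbf{x})$ taken positive outside the surface:

$$\sigma(\mathbf{x}) = \alpha\,\Psi_\beta\big(-d(\mathbf{x})\big), \qquad \Psi_\beta(s) = \begin{cases} \frac{1}{2}\exp\!\left(\frac{s}{\beta}\right) & s \le 0 \\ 1 - \frac{1}{2}\exp\!\left(-\frac{s}{\beta}\right) & s > 0 \end{cases}$$

so the density saturates at $\alpha$ deep inside the surface and decays over a width set by $\beta$ around the zero level-set.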
Answers to the Three Questions:
1. How does high beta bias your learned SDF? What about low beta?
High beta produces a "blurry" density field that:
- Creates soft, rounded surface boundaries
- Results in a more volumetric, cloud-like appearance
- Makes the surface less precisely defined
Low beta creates a "sharp" density field that:
- Forces the network to learn exact surface locations
- Produces crisp, well-defined boundaries
- Encourages binary inside/outside classification
2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?
Training is generally easier with higher beta values because:
- The smooth density gradients provide consistent learning signals across the entire volume
- The loss function becomes more well-behaved with fewer sharp discontinuities
- Gradient-based optimization works more effectively with smoother landscapes
Low beta makes training challenging because:
- The density function becomes nearly flat everywhere except at the surface
- This creates the classic vanishing gradient problem
- The optimization can get stuck in poor local minima due to the rugged loss landscape
3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?
Lower beta (but not extremely low) typically yields more accurate surfaces because:
- It forces the network to be precise about surface location
- The sharp density transition provides stronger supervision signals
- Fine geometric details are better preserved in the final result
However, there's a sweet spot - too low and training becomes unstable, too high and surfaces become overly smooth.
Hyperparameter Choices and Results:
Best result: α=10.0 and β=0.05
This configuration works well because:
- α=10.0: Gives good visibility without making the surface too dense or transparent
- β=0.05: Provides a good balance - sharp enough for detail preservation but not so sharp that training becomes unstable
- The combination allows for stable convergence while maintaining surface fidelity
8. Neural Surface Extras (CHOOSE ONE! More than one is extra credit)¶
8.1. Render a Large Scene with Sphere Tracing (10 points)¶
Complex Scene Implementation:
I created a complex scene with 20 primitives using sphere tracing:
- 8 spheres positioned at various locations around the scene
- 6 boxes arranged in different orientations and positions
- 6 tori distributed throughout the 3D space
Technical Details:
- Implemented a `ComplexSceneSDF` class that combines multiple primitives using SDF union operations
- Used `torch.minimum()` to compute the union of all primitive distances (minimum distance to any surface)
- Scene spans a larger volume (far = 8.0) to accommodate all primitives
- Rainbow coloring shows the spatial distribution of the different objects
- Sphere tracing efficiently handles the complex geometry with a cap of 64 iterations
The scene demonstrates the power of sphere tracing for rendering complex scenes with multiple objects efficiently, as it only needs to evaluate the SDF at each step rather than performing expensive ray-primitive intersection tests.
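An illustrative sketch of such a scene SDF is below; the primitive SDF formulas and the class interface are assumptions, while the `torch.minimum()` union matches the description above.

```python
import torch

# Standard primitive SDFs (spheres and boxes shown; tori are analogous).
def sphere_sdf(points, center, radius):
    return (points - center).norm(dim=-1, keepdim=True) - radius

def box_sdf(points, center, half_extents):
    q = (points - center).abs() - half_extents
    return q.clamp(min=0.0).norm(dim=-1, keepdim=True) + \
        q.max(dim=-1, keepdim=True).values.clamp(max=0.0)

class ComplexSceneSDF(torch.nn.Module):
    def __init__(self, primitives):
        super().__init__()
        self.primitives = primitives  # list of callables: (N, 3) -> (N, 1)

    def forward(self, points):
        # Union of SDFs: the minimum distance to any surface in the scene
        distances = [p(points) for p in self.primitives]
        union = distances[0]
        for d in distances[1:]:
            union = torch.minimum(union, d)
        return union

# Example: union of two spheres (illustrative placement)
scene = ComplexSceneSDF([
    lambda p: sphere_sdf(p, torch.tensor([0.0, 0.0, 0.0]), 1.0),
    lambda p: sphere_sdf(p, torch.tensor([1.5, 0.0, 0.0]), 0.5),
])
```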
8.2 Fewer Training Views (10 points)¶
| Training Views | Vol SDF (Geometric) | Vol SDF (Color) | NeRF |
|---|---|---|---|
| 10 | | | |
| 15 | | | |
| 50 | | | |
| All views | | | |
Comparing VolSDF and NeRF with Fewer Training Views¶
When the number of training views is small, both NeRF and VolSDF face challenges in reconstructing accurate geometry and rendering realistic novel views. However, their behaviors differ due to their underlying representations:
- Representation Difference
NeRF models scene radiance and density directly using a neural network, learning implicit volumetric features without an explicit surface constraint.
VolSDF, on the other hand, represents geometry using a signed distance function (SDF) and models volume rendering through the SDF field, encouraging the network to learn a consistent surface even from limited views.
- Performance under Sparse Views
NeRF tends to overfit visible regions and hallucinate geometry in unseen areas because it lacks a surface prior. The images above support this observation: rendering quality degrades severely as the number of training views shrinks.
VolSDF generally maintains better geometric consistency and reduces floaters or artifacts due to the SDF regularization and the smooth surface prior.
8.3 Alternate SDF to Density Conversions (10 points)¶
Default parameters for SDF → density conversions¶
- VolSDF: `sdf_to_density(signed_distance, alpha=10.0, beta=0.05)`
- NeuS: `sdf_to_density_neus(signed_distance, s=15.0)`
- Step: `sdf_to_density_step(signed_distance, threshold=0.0, inside_density=1.0, outside_density=0.0)`
- Sigmoid: `sdf_to_density_sigmoid(signed_distance, scale=10.0, offset=0.0)`
- Linear: `sdf_to_density_linear(signed_distance, max_distance=1.0)`
- Exponential Falloff: `sdf_to_density_exponential_falloff(signed_distance, decay_rate=1.0)`
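As a reference for the comparison below, a minimal sketch of the NeuS-style conversion, whose density is the logistic distribution (the derivative of the sigmoid) sharpened by `s`:

```python
import torch

# NeuS-style SDF -> density: the logistic density, peaked at the zero
# level-set and growing sharper as s increases.
def sdf_to_density_neus(signed_distance, s=15.0):
    sig = torch.sigmoid(s * signed_distance)
    return s * sig * (1.0 - sig)
```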
| Type | VolSDF | NeuS | Step | Sigmoid | Linear | Exponential Falloff |
|---|---|---|---|---|---|---|
| Geometry | | | | | | |
| Color | | | | | | |
Comparison: NeuS (sigmoid-based) vs VolSDF¶
| Type | s=15 | s=30 | s=50 | s=100 |
|---|---|---|---|---|
| Geometry | | | | |
| Color | | | | |
Key Differences¶
Characteristics:
- VolSDF: Asymmetric inside/outside behavior.
- NeuS: Symmetric around the zero level-set.
Results:
- VolSDF produced smoother, more stable results.
- NeuS created sharper surfaces with high $s$ values but was less stable.
- NeuS was highly sensitive to the choice of $s$.
Trade-offs:
- Higher $s$ in NeuS → Sharper surfaces but increased training instability.
- VolSDF offered better control through separate $\alpha$ and $\beta$ parameters.
Conclusion¶
The NeuS approach is mathematically elegant (its density is the derivative of the sigmoid function) but requires careful parameter tuning, while VolSDF proved more robust in practice.