A. Neural Volume Rendering (80 points)¶
0. Transmittance Calculation (10 points)¶
1.4. Point sampling (5 points)¶
1.5. Volume rendering (20 points)¶
| Color Visualization | Depth Visualization |
|---|---|
2. Optimizing a basic implicit volume¶
2.1. Random ray sampling (5 points)¶
```python
import torch

# Random subsampling of pixels from an image
def get_random_pixels_from_image(n_pixels, image_size, camera):
    xy_grid = get_pixels_from_image(image_size, camera)
    # TODO (Q2.1): Random subsampling of pixel coordinates
    xy_grid_sub = xy_grid.reshape(-1, 2)
    # Shuffle all pixel coordinates, then keep the first n_pixels
    perm = torch.randperm(xy_grid_sub.shape[0])
    return xy_grid_sub[perm[:n_pixels]]
```
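For training (Q2.2), each iteration can then sample a small batch of pixels, e.g. `xy = get_random_pixels_from_image(1024, image_size, camera)`, and cast rays only through those pixels; the batch size of 1024 here is illustrative, not a value fixed by the assignment.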
2.2. Loss and training (5 points) & 2.3 Visualization¶
3. Optimizing a Neural Radiance Field (NeRF) (20 points)¶
| Epochs=50 | Epochs=100 | Epochs=150 | Epochs=200 |
|---|---|---|---|
4. NeRF Extras (CHOOSE ONE! More than one is extra credit)¶
4.1 View Dependence (10 points)¶
LOW RESOLUTION
| Without view dependence (50 epochs) | Without view dependence (100 epochs) | Without view dependence (150 epochs) | Without view dependence (200 epochs) |
|---|---|---|---|
| With view dependence (50 epochs) | With view dependence (100 epochs) | With view dependence (150 epochs) | With view dependence (200 epochs) |
HIGH RESOLUTION
| Without view dependence (50 epochs) | Without view dependence (100 epochs) | Without view dependence (150 epochs) | Without view dependence (200 epochs) |
|---|---|---|---|
| With view dependence (50 epochs) | With view dependence (100 epochs) | With view dependence (150 epochs) | With view dependence (200 epochs) |
B. Neural Surface Rendering (50 points)¶
5. Sphere Tracing (10 points)¶
Sphere Tracing Algorithm:
Sphere tracing is an iterative ray marching algorithm used to find ray-surface intersections for implicit surfaces defined by Signed Distance Functions (SDFs). The key insight is that the SDF value at any point gives us a safe step size - we can move along the ray by exactly the SDF distance without overshooting the surface.
Algorithm Steps:
- Initialize: Start each ray at its origin point
- Iterate: For each active ray:
- Query the SDF at the current point
- If |SDF| < ε (threshold), mark as hit and deactivate
- If t > far distance, mark as miss and deactivate
- Otherwise, step forward by SDF distance: t += SDF
- Update point position: point = origin + t × direction
- Terminate: When all rays are inactive or max iterations reached
Mathematical Foundation: The SDF property guarantees that for any point x, the distance to the nearest surface is exactly |f(x)|. This means we can safely step along the ray by this distance without crossing the surface, ensuring convergence to the intersection point.
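Below is a minimal batched sketch of these steps; the `implicit_fn` interface, the (N, 3) ray layout, and the default values of `eps`, `far`, and `max_iters` are assumptions rather than the exact starter-code API.

```python
import torch

def sphere_trace(implicit_fn, origins, directions, far=10.0, eps=1e-5, max_iters=64):
    """Batched sphere tracing sketch: implicit_fn maps (N, 3) points to (N, 1) SDF values."""
    n = origins.shape[0]
    t = torch.zeros(n, 1, device=origins.device)            # distance along each ray
    active = torch.ones(n, dtype=torch.bool, device=origins.device)
    hits = torch.zeros_like(active)
    points = origins.clone()                                 # start each ray at its origin
    for _ in range(max_iters):
        if not active.any():                                 # all rays terminated
            break
        sdf = implicit_fn(points)                            # query the SDF at current points
        hit = active & (sdf.abs().squeeze(-1) < eps)         # converged to the surface
        miss = active & (t.squeeze(-1) > far)                # marched past the far plane
        hits |= hit
        active &= ~(hit | miss)                              # deactivate finished rays
        step = torch.where(active.unsqueeze(-1), sdf, torch.zeros_like(sdf))
        t = t + step                                         # safe step of size |SDF|
        points = origins + t * directions
    return points, hits
```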
6. Optimizing a Neural SDF (15 points)¶
| Input PointCloud | Epoch = 5000 | Epoch = 10000 | Epoch = 15000 | Epoch = 20000 | Epoch = 25000 |
|---|---|---|---|---|---|
MLP Architecture:
The neural SDF uses a 6-layer MLP with 128 hidden neurons per layer. Input 3D coordinates are encoded using harmonic embedding (4 frequencies, 27D output). The network outputs a single distance value through a final linear layer. No skip connections are used in the distance prediction branch.
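A sketch of a network matching that description follows; the class name and constructor arguments are illustrative, and the harmonic embedding (3 + 3 × 2 × 4 = 27 dimensions, counting the raw coordinates) is assumed to be applied before the forward pass.

```python
import torch.nn as nn

# Sketch of the distance branch described above: harmonic-embedded 3D points
# (27-D) through a 6-layer, 128-unit MLP to a single signed distance.
class NeuralSDF(nn.Module):
    def __init__(self, embed_dim=27, hidden_dim=128, n_layers=6):
        super().__init__()
        layers, in_dim = [], embed_dim
        for _ in range(n_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        self.mlp = nn.Sequential(*layers)
        self.distance_head = nn.Linear(hidden_dim, 1)  # final linear layer, no skip connections

    def forward(self, embedded_points):
        return self.distance_head(self.mlp(embedded_points))
```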
Eikonal Loss:
Implements the constraint $\|\nabla f(x)\| = 1$ using `autograd.grad()` to compute gradients and penalizing deviations from unit magnitude: $\mathbb{E}\big[(\|\nabla f(x)\| - 1)^2\big]$. The loss weight is set to 0.02, balanced with the main reconstruction loss.
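A minimal sketch of that penalty, assuming a `model` that maps (N, 3) points to (N, 1) SDF values:

```python
import torch

# Eikonal penalty: gradients of the SDF should have unit norm everywhere.
def eikonal_loss(model, points, weight=0.02):
    points = points.detach().requires_grad_(True)
    sdf = model(points)
    (grad,) = torch.autograd.grad(
        outputs=sdf,
        inputs=points,
        grad_outputs=torch.ones_like(sdf),
        create_graph=True,  # keep the graph so the penalty can be backpropagated
    )
    return weight * ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```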
7. VolSDF (15 points)¶
| Alpha | Beta | Geometry | Color |
|---|---|---|---|
| 10.0 (default) | 0.05 (default) | | |
| 1.0 | 0.05 | | |
| 100.0 | 0.05 | | |
| 10.0 | 0.1 | | |
| 10.0 | 0.5 | | |
Intuitive Explanation of Alpha and Beta Parameters:
In VolSDF, the SDF-to-density conversion relies on two critical parameters:
- Alpha (α): Acts as a scaling factor for density intensity. Think of it as controlling how "solid" the surface appears - higher values make surfaces more opaque and prominent.
- Beta (β): Determines the transition width around the surface. It's like controlling how "fuzzy" or "sharp" the boundary appears - smaller values create crisp edges, larger values create soft, gradual transitions.
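Concretely, the VolSDF paper defines the conversion as the scaled CDF of a zero-mean Laplace distribution, with signed distance $d(\mathbf{x})$ taken positive outside the surface:

$$\sigma(\mathbf{x}) = \alpha\,\Psi_\beta\big(-d(\mathbf{x})\big), \qquad \Psi_\beta(s) = \begin{cases} \frac{1}{2}\exp\!\left(\frac{s}{\beta}\right) & s \le 0 \\ 1 - \frac{1}{2}\exp\!\left(-\frac{s}{\beta}\right) & s > 0 \end{cases}$$

so the density saturates at $\alpha$ deep inside the surface and decays over a width set by $\beta$ around the zero level-set.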
Answers to the Three Questions:
1. How does high beta bias your learned SDF? What about low beta?
High beta produces a "blurry" density field that:
- Creates soft, rounded surface boundaries
- Results in a more volumetric, cloud-like appearance
- Makes the surface less precisely defined
Low beta creates a "sharp" density field that:
- Forces the network to learn exact surface locations
- Produces crisp, well-defined boundaries
- Encourages binary inside/outside classification
2. Would an SDF be easier to train with volume rendering and low beta or high beta? Why?
Training is generally easier with higher beta values because:
- The smooth density gradients provide consistent learning signals across the entire volume
- The loss function becomes more well-behaved with fewer sharp discontinuities
- Gradient-based optimization works more effectively with smoother landscapes
Low beta makes training challenging because:
- The density function becomes nearly flat everywhere except at the surface
- This creates the classic vanishing gradient problem
- The optimization can get stuck in poor local minima due to the rugged loss landscape
3. Would you be more likely to learn an accurate surface with high beta or low beta? Why?
Lower beta (but not extremely low) typically yields more accurate surfaces because:
- It forces the network to be precise about surface location
- The sharp density transition provides stronger supervision signals
- Fine geometric details are better preserved in the final result
However, there's a sweet spot - too low and training becomes unstable, too high and surfaces become overly smooth.
Hyperparameter Choices and Results:
Best result: α=10.0 and β=0.05
This configuration works well because:
- α=10.0: Gives good visibility without making the surface too dense or transparent
- β=0.05: Provides a good balance - sharp enough for detail preservation but not so sharp that training becomes unstable
- The combination allows for stable convergence while maintaining surface fidelity
8. Neural Surface Extras (CHOOSE ONE! More than one is extra credit)¶
8.1. Render a Large Scene with Sphere Tracing (10 points)¶
Complex Scene Implementation:
I created a complex scene with 20 primitives using sphere tracing:
- 8 spheres positioned at various locations around the scene
- 6 boxes arranged in different orientations and positions
- 6 tori distributed throughout the 3D space
Technical Details:
- Implemented a `ComplexSceneSDF` class that combines multiple primitives using SDF union operations
- Used `torch.minimum()` to compute the union of all primitive distances (minimum distance to any surface)
- Scene spans a larger volume (far = 8.0) to accommodate all primitives
- Rainbow coloring shows the spatial distribution of the different objects
- Sphere tracing efficiently handles the complex geometry with a cap of 64 iterations
The scene demonstrates the power of sphere tracing for rendering complex scenes with multiple objects efficiently, as it only needs to evaluate the SDF at each step rather than performing expensive ray-primitive intersection tests.
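An illustrative sketch of such a scene SDF is below; the primitive SDF formulas and the class interface are assumptions, while the `torch.minimum()` union matches the description above.

```python
import torch

# Standard primitive SDFs (spheres and boxes shown; tori are analogous).
def sphere_sdf(points, center, radius):
    return (points - center).norm(dim=-1, keepdim=True) - radius

def box_sdf(points, center, half_extents):
    q = (points - center).abs() - half_extents
    return q.clamp(min=0.0).norm(dim=-1, keepdim=True) + \
        q.max(dim=-1, keepdim=True).values.clamp(max=0.0)

class ComplexSceneSDF(torch.nn.Module):
    def __init__(self, primitives):
        super().__init__()
        self.primitives = primitives  # list of callables: (N, 3) -> (N, 1)

    def forward(self, points):
        # Union of SDFs: the minimum distance to any surface in the scene
        distances = [p(points) for p in self.primitives]
        union = distances[0]
        for d in distances[1:]:
            union = torch.minimum(union, d)
        return union

# Example: union of two spheres (illustrative placement)
scene = ComplexSceneSDF([
    lambda p: sphere_sdf(p, torch.tensor([0.0, 0.0, 0.0]), 1.0),
    lambda p: sphere_sdf(p, torch.tensor([1.5, 0.0, 0.0]), 0.5),
])
```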
8.2 Fewer Training Views (10 points)¶
| Training Views | Vol SDF (Geometric) | Vol SDF (Color) | NeRF |
|---|---|---|---|
| 10 | | | |
| 15 | | | |
| 50 | | | |
| All views | | | |
Comparing VolSDF and NeRF with Fewer Training Views¶
When the number of training views is small, both NeRF and VolSDF face challenges in reconstructing accurate geometry and rendering realistic novel views. However, their behaviors differ due to their underlying representations:
- Representation Difference
NeRF models scene radiance and density directly using a neural network, learning implicit volumetric features without an explicit surface constraint.
VolSDF, on the other hand, represents geometry using a signed distance function (SDF) and models volume rendering through the SDF field, encouraging the network to learn a consistent surface even from limited views.
- Performance under Sparse Views
NeRF tends to overfit visible regions and hallucinate geometry in unseen areas because it lacks a surface prior. The images above support this observation: rendering quality degrades severely as the number of training views shrinks.
VolSDF generally maintains better geometric consistency and reduces floaters or artifacts due to the SDF regularization and the smooth surface prior.
8.3 Alternate SDF to Density Conversions (10 points)¶
Default parameters for SDF → density conversions¶
- VolSDF: `sdf_to_density(signed_distance, alpha=10.0, beta=0.05)`
- NeuS: `sdf_to_density_neus(signed_distance, s=15.0)`
- Step: `sdf_to_density_step(signed_distance, threshold=0.0, inside_density=1.0, outside_density=0.0)`
- Sigmoid: `sdf_to_density_sigmoid(signed_distance, scale=10.0, offset=0.0)`
- Linear: `sdf_to_density_linear(signed_distance, max_distance=1.0)`
- Exponential Falloff: `sdf_to_density_exponential_falloff(signed_distance, decay_rate=1.0)`
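As a reference for the comparison below, a minimal sketch of the NeuS-style conversion, whose density is the logistic distribution (the derivative of the sigmoid) sharpened by `s`:

```python
import torch

# NeuS-style SDF -> density: the logistic density, peaked at the zero
# level-set and growing sharper as s increases.
def sdf_to_density_neus(signed_distance, s=15.0):
    sig = torch.sigmoid(s * signed_distance)
    return s * sig * (1.0 - sig)
```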
| Type | VolSDF | NeuS | Step | Sigmoid | Linear | Exponential Falloff |
|---|---|---|---|---|---|---|
| Geometry | | | | | | |
| Color | | | | | | |
Comparison: NeuS (sigmoid-based) vs VolSDF¶
| Type | s=15 | s=30 | s=50 | s=100 |
|---|---|---|---|---|
| Geometry | | | | |
| Color | | | | |
Key Differences¶
Characteristics:
- VolSDF: Asymmetric inside/outside behavior.
- NeuS: Symmetric around the zero level-set.
Results:
- VolSDF produced smoother, more stable results.
- NeuS created sharper surfaces with high $s$ values but was less stable.
- NeuS was highly sensitive to the choice of $s$.
Trade-offs:
- Higher $s$ in NeuS → Sharper surfaces but increased training instability.
- VolSDF offered better control through separate $\alpha$ and $\beta$ parameters.
Conclusion¶
The NeuS approach is mathematically elegant (its density is the derivative of the sigmoid function) but requires careful parameter tuning, while VolSDF proved more robust in practice.