Left: grid. Right: rays.
Box center: (0.25, 0.25, -0.00)
Box side lengths: (2.00, 1.50, 1.50)
python volume_rendering_main.py --config-name=nerf_materials
We extend NeRF to include view-dependent appearance. The network first predicts density from shared layers that see only spatial coordinates, then concatenates the view direction with intermediate features in the final layers to produce color. The view direction should influence only a few late layers, enough to model subtle effects like specular highlights without changing geometry. If color relied too heavily on view direction at the expense of spatial features, the network might still reproduce the training views but fail to generalize to nearby viewpoints. Introducing this structure carefully therefore improves both training stability and generalization.
Left: view-dependent; Right: view-independent
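For concreteness, below is a minimal sketch of this kind of two-branch head. The class name, layer sizes, and feature dimensions are illustrative assumptions, not the exact architecture used in this implementation.

import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    # Hypothetical sketch: density comes from position-only features,
    # while the view direction enters only the small color branch.
    def __init__(self, feat_dim=256, dir_dim=3, hidden_dim=128):
        super().__init__()
        # Density depends only on spatial features, so geometry cannot
        # be explained away by view-dependent effects.
        self.density_head = nn.Linear(feat_dim, 1)
        # View direction is injected here, after the shared trunk.
        self.color_head = nn.Sequential(
            nn.Linear(feat_dim + dir_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),
            nn.Sigmoid(),
        )

    def forward(self, features, directions):
        density = torch.relu(self.density_head(features))               # (N, 1)
        color = self.color_head(torch.cat([features, directions], -1))  # (N, 3)
        return density, color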
def sphere_tracing(
    self,
    implicit_fn,
    origins,     # Nx3 ray origins
    directions,  # Nx3 unit ray directions
):
    num_rays = origins.shape[0]
    device = origins.device

    # Current point on each ray and the distance marched so far.
    points = origins.clone()
    distances = torch.full((num_rays, 1), self.near, device=device, dtype=torch.float32)
    # Marks rays that have converged onto the surface.
    mask = torch.zeros((num_rays, 1), dtype=torch.bool, device=device)

    for _ in range(self.max_iters):
        sdf = implicit_fn(points).squeeze(1)  # (N,) signed distance at each point
        # A ray hits the surface once its SDF value is (near) zero.
        surface_hit = sdf.abs() < 1e-4
        mask[:, 0] |= surface_hit
        # Keep marching rays that are still outside the surface and
        # within the far bound.
        continue_tracing = (sdf > 1e-4) & (distances.squeeze(1) < self.far)
        if not continue_tracing.any():
            break
        # The SDF value is a safe step size: it is the distance to the
        # nearest surface, so stepping by it cannot overshoot.
        points[continue_tracing] += directions[continue_tracing] * sdf[continue_tracing, None]
        distances[continue_tracing] += sdf[continue_tracing, None]

    return points, mask
Starting from the ray origins, the function repeatedly moves each point along its ray direction by the current SDF value, which (for a valid SDF) is the distance to the nearest surface and therefore a safe step size. During each iteration, it checks whether a point is close enough to the surface (SDF ≈ 0) to count as a hit and updates a mask marking successful intersections. The process continues until every ray has either hit the surface or exceeded the maximum tracing distance. The function returns the estimated intersection points and a boolean mask indicating which rays hit the surface.
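As a quick sanity check, the tracer can be run against an analytic unit-sphere SDF. Here renderer is a hypothetical instance of the class that owns sphere_tracing, with near, far, and max_iters already set; the specific rays are arbitrary.

import torch

# Unit sphere at the origin; returns (N, 1) as implicit_fn expects.
sphere_sdf = lambda p: p.norm(dim=-1, keepdim=True) - 1.0

origins = torch.tensor([[0.0, 0.0, -3.0]]).repeat(3, 1)
directions = torch.tensor([
    [0.0, 0.0, 1.0],  # straight through the center -> hit
    [0.1, 0.0, 1.0],  # slightly off-axis -> still a hit
    [0.0, 1.0, 0.0],  # points away from the sphere -> miss
])
directions = directions / directions.norm(dim=-1, keepdim=True)

points, mask = renderer.sphere_tracing(sphere_sdf, origins, directions)
print(mask.squeeze(1))  # expected: tensor([True, True, False])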
The rendered result.
The rendered result of the optimized neural SDF.
I use alpha: 10.0 and beta: 0.01, as this was the best combination from my experiments. The rendered result is attached below.
In the VolSDF formulation, α determines the overall magnitude of the predicted density, while β controls how abruptly the density changes around the surface boundary. When β approaches zero, the model essentially produces a step-like behavior — densities are close to α inside the object and almost zero outside, resulting in a sharp, well-defined surface.
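Concretely, VolSDF defines σ(x) = α·Ψ_β(−d(x)), where Ψ_β is the CDF of a zero-mean Laplace distribution with scale β. Below is a minimal sketch of this mapping, assuming the convention that the SDF is positive outside the surface:

import torch

def sdf_to_density_volsdf(sdf, alpha, beta):
    # Laplace CDF evaluated at the negated SDF: inside the surface
    # (sdf < 0) the density approaches alpha; outside it decays to 0
    # over a band whose width is controlled by beta.
    s = -sdf
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi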
When β is large, the SDF-to-density mapping becomes smoother. This tends to produce softer boundaries and thicker geometry since the density decays gradually rather than sharply.
When β is small, the transition becomes much steeper, which allows the network to represent thin, crisp surfaces. However, it may also introduce instability or sensitivity to small errors due to the sharp gradient near the boundary.
From an optimization perspective, using a smaller β generally helps the model learn the zero-level surface more effectively, since the gradient signal is concentrated around a narrow region close to the true surface.
Conversely, a larger β leads to smoother gradients but spreads the supervision over a wider area, making convergence slower and potentially leading to less precise surfaces.
Sharper (low-β) transitions typically correspond to more accurate surface reconstruction, as the model can better localize the boundary.
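A quick numerical check, reusing the sdf_to_density_volsdf sketch above with α = 10.0, makes this concrete:

import torch

d = torch.tensor([-0.10, -0.01, 0.0, 0.01, 0.10])  # signed distances around the surface
for beta in (0.5, 0.05, 0.01):
    print(beta, sdf_to_density_volsdf(d, alpha=10.0, beta=beta))

With β = 0.5 the density changes only gradually across this ±0.1 band, while with β = 0.01 it is already ≈ α on the inside and ≈ 0 on the outside, i.e. a near step function at the surface.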
Here we explore two different strategies for mapping SDF values to volume densities — one used in VolSDF and the other in NeuS. The two methods behave quite differently: VolSDF produces crisper and more well-defined geometry, while NeuS tends to generate overly smooth results and, in some cases, even fails to produce a valid mesh due to low density contrast. I believe NeuS could perform comparably with a more extensive hyperparameter tuning process.
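For reference, a minimal sketch of the "naive" NeuS-style mapping, which applies the logistic density (the derivative of a scaled sigmoid) to the SDF; the exact variant used in these experiments may differ:

import torch

def sdf_to_density_neus(sdf, s):
    # Logistic density s * e^(-s*x) / (1 + e^(-s*x))^2, written via the
    # sigmoid. Density peaks at s/4 exactly on the zero-level set and
    # decays symmetrically on both sides.
    sig = torch.sigmoid(s * sdf)
    return s * sig * (1.0 - sig)

Because the peak density is capped at s/4, a small s limits the achievable density contrast, which is consistent with the low-contrast failure mode observed above.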