16-825 Assignment 3
Duc Doan
Q0. Transmittance Calculation
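The transmittance computation can be sketched as below (a minimal PyTorch example, not the assignment's starter code; function and variable names are illustrative):

```python
import torch

def transmittance(sigmas, deltas):
    """Transmittance T_i = exp(-sum_{j<i} sigma_j * delta_j) along a ray.

    sigmas: (N,) densities at the sampled points
    deltas: (N,) distances between consecutive samples
    """
    # Optical depth accumulated *before* each sample (exclusive cumsum).
    tau = torch.cumsum(sigmas * deltas, dim=0)
    tau = torch.cat([torch.zeros(1), tau[:-1]])
    return torch.exp(-tau)

# Homogeneous medium: transmittance decays exponentially with distance.
sigmas = torch.full((4,), 0.5)
deltas = torch.full((4,), 1.0)
T = transmittance(sigmas, deltas)
```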


Q1. Differentiable Volume Rendering
1.3. Ray sampling
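Ray sampling can be sketched as one ray per pixel through a pinhole camera (a minimal PyTorch example, not the starter code; the camera is assumed to look down its -z axis and `focal` is in pixels):

```python
import torch

def get_rays(H, W, focal, c2w):
    """Generate one world-space ray per pixel of an H x W image."""
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    dirs = torch.stack([(i - W / 2) / focal,
                        -(j - H / 2) / focal,
                        -torch.ones_like(i)], dim=-1)   # camera-space directions
    rays_d = dirs @ c2w[:3, :3].T                        # rotate into world space
    rays_o = c2w[:3, 3].expand(rays_d.shape)             # all rays share the camera origin
    return rays_o, rays_d

rays_o, rays_d = get_rays(4, 4, focal=2.0, c2w=torch.eye(4))
```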

1.4. Point sampling
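Point sampling along the rays can be sketched as stratified sampling, one jittered sample per depth bin (a minimal PyTorch example under my naming assumptions, not the starter code):

```python
import torch

def sample_points(origins, dirs, near, far, n_samples):
    """Stratified sampling: one uniform sample per evenly spaced bin along each ray."""
    B = origins.shape[0]
    edges = torch.linspace(near, far, n_samples + 1)
    lower, upper = edges[:-1], edges[1:]
    t = lower + (upper - lower) * torch.rand(B, n_samples)   # jitter within each bin
    pts = origins[:, None, :] + t[..., None] * dirs[:, None, :]
    return pts, t

pts, t = sample_points(torch.zeros(2, 3),
                       torch.tensor([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]),
                       near=0.1, far=4.0, n_samples=8)
```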

1.5. Volume rendering
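The rendering step can be sketched with the standard emission-absorption compositing weights (a minimal single-ray PyTorch example, not the starter code):

```python
import torch

def composite(sigmas, deltas, colors):
    """Emission-absorption compositing along one ray.

    sigmas: (N,) densities, deltas: (N,) segment lengths, colors: (N, 3).
    """
    alphas = 1.0 - torch.exp(-sigmas * deltas)                # per-segment opacity
    tau = torch.cumsum(sigmas * deltas, dim=0)
    T = torch.exp(-torch.cat([torch.zeros(1), tau[:-1]]))     # transmittance to each sample
    weights = T * alphas                                      # contribution of each sample
    return (weights[:, None] * colors).sum(dim=0)

# A single dense green sample dominates the rendered color.
rgb = composite(torch.tensor([0.0, 10.0, 0.0]),
                torch.ones(3),
                torch.tensor([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0],
                              [0.0, 0.0, 1.0]]))
```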

Q2. Optimizing a basic implicit volume

Optimized box:
- center: (0.25, 0.25, 0.00)
- side 1 length: 2.00
- side 2 length: 1.50
- side 3 length: 1.50
Q3. Neural Radiance Field

| Materials | Materials highres |
| --- | --- |
|  |  |
Trade-offs between increased view dependence and generalization quality:
- view dependence lets the model reproduce more realistic, direction-dependent lighting effects (e.g. specular highlights)
- generalization means the model renders the scene correctly from unseen viewpoints
- if the model overfits to certain viewpoints, it matches the lighting effects in those images too closely and fails to learn the actual 3D structure and base colors
- without view-dependent input, the model focuses on learning the 3D structure, but its renderings are less realistic
Q5. Sphere tracing

Implementation: the core logic steps each point along its ray direction by the current SDF value. The points are initialized at the near plane. The hit mask is computed by checking whether the SDF value falls below a threshold (1e-5); once a point's mask is True, it is no longer updated. After max_iters, the mask is AND-ed with a second mask that checks whether the points lie within the far plane.
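The steps above can be sketched as follows (a minimal PyTorch version of my description, not the exact submitted code; names are illustrative):

```python
import torch

def sphere_trace(origins, dirs, sdf, near=0.0, far=5.0, max_iters=64, eps=1e-5):
    """March each ray forward by the SDF value until |sdf| < eps.

    origins, dirs: (B, 3); sdf: callable (B, 3) -> (B,).
    """
    t = torch.full((origins.shape[0],), near)        # start at the near plane
    hit = torch.zeros_like(t, dtype=torch.bool)
    for _ in range(max_iters):
        pts = origins + t[:, None] * dirs
        d = sdf(pts)
        hit = hit | (d.abs() < eps)
        t = torch.where(hit, t, t + d)               # converged points stay frozen
    hit = hit & (t <= far)                            # discard rays past the far plane
    return origins + t[:, None] * dirs, hit

# Example: unit sphere at the origin, ray along +z starting at z = -3.
sphere_sdf = lambda p: p.norm(dim=-1) - 1.0
pts, mask = sphere_trace(torch.tensor([[0.0, 0.0, -3.0]]),
                         torch.tensor([[0.0, 0.0, 1.0]]), sphere_sdf)
```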
Q6. Optimizing a Neural SDF
| Input | Output |
| --- | --- |
|  |  |
MLP architecture: very similar to NeRF
- an MLP with no input skip connections that processes the xyz inputs
- a distance head (a Linear layer) that predicts the signed distance, plus a latent vector for further color processing (used in Q7)
Eikonal loss: defined following [1]
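The Eikonal penalty drives the gradient norm of the predicted SDF toward 1 everywhere. A minimal sketch (the lambda stands in for the distance network; any callable mapping points to distances works):

```python
import torch

def eikonal_loss(model, pts):
    """Penalize deviation of ||grad f(x)|| from 1 at the sampled points."""
    pts = pts.detach().requires_grad_(True)
    d = model(pts)
    grad, = torch.autograd.grad(d.sum(), pts, create_graph=True)
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()

# For an exact SDF (distance to the origin), the loss is ~0.
exact_sdf = lambda p: p.norm(dim=-1)
loss = eikonal_loss(exact_sdf, torch.randn(16, 3))
```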

Hyperparameters: default values in points_surface.yaml
Q7. VolSDF
VolSDF trained with default hyperparameters:

Architecture: similar to NeRF
- an additional color MLP is added after the distance head from Q6
- the color head ends with a Sigmoid so the predicted colors lie in (0, 1)
SDF to density using the Cumulative Distribution Function (CDF) of the Laplace distribution:
- The CDF has range (0, 1), so alpha sets the maximum density of the object
- Ideally, if the SDF value is at most 0 we want the density to be alpha, and 0 otherwise. The CDF is used to smooth this step function for easier optimization.
- Higher beta makes the transition smoother, which effectively inflates the learned SDF relative to the actual surface. Lower beta, in contrast, makes the transition sharper, bringing the learned SDF closer to the actual surface.
- The SDF should be easier to train with high beta because of the smoother transition. If the transition is too sharp, the renderer rarely samples points near the surface during the initial training steps, making the optimization unstable.
- Low beta should make the model more likely to learn an accurate surface because there is less "inflation".
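The conversion described above can be sketched as below (a minimal PyTorch version of the VolSDF density, not my exact submitted code; the default alpha/beta here are illustrative):

```python
import torch

def sdf_to_density(sdf, alpha=10.0, beta=0.05):
    """VolSDF density: alpha * LaplaceCDF(-sdf; loc=0, scale=beta).

    sdf <= 0 (inside) -> density approaches alpha; sdf > 0 -> decays to 0.
    Larger beta smooths the transition; smaller beta sharpens it.
    """
    s = -sdf / beta
    cdf = torch.where(sdf <= 0,
                      1.0 - 0.5 * torch.exp(-s),   # inside / on the surface
                      0.5 * torch.exp(s))          # outside
    return alpha * cdf

d = sdf_to_density(torch.tensor([-1.0, 0.0, 1.0]))
```

At the surface (sdf = 0) the density is exactly alpha / 2, and it rises toward alpha inside the object.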
I think my chosen alpha and beta values strike a good balance. With smaller beta, I see many holes in the geometry because of the harder optimization; with larger beta, small details like the wheels at the back or the teeth disappear.
Both VolSDF and NeRF trained with the same 10 views:
| VolSDF | NeRF |
| --- | --- |
|  |  |
The NeRF rendering shows artifacts in a few frames while VolSDF's does not. NeRF likely overfits to this small number of views, making it unable to generalize to unseen ones. VolSDF, as expected, handles the few-view case well because it learns the actual 3D surface rather than relying on view-dependent rendering as NeRF does.