16-825 Assignment 3
Duc Doan
Q0. Transmittance Calculation
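The transmittance computation can be sketched as below (a minimal PyTorch example, not the assignment's starter code; function and variable names are illustrative):

```python
import torch

def transmittance(sigmas, deltas):
    """Transmittance T_i = exp(-sum_{j<i} sigma_j * delta_j) along a ray.

    sigmas: (N,) densities at the sampled points
    deltas: (N,) distances between consecutive samples
    """
    # Optical depth accumulated *before* each sample (exclusive cumsum).
    tau = torch.cumsum(sigmas * deltas, dim=0)
    tau = torch.cat([torch.zeros(1), tau[:-1]])
    return torch.exp(-tau)

# Homogeneous medium: transmittance decays exponentially with distance.
sigmas = torch.full((4,), 0.5)
deltas = torch.full((4,), 1.0)
T = transmittance(sigmas, deltas)
```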


Q1. Differentiable Volume Rendering
1.3. Ray sampling
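Ray sampling can be sketched as one ray per pixel through a pinhole camera (a minimal PyTorch example, not the starter code; the camera is assumed to look down its -z axis and `focal` is in pixels):

```python
import torch

def get_rays(H, W, focal, c2w):
    """Generate one world-space ray per pixel of an H x W image."""
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    dirs = torch.stack([(i - W / 2) / focal,
                        -(j - H / 2) / focal,
                        -torch.ones_like(i)], dim=-1)   # camera-space directions
    rays_d = dirs @ c2w[:3, :3].T                        # rotate into world space
    rays_o = c2w[:3, 3].expand(rays_d.shape)             # all rays share the camera origin
    return rays_o, rays_d

rays_o, rays_d = get_rays(4, 4, focal=2.0, c2w=torch.eye(4))
```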

1.4. Point sampling
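Point sampling along the rays can be sketched as stratified sampling, one jittered sample per depth bin (a minimal PyTorch example under my naming assumptions, not the starter code):

```python
import torch

def sample_points(origins, dirs, near, far, n_samples):
    """Stratified sampling: one uniform sample per evenly spaced bin along each ray."""
    B = origins.shape[0]
    edges = torch.linspace(near, far, n_samples + 1)
    lower, upper = edges[:-1], edges[1:]
    t = lower + (upper - lower) * torch.rand(B, n_samples)   # jitter within each bin
    pts = origins[:, None, :] + t[..., None] * dirs[:, None, :]
    return pts, t

pts, t = sample_points(torch.zeros(2, 3),
                       torch.tensor([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]),
                       near=0.1, far=4.0, n_samples=8)
```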

1.5. Volume rendering
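The rendering step can be sketched with the standard emission-absorption compositing weights (a minimal single-ray PyTorch example, not the starter code):

```python
import torch

def composite(sigmas, deltas, colors):
    """Emission-absorption compositing along one ray.

    sigmas: (N,) densities, deltas: (N,) segment lengths, colors: (N, 3).
    """
    alphas = 1.0 - torch.exp(-sigmas * deltas)                # per-segment opacity
    tau = torch.cumsum(sigmas * deltas, dim=0)
    T = torch.exp(-torch.cat([torch.zeros(1), tau[:-1]]))     # transmittance to each sample
    weights = T * alphas                                      # contribution of each sample
    return (weights[:, None] * colors).sum(dim=0)

# A single dense green sample dominates the rendered color.
rgb = composite(torch.tensor([0.0, 10.0, 0.0]),
                torch.ones(3),
                torch.tensor([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0],
                              [0.0, 0.0, 1.0]]))
```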

Q2. Optimizing a basic implicit volume

Optimized box:
- center: (0.25, 0.25, 0.00)
- side 1 length: 2.00
- side 2 length: 1.50
- side 3 length: 1.50
Q3. Neural Radiance Field

| Materials | Materials highres |
| --- | --- |
|  |  |
Trade-offs between increased view dependence and generalization quality:
- view dependence lets the model reproduce more realistic, direction-dependent lighting effects (e.g. specular highlights)
- generalization means the model renders the scene correctly from unseen viewpoints
- if the model overfits to certain viewpoints, it matches the lighting effects in those images too closely and fails to learn the actual 3D structure and base colors
- without view-dependent input, the model focuses on learning the 3D structure, but its renderings are less realistic
Q5. Sphere tracing

Implementation: the core logic steps each point along its ray direction by the current SDF value. The points are initialized at the near plane. The hit mask is computed by checking whether the SDF value falls below a threshold (1e-5); once a point's mask is True, it is no longer updated. After max_iters, the mask is AND-ed with a second mask that checks whether the points lie within the far plane.
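The steps above can be sketched as follows (a minimal PyTorch version of my description, not the exact submitted code; names are illustrative):

```python
import torch

def sphere_trace(origins, dirs, sdf, near=0.0, far=5.0, max_iters=64, eps=1e-5):
    """March each ray forward by the SDF value until |sdf| < eps.

    origins, dirs: (B, 3); sdf: callable (B, 3) -> (B,).
    """
    t = torch.full((origins.shape[0],), near)        # start at the near plane
    hit = torch.zeros_like(t, dtype=torch.bool)
    for _ in range(max_iters):
        pts = origins + t[:, None] * dirs
        d = sdf(pts)
        hit = hit | (d.abs() < eps)
        t = torch.where(hit, t, t + d)               # converged points stay frozen
    hit = hit & (t <= far)                            # discard rays past the far plane
    return origins + t[:, None] * dirs, hit

# Example: unit sphere at the origin, ray along +z starting at z = -3.
sphere_sdf = lambda p: p.norm(dim=-1) - 1.0
pts, mask = sphere_trace(torch.tensor([[0.0, 0.0, -3.0]]),
                         torch.tensor([[0.0, 0.0, 1.0]]), sphere_sdf)
```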
Q6. Optimizing a Neural SDF
| Input | Output |
| --- | --- |
|  |  |
MLP architecture: very similar to NeRF
- an MLP with no input skip connections that processes the xyz inputs
- a distance head (a Linear layer) that predicts the signed distance, plus a latent vector for further color processing (used in Q7)
Eikonal loss: defined following [1]
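The Eikonal penalty drives the gradient norm of the predicted SDF toward 1 everywhere. A minimal sketch (the lambda stands in for the distance network; any callable mapping points to distances works):

```python
import torch

def eikonal_loss(model, pts):
    """Penalize deviation of ||grad f(x)|| from 1 at the sampled points."""
    pts = pts.detach().requires_grad_(True)
    d = model(pts)
    grad, = torch.autograd.grad(d.sum(), pts, create_graph=True)
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()

# For an exact SDF (distance to the origin), the loss is ~0.
exact_sdf = lambda p: p.norm(dim=-1)
loss = eikonal_loss(exact_sdf, torch.randn(16, 3))
```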

Hyperparameters: default values in points_surface.yaml
Q7. VolSDF
VolSDF trained with default hyperparameters:

Architecture: similar to NeRF
- an additional color MLP is added after the distance head from Q6
- the color head ends with a Sigmoid so the predicted colors lie in (0, 1)
SDF to density using the Cumulative Distribution Function (CDF) of the Laplace distribution:
- The CDF has range (0, 1), so alpha sets the maximum density of the object
- Ideally, if the SDF value is at most 0 we want the density to be alpha, and 0 otherwise. The CDF is used to smooth this step function for easier optimization.
- Higher beta makes the transition smoother, which effectively inflates the learned SDF relative to the actual surface. Lower beta, in contrast, makes the transition sharper, bringing the learned SDF closer to the actual surface.
- The SDF should be easier to train with high beta because of the smoother transition. If the transition is too sharp, the renderer rarely samples points near the surface during the initial training steps, making the optimization unstable.
- Low beta should make the model more likely to learn an accurate surface because there is less "inflation".
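The conversion described above can be sketched as below (a minimal PyTorch version of the VolSDF density, not my exact submitted code; the default alpha/beta here are illustrative):

```python
import torch

def sdf_to_density(sdf, alpha=10.0, beta=0.05):
    """VolSDF density: alpha * LaplaceCDF(-sdf; loc=0, scale=beta).

    sdf <= 0 (inside) -> density approaches alpha; sdf > 0 -> decays to 0.
    Larger beta smooths the transition; smaller beta sharpens it.
    """
    s = -sdf / beta
    cdf = torch.where(sdf <= 0,
                      1.0 - 0.5 * torch.exp(-s),   # inside / on the surface
                      0.5 * torch.exp(s))          # outside
    return alpha * cdf

d = sdf_to_density(torch.tensor([-1.0, 0.0, 1.0]))
```

At the surface (sdf = 0) the density is exactly alpha / 2, and it rises toward alpha inside the object.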
I think my chosen alpha and beta values strike a good balance. With smaller beta, I see many holes in the geometry because of the harder optimization; with larger beta, small details like the wheels at the back or the teeth disappear.
Both VolSDF and NeRF trained with the same 10 views:
| VolSDF | NeRF |
| --- | --- |
|  |  |
The NeRF rendering shows artifacts in a few frames while VolSDF's does not. NeRF likely overfits to this small number of views, making it unable to generalize to unseen ones. VolSDF, as expected, handles the few-view case well because it learns the actual 3D surface rather than relying on view-dependent rendering as NeRF does.