Assignment 3¶
Author: Yu Jin Goh (yujing)
A0 Transmittance Calculation¶
The transmittance in this case consists of absorption only. Since $\sigma$ is piecewise constant along the ray, each integral reduces to a sum of $\sigma \cdot$ (segment length) terms.
Transmittance:
- $T(y_1, y_2)$
$= e^{-\int_{y_1}^{y_2} \sigma(x_0 + \omega t)\, dt}$
$= e^{-(1 \cdot 2)}$
$= e^{-2}$
$\approx 1.353 \times 10^{-1}$
- $T(y_2, y_4)$
$= e^{-\int_{y_2}^{y_4} \sigma(x_0 + \omega t)\, dt}$
$= e^{-\left(\int_{y_2}^{y_3} \sigma \, dt + \int_{y_3}^{y_4} \sigma \, dt\right)}$
$= e^{-(0.5 \cdot 1 + 10 \cdot 3)}$
$= e^{-30.5}$
$\approx 5.676 \times 10^{-14}$
- $T(x, y_4)$
$= e^{-\int_{x}^{y_4} \sigma(x_0 + \omega t)\, dt}$
$= e^{-\left(\int_{x}^{y_1} \sigma \, dt + \int_{y_1}^{y_2} \sigma \, dt + \int_{y_2}^{y_4} \sigma \, dt\right)}$
$= e^{-(0 + 1 \cdot 2 + 0.5 \cdot 1 + 10 \cdot 3)}$
$= e^{-32.5}$
$\approx 7.681 \times 10^{-15}$
- $T(x, y_3)$
$= e^{-\int_{x}^{y_3} \sigma(x_0 + \omega t)\, dt}$
$= e^{-\left(\int_{x}^{y_1} \sigma \, dt + \int_{y_1}^{y_2} \sigma \, dt + \int_{y_2}^{y_3} \sigma \, dt\right)}$
$= e^{-(0 + 1 \cdot 2 + 0.5 \cdot 1)}$
$= e^{-2.5}$
$\approx 8.208 \times 10^{-2}$
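As a quick sanity check, the same piecewise-constant integrals can be evaluated in a few lines of Python. The $\sigma$ values and segment lengths below are the ones used in the computations above; the segment names are just labels for this check.

```python
import math

# (sigma, segment length) for each interval along the ray.
# The length of [x, y1] is arbitrary here since sigma = 0 on that interval.
segments = {
    "x_y1":  (0.0, 1.0),
    "y1_y2": (1.0, 2.0),
    "y2_y3": (0.5, 1.0),
    "y3_y4": (10.0, 3.0),
}

def transmittance(names):
    """T = exp(-sum of sigma_i * length_i) over the listed segments."""
    optical_depth = sum(segments[n][0] * segments[n][1] for n in names)
    return math.exp(-optical_depth)

print(transmittance(["y1_y2"]))                            # ~1.353e-01
print(transmittance(["y2_y3", "y3_y4"]))                   # ~5.676e-14
print(transmittance(["x_y1", "y1_y2", "y2_y3", "y3_y4"]))  # ~7.681e-15
print(transmittance(["x_y1", "y1_y2", "y2_y3"]))           # ~8.208e-02
```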
A1.4 Point Sampling¶
Ray point samples:

A1.5 Volume Rendering¶
Cube volume render and depth:

A3 Optimizing a Neural Radiance Field¶

A4 NeRF Extras¶
A4.1 View Dependence¶
Discussion on View Dependence vs Generalization:
Increased view dependence would likely let the model fit the training dataset better, but produce poorer novel views.
Without view dependence, NeRF effectively assumes the volume consists mainly of diffuse materials, so a single color and density per position generalizes across all viewing directions. This works well for novel view synthesis when the scene is purely diffuse.
Real scenes, however, usually contain view-dependent effects (e.g. specular highlights) that cannot be modelled under this assumption, so the color is additionally conditioned on the viewing direction. More view dependence therefore means the learned parameters fit each training view more closely.
The downside is that the model can explain the training data by leaning on view dependence instead of learning geometry and appearance that generalize to novel views.
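As a concrete (hypothetical) illustration of where view dependence enters the architecture, a view-conditioned color head might look like the sketch below; the class name, dimensions, and layer sizes are assumptions, not the exact code used for this assignment. Density depends only on position features, while color additionally sees the embedded view direction.

```python
import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    """Sketch of a view-conditioned NeRF head (names and sizes are illustrative)."""

    def __init__(self, feat_dim=256, view_dim=27, hidden_dim=128):
        super().__init__()
        self.density = nn.Linear(feat_dim, 1)        # sigma from position features only
        self.color = nn.Sequential(                  # color from position + view direction
            nn.Linear(feat_dim + view_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),
            nn.Sigmoid(),
        )

    def forward(self, position_feat, view_embedding):
        sigma = torch.relu(self.density(position_feat))
        rgb = self.color(torch.cat([position_feat, view_embedding], dim=-1))
        return sigma, rgb
```

Dropping `view_embedding` from the color branch recovers the view-independent variant discussed above.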

A5 Sphere Tracing¶
Sphere traced torus:

Short Writeup:
I first initialize each point at its ray's origin.
For every point, I query the SDF for the distance to the closest surface. This distance is the radius of a sphere around the point that is guaranteed to contain only free space.
I then compute the new point position by multiplying that distance with the ray direction and adding it to the current position, which safely marches the point along its ray.
I repeat the distance query and position update for up to 50 iterations, monitoring the distance returned by the SDF.
After the 50 iterations, any point whose SDF value is below an epsilon of 1e-5 is considered a surface hit; points above epsilon are treated as misses.
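A minimal sketch of this procedure, assuming an `sdf` callable that maps `(N, 3)` points to `(N,)` signed distances (the function name and signature are illustrative):

```python
import torch

def sphere_trace(sdf, origins, directions, num_iters=50, eps=1e-5):
    """Sphere tracing as described above: march each point along its ray
    by the SDF value until it (hopefully) converges onto the surface."""
    points = origins.clone()  # start each point at its ray origin
    for _ in range(num_iters):
        dist = sdf(points)                                 # radius of guaranteed free space
        points = points + dist.unsqueeze(-1) * directions  # march along the ray by that distance
    mask = sdf(points) < eps  # points within epsilon of the surface count as hits
    return points, mask
```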
A6 Optimizing a Neural SDF¶
Input point cloud and optimized SDF:

Brief Description of MLP and eikonal loss:
The MLP I used was similar to the one written for my NeRF, but adapted for an SDF learning objective (a minimal sketch follows the list):
- I embed the queried point's xyz with a positional embedding by computing sine and cosine features.
- The positional features were fed into an 8-layer MLP with 256 hidden units per layer and a ReLU activation after each layer. The positional features were also re-concatenated as a skip connection at the 4th layer.
- The output of this MLP is fed into a linear layer that reduces the dimension from 256 to 1 to predict the signed distance.
- Two more linear layers reduce the dimension from 256 -> 128 -> 3 to regress color, with a ReLU after the intermediate layer and a sigmoid at the end to keep colors in [0, 1].
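A minimal sketch of this architecture; the class name, embedding dimension, and exact skip index are assumptions, not the code actually submitted:

```python
import torch
import torch.nn as nn

class NeuralSDF(nn.Module):
    """8-layer MLP with a skip connection, a distance head, and a color head."""

    def __init__(self, embed_dim=39, hidden_dim=256, skip_layer=4):
        super().__init__()
        self.skip_layer = skip_layer
        layers = []
        for i in range(8):
            in_dim = embed_dim if i == 0 else hidden_dim
            if i == skip_layer:
                in_dim += embed_dim  # skip connection: re-concatenate the positional features
            layers.append(nn.Linear(in_dim, hidden_dim))
        self.layers = nn.ModuleList(layers)
        self.distance_head = nn.Linear(hidden_dim, 1)   # 256 -> 1 signed distance
        self.color_head = nn.Sequential(                # 256 -> 128 -> 3 color in [0, 1]
            nn.Linear(hidden_dim, 128), nn.ReLU(),
            nn.Linear(128, 3), nn.Sigmoid(),
        )

    def forward(self, x_embed):
        h = x_embed
        for i, layer in enumerate(self.layers):
            if i == self.skip_layer:
                h = torch.cat([h, x_embed], dim=-1)
            h = torch.relu(layer(h))
        return self.distance_head(h), self.color_head(h)
```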
The eikonal loss computes the gradient of the predicted distance with respect to randomly sampled points and constrains its norm to be 1. This encourages the learned function to behave like a true signed distance field, whose spatial rate of change is 1 throughout the volume.
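A sketch of that regularizer, assuming a `model` that maps points directly to signed distances (the positional embedding is folded into the model for brevity):

```python
import torch

def eikonal_loss(model, points):
    """Penalize ||grad_x f(x)|| deviating from 1 at randomly sampled points."""
    points = points.clone().requires_grad_(True)
    distances = model(points)
    grad = torch.autograd.grad(
        outputs=distances,
        inputs=points,
        grad_outputs=torch.ones_like(distances),
        create_graph=True,  # keep the graph so the penalty itself can be backpropagated
    )[0]
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```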
A7 VolSDF¶
The parameter alpha is a scalar that scales the density values: increasing alpha scales the density linearly across the entire function, making the boundary more opaque and hence the occupancy boundary sharper in the rendered result. The parameter beta is a scalar that controls how quickly the density decays away from the surface, a nonlinear effect: a lower beta gives a sharper density falloff and therefore sharper surfaces.
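For reference, a sketch of the SDF-to-density conversion these parameters control, written in the Laplace-CDF form used by VolSDF; it assumes the SDF is positive outside the surface and negative inside:

```python
import torch

def sdf_to_density(signed_distance, alpha, beta):
    """Density = alpha * Psi_beta(-d), where Psi_beta is the CDF of a zero-mean
    Laplace distribution with scale beta."""
    s = -signed_distance
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s.clamp(max=0.0) / beta),         # outside: density decays with distance
        1.0 - 0.5 * torch.exp(-s.clamp(min=0.0) / beta),   # inside: density saturates towards alpha
    )
    return alpha * psi  # alpha scales the density; beta controls how quickly it falls off
```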
Questions and Answers:
- How does high beta bias your learned SDF? What about low beta?
A high beta spreads the density out around the surface, so the rate of change of density is lower and the rendered object looks blurrier. A low beta concentrates the density in a sharp spike at the surface, so the object's edges look sharper.
- Would an SDF be easier to train with volume rendering and low beta or high beta? Why?
An SDF would be easier to train with a high beta, since the density does not change as abruptly around the surface, giving a smoother, better-behaved objective for gradient-based optimization.
- Would you be more likely to learn an accurate surface with high beta or low beta? Why?
A more accurate surface could be learned with a low beta, since the density is concentrated tightly around the zero level set, giving a sharper and more defined surface.

Hyperparameters changed:
I increased the batch size to 2048 to sample more rays and make the loss curve smoother and less volatile.
I also increased the number of epochs from 5000 to 10000 and lr_scheduler_step_size from 50 to 100. This lets the model keep a larger learning rate for longer to optimize the global structure, before fine-tuning details with a smaller learning rate.
I also increased alpha from 10 to 20 to scale the density values up more aggressively and sharpen the occupancy boundaries.
A8 Neural Surface Extras¶
A8.1 Render a Large Scene with Sphere Tracing¶

python -m surface_rendering_main --config-name=torus_chain
A8.2 Fewer Training Views¶
Trained on 25 images instead of 100 images from the NeRF Lego dataset:

NeRF solution for comparison:

We can see that the NeRF solution has far more black floaters than the SDF when trained on limited views. For novel views, parts of the scene may not have been covered by any training rays, so the opacity and color at those locations were never optimized due to low camera coverage. In comparison, such artifacts are much less prevalent in the neural SDF result, because the SDF carries a strong implicit bias that surfaces lie only on the zero level set.

