Question 0: Transmittance Calculation

Answers:
- $T(y_1, y_2) = e^{-2}$
- $T(y_2, y_4) = e^{-30}$
- $T(x, y_4) = e^{-32.5}$
- $T(x, y_3) = e^{-2.5}$
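For reference, these values follow from the definition of transmittance between two points along a ray (the specific segment lengths and densities come from the problem figure, not reproduced here):

$$T(a, b) = \exp\!\left(-\int_{a}^{b} \sigma\big(\mathbf{r}(t)\big)\, dt\right)$$

For piecewise-constant density this reduces to $T = e^{-\sum_i \sigma_i \Delta t_i}$, and transmittance composes multiplicatively, e.g. $T(x, y_4) = T(x, y_2)\, T(y_2, y_4) = e^{-2.5} \cdot e^{-30} = e^{-32.5}$.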
Question 1: Differentiable Volume Rendering
1.3: Ray Sampling

1.4: Point Sampling

1.5: Volume Rendering

Question 2: Optimizing a basic implicit volume
2.1: Random ray sampling, Loss and training, Visualization

- Box center: (0.25, 0.25, 0.00)
- Box side lengths: (2.01, 1.50, 1.50)
The color scheme is similar, but as can be seen both visually and from the box dimensions above, the trained implicit network outputs a cuboid rather than a cube.
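As a rough illustration of the random ray sampling and photometric MSE loss used for training in this part, here is a minimal sketch. The names `render_fn`, `get_pixels_from_image`, and `get_rays_from_pixels` are hypothetical stand-ins for the codebase's renderer and ray-generation utilities, not the exact functions used.

```python
import torch
import torch.nn.functional as F

def training_step(render_fn, camera, image, n_rays=1024):
    # get_pixels_from_image / get_rays_from_pixels are hypothetical helpers
    # standing in for the actual ray-generation utilities.
    xy_grid = get_pixels_from_image(image.shape[:2], camera)   # (H*W, 2) pixel coords
    idx = torch.randperm(xy_grid.shape[0])[:n_rays]            # random subset of pixels
    gt_colors = image.reshape(-1, 3)[idx]                      # (n_rays, 3) ground-truth RGB

    # Generate world-space rays for the sampled pixels and render them.
    rays = get_rays_from_pixels(xy_grid[idx], camera)
    pred_colors = render_fn(rays)                              # (n_rays, 3) predicted RGB

    # Photometric MSE between rendered and ground-truth colors.
    return F.mse_loss(pred_colors, gt_colors)
```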
Question 3: Optimizing a Neural Radiance Field
Low-resolution Lego rendering without view dependence.
Correct model output:

One interesting thing to note is the impact of the ReLU activations between the linear layers: there is a drastic improvement in rendering clarity when ReLU is used across the roughly 10-layer network.
Model output without ReLU activation layers:

Question 4: NeRF Extras
4.1: View Dependence
View Dependence NeRF on low res materials data:

In the view-dependent model, we replaced the simple linear-plus-sigmoid color head with the one below, which first processes the hidden vector through a feature layer, concatenates it with the embedded direction features, and passes the result through a final linear and sigmoid layer:
```python
# Model definition
self.feature_layer = nn.Linear(hidden_dim_xyz, hidden_dim_xyz)
self.view_dependent_layer = nn.Sequential(
    nn.Linear(hidden_dim_xyz + embedding_dim_dir, hidden_dim_dir),
    nn.ReLU(),
    nn.Linear(hidden_dim_dir, 3),
    nn.Sigmoid(),
)

# Forward pass
feature_color = self.feature_layer(hidden_output)
# directions holds the embedded view directions
directions_concat = torch.concat([feature_color, directions], dim=-1)
color_output = self.view_dependent_layer(directions_concat)
```
View dependence means that the output color at a given point along the ray is dependent on both the 3D location and the viewing direction.
Because we now account for the direction of the ray, there are several key benefits for image generation, including more realistic 3D representations, especially when light is being reflected off a surface or there are multiple light sources. This is particularly beneficial when high-resolution outputs are needed.
However, we are now dealing with more parameters, which increases the complexity of the network. We also risk overfitting the model to the lighting conditions of the training views, which is why we disentangle the density and color predictions.
Question 5: Sphere Tracing

Description of implementation:
Begin by initializing points to the ray origins and points_mask to a vector of zeros of size N x 1. Then repeat the following self.max_iters times: pass the current points through implicit_fn to get the SDF value at each point, check whether its magnitude is below a small threshold (which means the ray has hit the surface), and update points and points_mask accordingly.
## Sphere Tracing Code
```python
points = origins
points_mask = torch.zeros(points.shape[0]).to(origins.device)
threshold = 1e-5

for i in range(self.max_iters):
    # Query the implicit function for the SDF value at the current points
    mag = implicit_fn(points)
    # Mark rays whose SDF value has fallen below the threshold (surface hit)
    points_mask = mag <= threshold
    # March each point forward along its ray by the current SDF value
    points = points + mag * directions

return points, points_mask
```
Question 6: Optimizing a Neural SDF
Input point cloud used for training:

Prediction:

The eikonal loss:
The eikonal loss requires that the norm of the gradient $\nabla f(x)$ be 1; it therefore minimizes the squared difference between $\|\nabla f(x)\|$ and 1. The eikonal loss function in losses.py produces a Bx1 tensor and then takes the mean to produce a single scalar loss value.
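A minimal sketch of such a loss, assuming the SDF gradients at the sampled points have already been computed and are passed in as a (B, 3) tensor:

```python
import torch

def eikonal_loss(gradients: torch.Tensor) -> torch.Tensor:
    # gradients: (B, 3) spatial gradients of the SDF at sampled points.
    # Penalize deviation of the gradient norm from 1, then average over the batch.
    grad_norm = torch.linalg.norm(gradients, dim=-1)   # (B,)
    return ((grad_norm - 1.0) ** 2).mean()
```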
The MLP used:
In terms of the MLP, I use the same structure as in Q4: two sets of sequential linear layers with ReLU activations, connected by a concatenation operation. The distance value can be any real number, including negative values, so I remove the sigmoid used in the density-prediction network of Q4.
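A minimal sketch of this kind of distance head, assuming a harmonic embedding of the input points (the class name and layer sizes below are illustrative, not the exact ones used):

```python
import torch
import torch.nn as nn

class NeuralSDF(nn.Module):
    def __init__(self, embedding_dim_xyz=39, hidden_dim=128):
        super().__init__()
        # First stack of linear + ReLU layers operating on the embedded xyz.
        self.stack1 = nn.Sequential(
            nn.Linear(embedding_dim_xyz, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Second stack takes the concatenation of the hidden features and the
        # original embedding, and ends in a plain linear layer (no sigmoid),
        # since a signed distance can be any real number.
        self.stack2 = nn.Sequential(
            nn.Linear(hidden_dim + embedding_dim_xyz, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, xyz_embedding: torch.Tensor) -> torch.Tensor:
        hidden = self.stack1(xyz_embedding)
        return self.stack2(torch.cat([hidden, xyz_embedding], dim=-1))
```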
Question 7: VolSDF
Parameters alpha and beta:
The density function models the object as a smooth, homogeneous volume. The beta parameter controls how smooth the transition in density is between the inside and the outside of the boundary, while alpha is a scaling parameter that proportionally scales the output density. A greater alpha corresponds to more particles within the volume, effectively reducing the distance a light ray can travel before being absorbed.
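For reference, the density in the VolSDF paper has the form

$$\sigma(\mathbf{x}) = \alpha \, \Psi_\beta\!\left(-d_\Omega(\mathbf{x})\right),$$

where $d_\Omega$ is the signed distance to the surface and $\Psi_\beta$ is the CDF of a zero-mean Laplace distribution with scale $\beta$.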
Question 1:
A high beta means a smooth transition, i.e., the assumption that density decreases gradually as you move from inside the object to outside; it biases the model toward learning a blurrier boundary. A smaller beta, by contrast, encourages a clear, sharp boundary between the object and its exterior.
Question 2:
A larger beta would lead to smoother gradients over the volume, which means less training instability and greater ease of training. A lower beta would mean much sharper gradient values in a smaller range, which could lead to instability in the network, especially early in the training process.
Question 3:
You are more likely to learn an accurate surface representation with a lower beta, because the model has to learn a sharp boundary between the 3D solid object and the surrounding 'air'. This is a more realistic representation of how 3D objects interact with their surroundings.
Hyperparameter experimentation:
beta = 0.01, 0.05, 0.07
alpha = 10, 25
Alpha 10, Beta 0.01:

Alpha 10, Beta 0.05:

Alpha 10, Beta 0.07:

Alpha 25, Beta 0.01:

Alpha 25, Beta 0.05:

Alpha 25, Beta 0.07:

Conclusions: We can clearly see that, for both alpha values (10, 25), the output geometry becomes smoother as beta increases. However, for a model trained for the same number of epochs, the clarity of the generated viewpoints decreases as beta increases. We also notice that a larger alpha (i.e., higher density) leads to less generated 'noise', i.e., fewer redundant points in the output geometry.
Best hyperparameters:
alpha = 25, beta = 0.01
Question 8: Neural Surface Extras
8.1: Render a Large Scene with Sphere Tracing
In Implicit.py, I created a new class called TwentyTorusSDF, which randomly generates 20 torus centers with random radii and then performs sphere tracing on the resulting scene.

```python
centers, radii = [], []
for i in range(num_torus):
    # Create random major and minor radii for this torus
    big_r = np.random.uniform(0.4, 0.9)
    small_r = np.random.uniform(0.15, 0.3)
    # Create a random center
    x = np.random.uniform(-5, 5)
    y = np.random.uniform(-5, 5)
    z = np.random.uniform(-1, 1)
    centers.append([x, y, z])
    radii.append([big_r, small_r])
```
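The per-torus SDF and its union over all tori are not shown above; a minimal sketch of how they might be combined, using the standard torus SDF and a min over shapes (the tensor shapes and the assumption that centers are (3,) tensors are illustrative):

```python
import torch

def torus_sdf(points, center, big_r, small_r):
    # points: (N, 3); torus centered at `center` (a (3,) tensor), lying in its local xz-plane.
    p = points - center
    # Distance to the ring of radius big_r, then subtract the tube radius small_r.
    q_xz = torch.linalg.norm(p[:, [0, 2]], dim=-1) - big_r
    return torch.sqrt(q_xz ** 2 + p[:, 1] ** 2) - small_r

def scene_sdf(points, centers, radii):
    # Union of all tori: the scene SDF is the minimum over the individual torus SDFs.
    dists = torch.stack(
        [torus_sdf(points, c, R, r) for c, (R, r) in zip(centers, radii)], dim=0
    )
    return dists.min(dim=0).values
```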
8.2: Fewer Training Views
Fixed hyperparameter settings: alpha = 10, beta = 0.05, epochs = 30. Number of training samples: 100 (original), 20, 5.
VolSDF
Number of samples = 100

Number of samples = 20

Number of samples = 5

NeRF
Number of samples = 100

Number of samples = 20

Number of samples = 5

8.3: Alternate SDF to Density Conversions
Fixed hyperparameter settings: alpha = 10, beta = 0.05, epochs = 20, training samples = 100
VolSDF Equation:

Naive Equation:
s = 20

s = 40

s = 60

s = 80

Conclusion: We can see that increasing the value of s sharpens the output quality from s = 20 up to s = 60. However, once s reaches 80, although the geometry most closely resembles the VolSDF result, artifacts appear in the generated images.
## Code Implementation of the Naive Solution
```python
def sdf_to_density_83(signed_distance, s):
    # Q8.3: naive sdf-to-density conversion (logistic density) with varying s:
    #   sigma(d) = s * exp(-s * d) / (1 + exp(-s * d))^2
    first_term = s * torch.exp(-1 * s * signed_distance)
    second_term = (1 + torch.exp(-1 * s * signed_distance)) ** 2
    output = first_term / second_term
    return output
```
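For reference, the naive conversion implemented above is the logistic (sigmoid-derivative) density

$$\sigma(d) = \frac{s\, e^{-s d}}{\left(1 + e^{-s d}\right)^2},$$

which concentrates ever more sharply around the surface as $s$ grows, consistent with the sharper but eventually artifact-prone renders observed at large $s$.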