
Here are the outputs of the ray and grid visualizations:

The point samples from the first camera are:

Here is a visualization of the Part 1 rendered image GIF and depth map:

This is implemented in get_random_pixels_from_image() in ray_utils.py.
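For reference, here is a minimal sketch of what this random pixel sampling could look like; the exact signature and normalization convention in ray_utils.py may differ:

```python
import torch

def get_random_pixels_from_image(n_pixels, image_size):
    # Sample n_pixels pixel coordinates uniformly at random and normalize
    # them to [-1, 1] so they can index rays in NDC-style coordinates.
    # (Sketch only; image_size = (W, H) is an assumed convention.)
    W, H = image_size
    x = torch.randint(0, W, (n_pixels,))
    y = torch.randint(0, H, (n_pixels,))
    xy = torch.stack([x, y], dim=-1).float()
    xy_ndc = 2.0 * xy / torch.tensor([W - 1, H - 1], dtype=torch.float32) - 1.0
    return xy_ndc  # (n_pixels, 2)
```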
After optimizing, the recovered box parameters are:
Box center: (0.25, 0.25, -0.00)
Box side lengths: (2.00, 1.50, 1.50)
Below is the visualization of the optimized volume:

Results:

Adding view dependence allows the NeRF model to capture realistic lighting effects, including specular highlights, under-exposure, and reflections. The emitted color becomes a function of both the 3D position and the camera's viewing direction. Overall, view dependence increases the realism of the resulting NeRF scene. However, it also raises the risk of overfitting, as the model may memorize viewpoint-specific appearances rather than learning the true scene geometry or reflectance. Additionally, view dependence increases the number of model parameters, making the model more expensive to train and evaluate.
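A minimal sketch of how the view-dependent color branch could be wired up (the module and dimension names here are illustrative, not the exact ones in my implementation):

```python
import torch
import torch.nn as nn

class ViewDependentColorHead(nn.Module):
    # Density is predicted from position alone, while the emitted color
    # also conditions on the harmonic-embedded viewing direction.
    def __init__(self, feature_dim=256, dir_embed_dim=27, hidden_dim=128):
        super().__init__()
        self.color_mlp = nn.Sequential(
            nn.Linear(feature_dim + dir_embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),
            nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, features, dir_embedding):
        # features: (N, feature_dim) per-point features from the trunk MLP
        # dir_embedding: (N, dir_embed_dim) embedded view directions
        return self.color_mlp(torch.cat([features, dir_embedding], dim=-1))
```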
Adding view dependence on the NeRF materials high-resolution scene:

To render the torus, we first have to compute where each ray intersects the surface; I implemented sphere tracing to solve this. Each ray marches forward by the SDF value at its current point, which guarantees it never steps into the surface. When the SDF value falls below a threshold, we consider the ray to have hit the surface; if a ray exceeds the maximum distance or iteration count, we consider it a miss.
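Here is a minimal sketch of the sphere tracing loop described above (parameter names and default values are assumptions, not the exact ones I used):

```python
import torch

def sphere_trace(sdf, origins, directions, max_iters=64, max_dist=10.0, eps=1e-5):
    # sdf: callable mapping (N, 3) points -> (N,) signed distances.
    # Each ray advances by the SDF value at its current point, so it can
    # approach but never step through the surface.
    t = torch.zeros(origins.shape[0], device=origins.device)
    hit = torch.zeros(origins.shape[0], dtype=torch.bool, device=origins.device)
    for _ in range(max_iters):
        points = origins + t.unsqueeze(-1) * directions
        dist = sdf(points)
        hit = hit | (dist.abs() < eps)      # converged rays are hits
        active = (~hit) & (t < max_dist)    # stop marching finished rays
        t = torch.where(active, t + dist, t)
    points = origins + t.unsqueeze(-1) * directions
    return points, hit  # rays past max_dist / max_iters count as misses
```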

The Neural Surface MLP learns a continuous SDF: it maps a 3D point to a scalar value representing its distance from the nearest surface. Negative values correspond to points inside the surface, and positive values to points outside of it. The input is first encoded using harmonic embeddings. These embeddings are passed through a series of fully connected layers with ReLU activations, and skip connections re-inject the input embedding at specific layers to preserve high-frequency positional information. The final linear layer outputs a single scalar SDF value per point. We apply an Eikonal loss to ensure that the SDF behaves like a true distance field: it enforces the gradient norm of the predicted SDF with respect to the input coordinates to be close to 1.
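The Eikonal regularizer is straightforward to write down; this is a sketch under the assumption that the model maps (N, 3) points to (N,) SDF values:

```python
import torch

def eikonal_loss(model, points):
    # Penalize deviation of the SDF gradient norm from 1 so the network
    # behaves like a true distance field: E[(||grad f(x)|| - 1)^2].
    points = points.clone().requires_grad_(True)
    sdf = model(points)
    (grad,) = torch.autograd.grad(
        outputs=sdf,
        inputs=points,
        grad_outputs=torch.ones_like(sdf),
        create_graph=True,  # keep the graph so the loss can backprop
    )
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```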

The parameters alpha and beta control how the SDF is converted into a volumetric density field. Alpha acts as a global scaling factor that determines how opaque the surface becomes, while beta controls the sharpness of the transition from empty space to solid surface.
A low beta makes the density change rapidly around the zero level set (the surface), creating a very thin, sharp boundary; a high beta makes this transition smoother, so the surface looks blurrier.
When beta is high, the learned SDF tends to be smoother: the model does not need to represent sharp changes, so training is more stable. When beta is low, the model must learn steeper gradients near the surface, which makes the SDF more precise but harder to train, since the large gradients can destabilize optimization. An SDF is therefore generally easier to train with a higher beta, while a lower beta yields a sharper, more precise surface.
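One standard conversion with exactly these alpha/beta semantics is the VolSDF formulation, where the density is alpha times the Laplace CDF of the negated SDF. Here is a sketch assuming scalar alpha and beta (this may not match the exact variant implemented):

```python
import torch

def sdf_to_density(sdf, alpha, beta):
    # density = alpha * Psi_beta(-sdf), with Psi_beta the CDF of a
    # zero-mean Laplace distribution of scale beta. Inside the surface
    # (sdf < 0) the density approaches alpha; outside it decays toward 0,
    # and beta controls how sharp that transition is.
    s = -sdf
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi
```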
For the hyperparameters, I increased alpha to 12.0 and decreased beta to 0.35, which helps the result have a sharper, more precise surface. Additionally, I increased the number of epochs to 300 and decreased the learning rate to 0.0003 for more stable training.
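For reference, the final settings as a hypothetical config dict (the key names are assumed; the actual training script may organize these differently):

```python
# Assumed names; values are the ones reported above.
config = {
    "alpha": 12.0,       # higher alpha -> more opaque surface
    "beta": 0.35,        # lower beta -> sharper surface transition
    "num_epochs": 300,   # more epochs for the harder optimization
    "lr": 3e-4,          # lower learning rate for stability
}
```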

I implemented the conversion described in the NeuS paper. It uses an S-density function, a bell-shaped function derived from the derivative of a sigmoid. The input is the SDF value, and the output is a density for each input distance; the function peaks at the zero level set of the SDF, indicating where the surface is, and decays smoothly farther away.
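A sketch of this naive conversion: the S-density is the derivative of the sigmoid, phi_s(x) = s * e^{-sx} / (1 + e^{-sx})^2, which peaks at x = 0. The inverse-scale parameter s here is an assumed hyperparameter, not a value from the write-up:

```python
import torch

def neus_s_density(sdf, s=64.0):
    # Derivative of sigmoid(s * x) equals s * sig * (1 - sig): a
    # bell-shaped density that peaks where sdf = 0 and decays smoothly
    # away from the surface.
    sig = torch.sigmoid(s * sdf)
    return s * sig * (1.0 - sig)
```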
The training parameters were kept the same. As shown, the naive solution's results are a bit noisier, with more artifacts outside the surface from certain viewpoints. Here are the geometry and 3D reconstruction visualizations using the naive solution from the NeuS paper:
