Left: grid. Right: rays.
Box center: (0.25, 0.25, -0.00)
Box side lengths: (2.00, 1.50, 1.50)
python volume_rendering_main.py --config-name=nerf_materials
We extend NeRF to include view-dependent appearance. The network first predicts density from shared layers that see only spatial coordinates, then concatenates the view direction with intermediate features in the final layers to produce color. The view direction should influence only a few late layers, enough to model subtle effects like specular highlights without changing geometry. If color relied too heavily on view direction at the expense of spatial features, the network might still reproduce the training views but fail to generalize to nearby viewpoints. Introducing this structure carefully therefore improves both training stability and generalization.
Left: view-dependent; Right: view-independent
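For concreteness, below is a minimal sketch of this kind of two-branch head. The class name, layer sizes, and feature dimensions are illustrative assumptions, not the exact architecture used in this implementation.

import torch
import torch.nn as nn

class ViewDependentHead(nn.Module):
    # Hypothetical sketch: density comes from position-only features,
    # while the view direction enters only the small color branch.
    def __init__(self, feat_dim=256, dir_dim=3, hidden_dim=128):
        super().__init__()
        # Density depends only on spatial features, so geometry cannot
        # be explained away by view-dependent effects.
        self.density_head = nn.Linear(feat_dim, 1)
        # View direction is injected here, after the shared trunk.
        self.color_head = nn.Sequential(
            nn.Linear(feat_dim + dir_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),
            nn.Sigmoid(),
        )

    def forward(self, features, directions):
        density = torch.relu(self.density_head(features))               # (N, 1)
        color = self.color_head(torch.cat([features, directions], -1))  # (N, 3)
        return density, color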
def sphere_tracing(
    self,
    implicit_fn,
    origins,     # Nx3 ray origins
    directions,  # Nx3 unit ray directions
):
    num_rays = origins.shape[0]
    device = origins.device

    # Current point on each ray and the distance marched so far.
    points = origins.clone()
    distances = torch.full((num_rays, 1), self.near, device=device, dtype=torch.float32)
    # Marks rays that have converged onto the surface.
    mask = torch.zeros((num_rays, 1), dtype=torch.bool, device=device)

    for _ in range(self.max_iters):
        sdf = implicit_fn(points).squeeze(1)  # (N,) signed distance at each point
        # A ray hits the surface once its SDF value is (near) zero.
        surface_hit = sdf.abs() < 1e-4
        mask[:, 0] |= surface_hit
        # Keep marching rays that are still outside the surface and
        # within the far bound.
        continue_tracing = (sdf > 1e-4) & (distances.squeeze(1) < self.far)
        if not continue_tracing.any():
            break
        # The SDF value is a safe step size: it is the distance to the
        # nearest surface, so stepping by it cannot overshoot.
        points[continue_tracing] += directions[continue_tracing] * sdf[continue_tracing, None]
        distances[continue_tracing] += sdf[continue_tracing, None]

    return points, mask
Starting from the ray origins, the function repeatedly moves each point along its ray direction by the current SDF value, which (for a valid SDF) is the distance to the nearest surface and therefore a safe step size. During each iteration, it checks whether a point is close enough to the surface (SDF ≈ 0) to count as a hit and updates a mask marking successful intersections. The process continues until every ray has either hit the surface or exceeded the maximum tracing distance. The function returns the estimated intersection points and a boolean mask indicating which rays hit the surface.
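As a quick sanity check, the tracer can be run against an analytic unit-sphere SDF. Here renderer is a hypothetical instance of the class that owns sphere_tracing, with near, far, and max_iters already set; the specific rays are arbitrary.

import torch

# Unit sphere at the origin; returns (N, 1) as implicit_fn expects.
sphere_sdf = lambda p: p.norm(dim=-1, keepdim=True) - 1.0

origins = torch.tensor([[0.0, 0.0, -3.0]]).repeat(3, 1)
directions = torch.tensor([
    [0.0, 0.0, 1.0],  # straight through the center -> hit
    [0.1, 0.0, 1.0],  # slightly off-axis -> still a hit
    [0.0, 1.0, 0.0],  # points away from the sphere -> miss
])
directions = directions / directions.norm(dim=-1, keepdim=True)

points, mask = renderer.sphere_tracing(sphere_sdf, origins, directions)
print(mask.squeeze(1))  # expected: tensor([True, True, False])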
The rendered result.
The rendered result of the optimized neural SDF.
I use alpha: 10.0 and beta: 0.01, as this was the best combination from my experiments. The rendered result is attached below.
In the VolSDF formulation, α determines the overall magnitude of the predicted density, while β controls how abruptly the density changes around the surface boundary. When β approaches zero, the model essentially produces a step-like behavior — densities are close to α inside the object and almost zero outside, resulting in a sharp, well-defined surface.
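Concretely, VolSDF defines σ(x) = α·Ψ_β(−d(x)), where Ψ_β is the CDF of a zero-mean Laplace distribution with scale β. Below is a minimal sketch of this mapping, assuming the convention that the SDF is positive outside the surface:

import torch

def sdf_to_density_volsdf(sdf, alpha, beta):
    # Laplace CDF evaluated at the negated SDF: inside the surface
    # (sdf < 0) the density approaches alpha; outside it decays to 0
    # over a band whose width is controlled by beta.
    s = -sdf
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi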
When β is large, the SDF-to-density mapping becomes smoother. This tends to produce softer boundaries and thicker geometry since the density decays gradually rather than sharply.
When β is small, the transition becomes much steeper, which allows the network to represent thin, crisp surfaces. However, it may also introduce instability or sensitivity to small errors due to the sharp gradient near the boundary.
From an optimization perspective, using a smaller β generally helps the model learn the zero-level surface more effectively, since the gradient signal is concentrated around a narrow region close to the true surface.
Conversely, a larger β leads to smoother gradients but spreads the supervision over a wider area, making convergence slower and potentially leading to less precise surfaces.
Sharper (low-β) transitions typically correspond to more accurate surface reconstruction, as the model can better localize the boundary.
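A quick numerical check, reusing the sdf_to_density_volsdf sketch above with α = 10.0, makes this concrete:

import torch

d = torch.tensor([-0.10, -0.01, 0.0, 0.01, 0.10])  # signed distances around the surface
for beta in (0.5, 0.05, 0.01):
    print(beta, sdf_to_density_volsdf(d, alpha=10.0, beta=beta))

With β = 0.5 the density changes only gradually across this ±0.1 band, while with β = 0.01 it is already ≈ α on the inside and ≈ 0 on the outside, i.e. a near step function at the surface.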
Here we explore two different strategies for mapping SDF values to volume densities — one used in VolSDF and the other in NeuS. The two methods behave quite differently: VolSDF produces crisper and more well-defined geometry, while NeuS tends to generate overly smooth results and, in some cases, even fails to produce a valid mesh due to low density contrast. I believe NeuS could perform comparably with a more extensive hyperparameter tuning process.
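For reference, a minimal sketch of the "naive" NeuS-style mapping, which applies the logistic density (the derivative of a scaled sigmoid) to the SDF; the exact variant used in these experiments may differ:

import torch

def sdf_to_density_neus(sdf, s):
    # Logistic density s * e^(-s*x) / (1 + e^(-s*x))^2, written via the
    # sigmoid. Density peaks at s/4 exactly on the zero-level set and
    # decays symmetrically on both sides.
    sig = torch.sigmoid(s * sdf)
    return s * sig * (1.0 - sig)

Because the peak density is capped at s/4, a small s limits the achievable density contrast, which is consistent with the low-contrast failure mode observed above.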