My solution:
Implementation:
get_pixels_from_image generates a grid of pixel coordinates normalized to [-1, 1] in both axes, matching the TA's visual output.
get_rays_from_pixels maps these coordinates to points on the image plane at Z = 1, then unprojects them to world space using the camera. Each ray origin is set to the camera center, and each ray direction is the normalized vector from the origin to the corresponding world-space point.
The resulting rays match the TA's visualizations for both grid and ray outputs.
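A minimal sketch of the two functions, assuming a PyTorch3D-style camera (with `unproject_points` and `get_camera_center`); the argument names and return conventions of the actual starter code may differ:

```python
import torch

def get_pixels_from_image(image_size, camera):
    # Build an (H*W, 2) grid of pixel coordinates normalized to [-1, 1] in both axes.
    W, H = image_size
    x = torch.linspace(-1.0, 1.0, W)
    y = torch.linspace(-1.0, 1.0, H)
    yy, xx = torch.meshgrid(y, x, indexing="ij")
    return torch.stack([xx, yy], dim=-1).reshape(-1, 2)

def get_rays_from_pixels(xy_grid, camera):
    # Lift the normalized coordinates to the image plane at depth Z = 1,
    # unproject to world space, and form rays from the camera center.
    n = xy_grid.shape[0]
    xy_depth = torch.cat([xy_grid, torch.ones(n, 1)], dim=-1)   # (N, 3)
    world_points = camera.unproject_points(xy_depth)            # world-space points
    origins = camera.get_camera_center().expand(n, -1)          # (N, 3), one origin per ray
    directions = torch.nn.functional.normalize(world_points - origins, dim=-1)
    return origins, directions
```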
Implementation:
The StratifiedSampler forward pass samples points along each ray between the near and far planes: the depth range is split into equal-width bins and one sample is drawn uniformly at random within each bin. This keeps the samples spread over the full range while varying their exact positions across iterations, which reduces aliasing relative to a fixed grid and improves training stability and rendering quality.
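A minimal sketch of the sampler, with assumed constructor arguments (`n_pts`, `near`, `far`) and per-ray origin/direction tensors:

```python
import torch

class StratifiedSampler(torch.nn.Module):
    def __init__(self, n_pts, near, far):
        super().__init__()
        self.n_pts, self.near, self.far = n_pts, near, far

    def forward(self, origins, directions):
        # origins, directions: (n_rays, 3)
        n_rays = origins.shape[0]
        # Split [near, far] into n_pts equal bins.
        edges = torch.linspace(self.near, self.far, self.n_pts + 1)
        lower, upper = edges[:-1], edges[1:]
        # One uniform draw per (ray, bin): stratified but still random.
        t = lower + (upper - lower) * torch.rand(n_rays, self.n_pts)
        # 3D sample locations along each ray: (n_rays, n_pts, 3)
        points = origins[:, None, :] + t[..., None] * directions[:, None, :]
        return points, t
```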
Implementation:
In VolumeRenderer, _compute_weights calculates per-sample weights as w_i = T_i · (1 − exp(−σ_i · Δt_i)), where σ_i is the density at sample i, Δt_i is the length of the i-th interval, and T_i = exp(−Σ_{j<i} σ_j · Δt_j) is the transmittance accumulated before sample i.
_aggregate accumulates color and depth along the ray using these weights. The forward method combines these steps for differentiable rendering.
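A sketch of the two helpers under assumed tensor shapes (densities and interval lengths of shape (n_rays, n_pts, 1)); the small ε guards the cumulative product against exact zeros:

```python
import torch

def _compute_weights(deltas, sigmas, eps=1e-10):
    # Per-sample opacity: 1 - exp(-sigma_i * delta_i)
    alpha = 1.0 - torch.exp(-sigmas * deltas)
    # Transmittance T_i = prod_{j<i} (1 - alpha_j) = exp(-sum_{j<i} sigma_j * delta_j)
    trans = torch.cumprod(1.0 - alpha + eps, dim=1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=1)  # shift so T_1 = 1
    return trans * alpha  # w_i = T_i * (1 - exp(-sigma_i * delta_i))

def _aggregate(weights, values):
    # Weighted sum along the ray; values can be per-sample RGB (n_rays, n_pts, 3)
    # or depths (n_rays, n_pts, 1).
    return torch.sum(weights * values, dim=1)
```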
Differentiable rendering enables inverse optimization: gradients of a loss on the rendered images propagate back through the renderer to the scene parameters.
In the VolSDF formulation, the signed distance d(x) is mapped to density as σ(x) = α · Ψ_β(−d(x)), where Ψ_β is the CDF of a zero-mean Laplace distribution with scale β. β controls the sharpness of the SDF-to-density transition: a smaller β concentrates density near the zero level set and produces sharper surfaces, while a larger β yields smoother, more diffuse boundaries. α acts as a global scale factor on the density magnitude, influencing overall opacity and rendering stability.
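A minimal sketch of this SDF-to-density mapping, following the Laplace-CDF form from the VolSDF paper (the function name is assumed):

```python
import torch

def sdf_to_density(signed_distance, alpha, beta):
    # VolSDF: sigma(x) = alpha * Psi_beta(-d(x)), with Psi_beta the CDF of a
    # zero-mean Laplace distribution with scale beta.
    s = -signed_distance
    psi = 0.5 * torch.exp(-torch.abs(s) / beta)   # numerically safe half-exponential
    psi = torch.where(s <= 0, psi, 1.0 - psi)     # Laplace CDF, piecewise by sign
    return alpha * psi
```

With this form, shrinking β concentrates the density jump at the zero level set (sharper surfaces), while α uniformly scales how opaque the interior becomes.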
Implemented large-scene reconstruction with multiple spheres of a given radius clustered in 3D space.
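A hypothetical sketch of how such a scene can be composed: the SDF of a union of spheres is the pointwise minimum of the individual sphere SDFs (the centers and radius below are illustrative, not the assignment's values):

```python
import torch

def multi_sphere_sdf(points, centers, radius):
    # points: (N, 3) query locations; centers: (M, 3) sphere centers.
    dists = torch.cdist(points, centers)          # (N, M) point-to-center distances
    return (dists - radius).min(dim=-1).values    # union of spheres via pointwise min

# Example: 20 spheres of radius 0.25 scattered around the origin.
centers = torch.randn(20, 3) * 1.5
queries = torch.randn(4096, 3) * 3.0
sdf_values = multi_sphere_sdf(queries, centers, radius=0.25)
```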