Learning for 3D Vision: Assignment 4
- Name: Ahish Deshpande
- Andrew ID: ahishd
1. 3D Gaussian Splatting
1.1. Fitting a Voxel Grid
1.1.2. Fitting a Voxel Grid
All unit tests pass: 4/4.
1.1.5. Perform Splatting
| Splatting Output | |
|---|---|
| Still | |
| Output | |
1.2. Training 3D Gaussian Representations
1.2.2. Perform Forward Pass and Compute Loss
Final Render and Training Progress

| Final Render | Training Progress |
|---|---|
| | |
Learning Rates Tried
| Opacities | Scales | Colours | Means | Comments | Render | Progress |
|---|---|---|---|---|---|---|
| 0.00005 | 0.0001 | 0.00001 | 0.000001 | Model Underfitted | | |
| 0.001 | 0.03 | 0.05 | 0.0005 | Model Overfitted | | |
| 0.0001 | 0.001 | 0.001 | 0.00002 | Good representation | | |
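
The table above assigns a separate learning rate to each Gaussian attribute. A minimal sketch of how such per-attribute rates could be organised is shown below, using the values from the "Good representation" row. The helper `build_param_groups` and the placeholder parameters are hypothetical and are not taken from the assignment code.

```python
# Learning rates copied from the "Good representation" row of the table.
GOOD_LEARNING_RATES = {
    "opacities": 0.0001,
    "scales": 0.001,
    "colours": 0.001,
    "means": 0.00002,
}


def build_param_groups(parameters, learning_rates):
    """Pair each named parameter with its own learning rate (hypothetical helper)."""
    missing = set(parameters) - set(learning_rates)
    if missing:
        raise KeyError(f"no learning rate for: {sorted(missing)}")
    return [
        {"name": name, "params": [param], "lr": learning_rates[name]}
        for name, param in parameters.items()
    ]


# Placeholder objects stand in for the real tensors in this sketch.
placeholder_parameters = {name: object() for name in GOOD_LEARNING_RATES}
param_groups = build_param_groups(placeholder_parameters, GOOD_LEARNING_RATES)

for group in param_groups:
    print(group["name"], group["lr"])
```

With PyTorch, a list in this shape (with real tensors under `params`) can be passed to an optimizer such as `torch.optim.Adam`, which then applies each group's `lr` separately. Which optimizer the assignment actually used is not stated here.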
Number of Iterations
The best-performing render was trained for 2000 iterations.
PSNR and SSIM
PSNR: 23.445
SSIM: 0.860
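
For reference, PSNR is commonly defined as 10 · log10(MAX² / MSE), where MSE is the mean squared error between the rendered and ground-truth images. The sketch below computes it in plain Python, assuming images are flat lists of floats in [0, 1]. The function names are illustrative, and the evaluation code that produced the numbers above may differ (for example in how it averages over views).

```python
import math


def mean_squared_error(rendered, target):
    """Mean squared error between two equally sized flat pixel lists."""
    if len(rendered) != len(target) or not rendered:
        raise ValueError("images must be non-empty and have the same size")
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)


def psnr(rendered, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    mse = mean_squared_error(rendered, target)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Tiny two-pixel example: each pixel is off by 0.1, so the MSE is about 0.01.
example_psnr = psnr([0.5, 0.5], [0.6, 0.4])
print(round(example_psnr, 3))
```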
1.3. Extensions
1.3.1. Rendering Using Spherical Harmonics
| Final Render (with View Dependence) | Final Render (without View Dependence) |
|---|---|
| | |
| Frame No. | View Dependent Splat | View Independent Splat | Difference Explanation |
|---|---|---|---|
| 3 | | | Across the views, the lights appear to be placed at the top left and bottom right, facing the chair. The view-dependent rendering shows the corresponding shadows. |
| 13 | | | Across the views, the lights appear to be placed at the top left and bottom right, facing the chair. The view-dependent rendering shows the corresponding shadows. |
| 31 | | | Across the views, the lights appear to be placed at the top left and bottom right, facing the chair. The view-dependent rendering shows the corresponding shadows. |
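
To illustrate where the view dependence in the comparison above comes from, here is a minimal sketch of evaluating a colour from spherical harmonics coefficients up to degree 1. The constants, sign convention, and the 0.5 offset follow the widely used reference 3D Gaussian Splatting implementation. It is an assumption that the assignment code uses the same convention, and the function names are illustrative.

```python
import math

SH_C0 = 0.28209479177387814  # degree-0 basis constant
SH_C1 = 0.4886025119029199   # degree-1 basis constant


def normalize(direction):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("view direction must be non-zero")
    return [c / norm for c in direction]


def sh_to_rgb(sh_coeffs, view_dir):
    """Evaluate RGB from four [r, g, b] coefficient triples (degrees 0 and 1)."""
    x, y, z = normalize(view_dir)
    rgb = []
    for k in range(3):
        value = SH_C0 * sh_coeffs[0][k]      # view-independent (DC) term
        value += -SH_C1 * y * sh_coeffs[1][k]  # view-dependent terms
        value += SH_C1 * z * sh_coeffs[2][k]
        value += -SH_C1 * x * sh_coeffs[3][k]
        rgb.append(max(0.0, value + 0.5))    # shift and clamp below at zero
    return rgb


# DC-only coefficients give the same colour from every direction.
dc_only = [[1.0, 0.0, -1.0], [0.0] * 3, [0.0] * 3, [0.0] * 3]
# Adding a degree-1 coefficient makes the colour change with the view.
with_degree_1 = [[1.0, 0.0, -1.0], [0.0] * 3, [0.5] * 3, [0.0] * 3]

print(sh_to_rgb(dc_only, [0.0, 0.0, 1.0]), sh_to_rgb(dc_only, [1.0, 0.0, 0.0]))
print(sh_to_rgb(with_degree_1, [0.0, 0.0, 1.0]), sh_to_rgb(with_degree_1, [1.0, 0.0, 0.0]))
```

Using only the degree-0 term corresponds to the view-independent render, since that term ignores the viewing direction.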
2. Diffusion-guided Optimization
2.1. SDS Loss + Image Optimization
| Prompt | Iterations | Guidance = 0 | Guidance = 1 |
|---|---|---|---|
| A Hamburger | 1600 | | |
| A Standing Corgi Dog | 1900 | | |
| A Mansion | 1600 | | |
| A DSLR | 1700 | | |
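
As background for the guidance comparison above, the sketch below shows the usual form of classifier-free guidance and of the SDS gradient on plain Python lists. The report does not say whether "Guidance" is an on/off flag or a scale, so the `guidance_scale` parameter, the timestep `weight`, and both function names are assumptions made for illustration only.

```python
def classifier_free_guidance(eps_uncond, eps_text, guidance_scale):
    """Move the unconditional noise prediction toward the text-conditioned one."""
    return [u + guidance_scale * (t - u) for u, t in zip(eps_uncond, eps_text)]


def sds_gradient(eps_pred, eps_true, weight):
    """SDS gradient w.r.t. the latents: weight * (predicted noise - added noise)."""
    return [weight * (p - e) for p, e in zip(eps_pred, eps_true)]


# Toy two-element "noise" vectors, purely illustrative.
eps_uncond = [0.0, 1.0]
eps_text = [1.0, 3.0]
eps_true = [0.5, 1.0]

guided = classifier_free_guidance(eps_uncond, eps_text, guidance_scale=1.0)
gradient = sds_gradient(guided, eps_true, weight=2.0)
print(guided, gradient)
```

In this formulation a scale of 0 ignores the text prompt entirely, a scale of 1 reproduces the text-conditioned prediction, and larger scales push further in the direction of the prompt.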
2.2. Texture Map Optimization for Mesh
| Final Renders: Textured Cow Mesh | |
|---|---|
| | |
2.3. NeRF Optimization
| Prompt | RGB Render (Video) | Depth Map (Video) |
|---|---|---|
| A Standing Corgi Dog | | |
| A Squirrel in a Cello | | |
| A Baby Lion | | |
2.4. Extensions
2.4.1. Extensions (View Dependent Conditioning)
| Prompt | RGB (View-Dependent) | Depth (View-Dependent) | Original RGB (2.3) |
|---|---|---|---|
| A Standing Corgi Dog | | | |
| A Squirrel in a Cello | | | |
| A Cute Panda | | | |
As the comparisons above show, view-dependent conditioning keeps the output consistent across views and avoids the problem of multiple front-facing views appearing in the model. It does, however, take longer to train, so within 100 iterations it cannot produce a detailed rendering of the prompts. Even so, the renderings above show that adding view-dependent embeddings helps ensure consistency.
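
One common way to implement view-dependent conditioning is to append a view description to the text prompt according to the sampled camera pose, so that the diffusion model is asked for a front, side, back, or overhead view as appropriate. The sketch below illustrates that idea. The function name, the angle thresholds, and the exact suffix wording are assumptions and are not taken from the assignment code.

```python
def view_dependent_prompt(prompt, azimuth_deg, elevation_deg,
                          overhead_threshold=60.0, half_width=45.0):
    """Append a view description chosen from the camera angles (hypothetical helper)."""
    azimuth = azimuth_deg % 360.0  # wrap into [0, 360)
    if elevation_deg > overhead_threshold:
        view = "overhead view"
    elif azimuth <= half_width or azimuth >= 360.0 - half_width:
        view = "front view"
    elif 180.0 - half_width <= azimuth <= 180.0 + half_width:
        view = "back view"
    else:
        view = "side view"
    return f"{prompt}, {view}"


for azimuth in (0, 90, 180, 270):
    print(view_dependent_prompt("a standing corgi dog", azimuth, 15))
```

The text embedding for each suffix can be computed once and then selected per camera during training, which keeps the extra cost of the conditioning small.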