Assignment 4

1. 3D Gaussian Splatting

Question 1.1.5: Perform Splatting

3D Gaussian Splatting render

Question 1.2.2: Perform Forward Pass and Compute Loss

parameters = [
    {"params": [gaussians.pre_act_opacities], "lr": 0.05, "name": "opacities"},
    {"params": [gaussians.pre_act_scales], "lr": 0.002, "name": "scales"},
    {"params": [gaussians.colours], "lr": 0.02, "name": "colours"},
    {"params": [gaussians.means], "lr": 0.00005, "name": "means"},
    {"params": [gaussians.pre_act_quats], "lr": 0.02, "name": "quats"},
]
These learning rates gave the best results. The model was trained for 2000 iterations.
PSNR: 29.336
SSIM: 0.932
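For reference, PSNR follows from the mean squared error between a rendered image and its ground truth. The sketch below is a minimal illustration, assuming pixel values in [0, 1] stored as flat sequences; the function name `psnr` and the `max_value` parameter are illustrative and are not taken from the assignment code.

```python
import math

def psnr(rendered, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(rendered) != len(target) or len(rendered) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Under this definition, a PSNR of 29.336 dB on a single image would correspond to an MSE of roughly 1.2e-3 on [0, 1] pixel values.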
Training Progress
Final Renders

Question 1.3.1: Rendering Using Spherical Harmonics

View dependent
View dependent rendering
No view dependence
No view dependence rendering
Frame 1 - No view dependence
Frame 1 - View dependent
Frame 2 - No view dependence
Frame 2 - View dependent
The main differences I noticed are that, with spherical harmonics, the green cushion looks clearly shiny and its texture is more discernible. The shininess follows naturally from the view dependence that spherical harmonics enable. The clearer texture is probably because albedo and specularity together make surface detail easier to discern than albedo alone.
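To illustrate where the view dependence comes from, the sketch below evaluates a colour from degree-0 and degree-1 spherical harmonic coefficients for a given viewing direction. It is an assumption-laden illustration rather than the assignment's implementation: the sign pattern of the band-1 terms and the `+ 0.5` offset follow the convention commonly used in 3D Gaussian Splatting code, and the name `sh_to_colour` is hypothetical.

```python
import math

# Real spherical-harmonic constants for bands 0 and 1.
SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # sqrt(3) / (2 * sqrt(pi))

def sh_to_colour(sh_coeffs, view_dir):
    """Evaluate a degree-0/1 SH colour for one Gaussian.

    sh_coeffs: four RGB triples (one DC term followed by three band-1 terms).
    view_dir:  direction from the camera to the Gaussian (any non-zero length).
    """
    norm = math.sqrt(sum(c * c for c in view_dir))
    x, y, z = (c / norm for c in view_dir)
    # Basis values; the band-1 signs are an assumed convention.
    basis = (SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x)
    colour = []
    for ch in range(3):
        value = sum(b * coeff[ch] for b, coeff in zip(basis, sh_coeffs))
        # Assumed offset and clamp to keep the colour in [0, 1].
        colour.append(min(max(value + 0.5, 0.0), 1.0))
    return colour
```

With only the DC term non-zero, the colour is identical from every direction, which corresponds to the "no view dependence" renders. Non-zero band-1 coefficients make the colour change with `view_dir`, which is what allows highlights such as the shiny cushion.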

2. Diffusion-guided Optimization

Question 2.1: SDS Loss + Image Optimization (20 points)

a hamburger
With guidance
Hamburger with guidance
Without guidance
Hamburger without guidance
a standing corgi dog
With guidance
Corgi with guidance
Without guidance
Corgi without guidance
a sleepy kitten
With guidance
Kitten with guidance
Without guidance
Kitten without guidance
a standing penguin
With guidance
Penguin with guidance
Without guidance
Penguin without guidance
All of the above results were obtained by training for 2000 iterations.
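The difference between the two settings can be sketched as follows, assuming "guidance" here means classifier-free guidance applied to the diffusion model's noise prediction. The function names, the flat-list representation of latents, and the `weight` parameter are illustrative assumptions, not the assignment's code.

```python
def cfg_noise(noise_uncond, noise_text, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction towards the text-conditioned one."""
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]

def sds_grad(noise_pred, noise, weight=1.0):
    """SDS gradient with respect to the latents:
    weight * (predicted noise - injected noise)."""
    return [weight * (p - n) for p, n in zip(noise_pred, noise)]

# Toy values standing in for the model's two noise predictions.
uncond = [0.0, 1.0]
text = [1.0, 3.0]
injected = [0.5, 1.0]

# guidance_scale = 1 reduces to the text-conditioned prediction alone.
grad_plain = sds_grad(cfg_noise(uncond, text, 1.0), injected)
# A large scale amplifies the text-specific direction.
grad_guided = sds_grad(cfg_noise(uncond, text, 100.0), injected)
```

In this sketch, a guidance scale of 1 uses only the text-conditioned prediction, while larger scales push the gradient further along the direction that distinguishes the prompt from the unconditional prediction.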

Question 2.2: Texture Map Optimization for Mesh

a dotted black and white cow
Dotted black and white cow mesh
an orange golden bull
Orange golden bull mesh

Question 2.3: NeRF Optimization

a hamburger
RGB
Depth
a standing corgi dog
RGB
Depth
a sleepy orange cat
RGB
Depth

Question 2.4.1: View-dependent text embedding

a standing corgi dog
RGB
Depth
a sleepy orange cat
RGB
Depth
In the previous section, the same prompts without view-dependent conditioning produced a corgi and a cat with three ears each. This happens because, without view information in the prompt, the optimization implicitly tries to make every view look like a typical front-facing corgi or cat. View-dependent conditioning resolves this issue.
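One common way to implement view-dependent conditioning is to append a view description to the prompt based on the sampled camera pose. The sketch below is a hedged illustration: the function name, the angle thresholds, the suffix wording, and the convention that azimuth 0 faces the front of the object are all assumptions rather than details from the assignment code.

```python
def view_dependent_prompt(prompt, azimuth_deg, elevation_deg):
    """Append a coarse view description to a text prompt.

    azimuth_deg:   camera azimuth in degrees (0 assumed to face the front).
    elevation_deg: camera elevation in degrees above the horizon.
    """
    azimuth = azimuth_deg % 360.0  # wrap into [0, 360)
    if elevation_deg > 60.0:       # assumed threshold for a top-down view
        suffix = "overhead view"
    elif azimuth < 45.0 or azimuth >= 315.0:
        suffix = "front view"
    elif 135.0 <= azimuth < 225.0:
        suffix = "back view"
    else:
        suffix = "side view"
    return f"{prompt}, {suffix}"
```

For example, `view_dependent_prompt("a standing corgi dog", 180, 10)` returns `"a standing corgi dog, back view"`, so views from behind are no longer pushed towards a front-facing appearance.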