16-825: Assignment 4 – 3D Gaussian Splatting and Diffusion-Guided Optimization

Sagar Chandrashekhar Bellad | Andrew ID: sbellad

I referred to online documentation and StackOverflow, and used GPT, not to copy code blindly or directly, but to understand concepts or code that were new to me or that I did not fully understand.

Q1. 3D Gaussian Splatting

1.1.1 - 1.1.5 Rendering Pre-trained Gaussians (all unit test cases passed)

Q1.1.5 Rendered Scene (View-Independent)

1.2 Training 3D Gaussian Representations (15 points)

Learning Rate Configuration

The following learning rates produced the best performance after experimenting with multiple configurations:

Parameter         | Learning Rate | Description
pre_act_opacities | 0.0007        | Controls transparency; smaller value stabilizes alpha updates.
pre_act_scales    | 0.010         | Determines Gaussian size; moderate learning rate for smooth shape growth.
colours           | 0.020         | Controls RGB appearance; slightly higher rate accelerates color convergence.
means             | 0.0003        | Updates 3D positions; small rate prevents instability in geometry.
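As a minimal sketch of how the rates in the table could be wired into training, the snippet below builds one optimizer group per parameter. The helper name `build_param_groups` and the extra `"name"` key are assumptions made for illustration, not taken from the assignment code. The returned list uses the `{"params": ..., "lr": ...}` layout that `torch.optim.Adam` accepts as parameter groups.

```python
# Learning rates copied from the table above.
LEARNING_RATES = {
    "pre_act_opacities": 0.0007,
    "pre_act_scales": 0.010,
    "colours": 0.020,
    "means": 0.0003,
}


def build_param_groups(params, learning_rates=LEARNING_RATES):
    """Return one optimizer group per parameter, each with its own rate.

    `params` maps a parameter name to the tensor to optimize. The helper
    itself is hypothetical; only the group layout mirrors what PyTorch
    optimizers expect.
    """
    missing = set(learning_rates) - set(params)
    if missing:
        raise KeyError(f"no parameter supplied for: {sorted(missing)}")
    return [
        {"params": [params[name]], "lr": lr, "name": name}
        for name, lr in learning_rates.items()
    ]
```

With real tensors, the result could be passed directly as the first argument of the optimizer, so that means and opacities take small steps while colors and scales move faster.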

Training Details

Training Outputs

Training Progress and Final Renderings for 3D Gaussian Representation

Among the tested configurations, this learning rate set produced the most stable convergence and visually accurate reconstruction. Opacity and mean updates benefited from lower learning rates to prevent flickering or instability, while higher rates for colors and scales accelerated appearance fitting.

1.3.1 Rendering with Spherical Harmonics

Q1.1.5 Without Spherical Harmonics (View-Independent)
Q1.3.1 With Spherical Harmonics (View-Dependent)

Observations and Differences:

Explanation: Spherical harmonics let the color of each Gaussian change with the viewing direction. Without them, every Gaussian shows the same color from all viewpoints, so the render looks flat. With them, view-dependent lighting effects such as reflections and shading are captured, making the object look more natural and detailed.
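To make the view dependence concrete, here is a small sketch that evaluates spherical harmonics up to degree 1 for a single color channel. It assumes the sign convention and the +0.5 offset commonly used in 3D Gaussian Splatting code. The function name `sh_to_color` and the coefficient layout are illustrative assumptions, and the assignment's renderer may use a higher degree.

```python
import math

SH_C0 = 0.28209479177387814  # degree-0 constant, 1 / (2 * sqrt(pi))
SH_C1 = 0.4886025119029199   # degree-1 constant, sqrt(3 / (4 * pi))


def sh_to_color(sh_coeffs, view_dir):
    """Evaluate degree-0/1 spherical harmonics for one color channel.

    sh_coeffs: [dc] (view-independent) or [dc, c1, c2, c3] (degree 1).
    view_dir:  (x, y, z) viewing direction; it is normalized here.
    """
    x, y, z = view_dir
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm

    value = SH_C0 * sh_coeffs[0]
    if len(sh_coeffs) >= 4:
        # Degree-1 terms make the value depend on the viewing direction.
        value += SH_C1 * (-y * sh_coeffs[1] + z * sh_coeffs[2] - x * sh_coeffs[3])
    # Shift by 0.5 and clamp negative values to zero.
    return max(value + 0.5, 0.0)
```

With only the DC coefficient, the function returns the same value for every direction, which corresponds to the view-independent render. Non-zero degree-1 coefficients make the color brighter from some directions than others.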

1.3.2 Harder Scene (Materials Dataset)

Disclaimer: All experiments follow the baseline setup from Question 1.2.2, with isotropic Gaussians and identical training parameters unless stated otherwise.

Training Details

Comparison & Analysis

Setup    | Gaussian Type | PSNR (dB) | SSIM
Baseline | Isotropic     | 16.949    | 0.639
Improved | Anisotropic   | 28.586    | 0.934
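For reference, PSNR can be computed from the mean squared error between a rendered image and its ground truth. The report does not state how the metric was implemented, so this is a sketch of the standard definition. It assumes pixel values in [0, 1] and images flattened to plain lists, and the function name `psnr` is illustrative.

```python
import math


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, every 10 dB gain in PSNR corresponds to a tenfold reduction in mean squared error.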

Explanation of Improvements

The improved setup switches from isotropic to anisotropic Gaussians, so each Gaussian can stretch by a different amount along each of its axes in 3D space, which improves surface fidelity and the reconstruction of material detail. Additionally, the learning rates were fine-tuned to balance color and scale updates, preventing over-smoothing in early iterations. Together these changes raised PSNR by about 11.6 dB (16.949 to 28.586) and SSIM by about 0.30 (0.639 to 0.934).
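The difference between the two Gaussian types can be sketched through the covariance matrix, using the usual 3D Gaussian Splatting factorization Sigma = R S S^T R^T, where R is a rotation and S a diagonal scale matrix. The function names and the (w, x, y, z) quaternion order below are assumptions made for illustration, not the assignment's API.

```python
import math


def quaternion_to_rotation(q):
    """Convert a (w, x, y, z) quaternion to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(scales, quaternion=(1.0, 0.0, 0.0, 0.0)):
    """Build Sigma = R S S^T R^T.

    A single number for `scales` gives an isotropic Gaussian; a 3-tuple
    gives an anisotropic one with a separate extent along each local axis.
    """
    if isinstance(scales, (int, float)):
        scales = (scales, scales, scales)
    rot = quaternion_to_rotation(quaternion)
    # m = R @ diag(scales)
    m = [[rot[i][j] * scales[j] for j in range(3)] for i in range(3)]
    # Sigma = m @ m^T
    return [
        [sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
```

For an isotropic Gaussian the rotation cancels and the covariance is always a scaled identity (a sphere). An anisotropic Gaussian can flatten into a thin disc aligned with a surface, which is one plausible reason it fits the materials scene better.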

Harder Scene – Training and Final Renders

Q2. Diffusion-Guided Optimization

2.1 Image Optimization (SDS Loss)

2.2 Texture Map Optimization for Mesh

Mesh Texture Optimization – “Cow covered in iridescent rainbow” and “Golden metallic cow statue with reflective surface”

2.3 NeRF Optimization

2.4.1 View-Dependent Text Embeddings

Additional Observation: With view-dependent text embeddings, the results look brighter and more consistent across different views. For example, in the potted plant scene, the leaves appear slightly disconnected from the pot without view-dependence, but with it, the geometry and colors stay aligned and look much more natural.
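One common way to obtain view-dependent text embeddings is to append a view phrase to the prompt based on the camera angles, and then encode each variant with the text encoder. The sketch below covers only the prompt selection step. The function name, the angle thresholds, and the convention that azimuth 0 faces the front are all illustrative assumptions, not the settings used in this assignment.

```python
def view_dependent_prompt(prompt, azimuth_deg, elevation_deg=0.0,
                          overhead_threshold_deg=60.0):
    """Append a view phrase chosen from the camera azimuth and elevation.

    Thresholds are hypothetical; azimuth 0 is assumed to face the front.
    """
    if elevation_deg > overhead_threshold_deg:
        return f"{prompt}, overhead view"
    azimuth = azimuth_deg % 360.0  # wrap negative or large angles into [0, 360)
    if azimuth < 45.0 or azimuth >= 315.0:
        view = "front view"
    elif 135.0 <= azimuth < 225.0:
        view = "back view"
    else:
        view = "side view"
    return f"{prompt}, {view}"
```

Conditioning the diffusion guidance on the matching phrase for each sampled camera discourages the model from painting the same front-facing appearance onto every side, which is consistent with the more coherent geometry noted above.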