Name: Minghao Xu | Andrew ID: mxu3
Deliverable: GIF rendered from a pre-trained Gaussian model.
I implemented the core 3D Gaussian rasterization pipeline, including projection, alpha/opacity calculation, and final blending. The GIF below shows the result.
Observation: The rendering output confirms that the core 3D Gaussian Splatting pipeline is implemented correctly. The smooth color gradient in the depth map (blue = near, yellow = far) shows that the 3D Gaussians are projected correctly and that the depth-based sorting is functional. The clean, sharp boundary in the mask/silhouette image confirms that the opacity computation and the volumetric accumulation (transmittance-weighted alpha blending of color) are executed correctly.
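To make the accumulation step concrete, here is a minimal sketch of front-to-back alpha compositing for a single pixel's Gaussians. It is illustrative only; the `composite` helper and its tensor shapes are assumptions, not the assignment's actual API.

```python
import torch

def composite(colors, alphas, depths):
    """Front-to-back alpha compositing for one pixel's Gaussians.

    A minimal sketch of the accumulation described above; the shapes
    colors (N, 3), alphas (N,), depths (N,) are assumptions.
    """
    order = torch.argsort(depths)                       # near-to-far sort
    colors, alphas = colors[order], alphas[order]
    # Transmittance T_i = prod_{j < i} (1 - alpha_j), with T_0 = 1.
    trans = torch.cumprod(1.0 - alphas, dim=0)
    trans = torch.cat([torch.ones_like(trans[:1]), trans[:-1]])
    weights = trans * alphas                            # per-Gaussian weight
    rgb = (weights[:, None] * colors).sum(dim=0)        # blended color
    mask = weights.sum()                                # accumulated opacity
    return rgb, mask
```

The accumulated weight sum is exactly what produces the silhouette/mask image discussed above.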
Deliverable: GIF showing the final rendered toy truck after training.
We trained the 3D Gaussian representation of the toy truck using isotropic Gaussians initialized from a point cloud. Training ran for 1000 iterations with a **differential learning-rate strategy** (a different rate per parameter group) for fast, stable convergence; a minimal optimizer sketch follows the table.
| Parameter | Learning Rate |
|---|---|
| opacities | 0.001 |
| scales | 0.003 |
| colours | 0.02 |
| means | 0.01 |
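A minimal sketch of how these rates map onto PyTorch parameter groups; the parameter tensors here are hypothetical stand-ins for the trained model's attributes.

```python
import torch

# Hypothetical tensors standing in for the model's learnable attributes.
opacities = torch.nn.Parameter(torch.zeros(1000, 1))
scales = torch.nn.Parameter(torch.zeros(1000, 1))   # isotropic: one scale each
colours = torch.nn.Parameter(torch.rand(1000, 3))
means = torch.nn.Parameter(torch.randn(1000, 3))

# One Adam optimizer, one learning rate per parameter group (see table above).
optimizer = torch.optim.Adam([
    {"params": [opacities], "lr": 0.001},
    {"params": [scales], "lr": 0.003},
    {"params": [colours], "lr": 0.02},
    {"params": [means], "lr": 0.01},
])
```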
- Trained Iterations: 1000
- Mean PSNR: 29.811
- Mean SSIM: 0.939
Final Render GIF
Training Progress GIF
Deliverables:
render.py for questions 1.3.1 (SH Rendering) and 1.1.5 (Base Rendering). I extended the base 3D Gaussian rasterizer from Q1.1.5 to compute each Gaussian's color contribution from **Spherical Harmonics (SH)** coefficients. This models **view-dependent lighting effects** (such as highlights and reflections), significantly improving realism over the view-independent fixed-color model.
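For reference, a minimal sketch of degree-1 SH color evaluation in the common 3DGS convention; the coefficient layout `(N, 4, 3)` and the `sh_to_color` name are assumptions about the implementation, not a prescribed API.

```python
import torch

# Real SH basis constants for degrees 0 and 1.
SH_C0 = 0.28209479177387814
SH_C1 = 0.4886025119029199

def sh_to_color(sh_coeffs, view_dirs):
    """Evaluate view-dependent RGB from degree-1 spherical harmonics.

    sh_coeffs: (N, 4, 3) per-Gaussian coefficients (DC + three linear terms);
    view_dirs: (N, 3) unit vectors from the camera to each Gaussian center.
    """
    x, y, z = view_dirs[:, 0:1], view_dirs[:, 1:2], view_dirs[:, 2:3]
    color = SH_C0 * sh_coeffs[:, 0]
    color = color - SH_C1 * y * sh_coeffs[:, 1]
    color = color + SH_C1 * z * sh_coeffs[:, 2]
    color = color - SH_C1 * x * sh_coeffs[:, 3]
    # Shift from the SH range into [0, 1] RGB, as in common 3DGS code.
    return torch.clamp(color + 0.5, 0.0, 1.0)
```

The degree-0 (DC) term alone reproduces the Q1.1.5 fixed-color behavior; the linear terms are what introduce the view dependence observed below.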
Observation (GIF Summary): The Q1.3.1 GIF demonstrates a clear jump in fidelity over the Q1.1.5 base render. While both GIFs show the same geometry (depth and silhouette are preserved), the SH render exhibits **dynamic specular highlights** and **subtle, smooth shading transitions** as the viewpoint changes. The base render, by contrast, appears uniformly lit and flat, validating that the integration of higher-order SH coefficients successfully simulates view-dependent reflection and lighting.
Deliverable: Four optimized images showing the effect of Classifier-Free Guidance (CFG).
We implemented the SDS loss and compare the results of optimizing a latent vector with and without CFG (guidance scale > 1).
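A hedged sketch of the guided SDS gradient, assuming a diffusers-style UNet and scheduler; the `guided_sds_loss` name, the timestep range, and the default guidance scale are illustrative choices, not the assignment's exact values.

```python
import torch
import torch.nn.functional as F

def guided_sds_loss(latents, unet, scheduler, text_emb, uncond_emb,
                    guidance_scale=100.0):
    """One SDS step with classifier-free guidance.

    `unet` and `scheduler` are assumed to follow the diffusers API
    (UNet2DConditionModel / DDPMScheduler); setting guidance_scale = 1
    disables CFG, which is the "without CFG" case compared above.
    """
    device = latents.device
    # Sample a random diffusion timestep away from the extremes.
    t = torch.randint(50, 950, (1,), device=device)
    noise = torch.randn_like(latents)
    noisy = scheduler.add_noise(latents, noise, t)

    # Classifier-free guidance: one unconditional and one conditional pass.
    with torch.no_grad():
        eps_uncond = unet(noisy, t, encoder_hidden_states=uncond_emb).sample
        eps_cond = unet(noisy, t, encoder_hidden_states=text_emb).sample
    eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)

    # SDS gradient w(t) * (eps_hat - eps); the UNet Jacobian is skipped.
    w = 1.0 - scheduler.alphas_cumprod.to(device)[t]
    grad = w * (eps - noise)

    # Surrogate loss whose gradient w.r.t. `latents` equals `grad`.
    target = (latents - grad).detach()
    return 0.5 * F.mse_loss(latents, target, reduction="sum")
```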
Deliverable: Two GIFs showing the final textured mesh views.
I optimized the texture map of the provided cow mesh using the SDS loss, demonstrating photorealistic texture generation guided by text prompts.
Deliverable: Three pairs of RGB and Depth videos for different prompts. (Displayed below as GIFs)
Deliverable: RGB and Depth videos with view-dependent conditioning compared with Q2.3 results.
The DreamFusion paper proposes view-dependent text embedding to achieve better 3D consistency by conditioning the diffusion model on the viewing angle. This addresses issues like multiple front-facing features (e.g., overlapping ears) that occur when optimizing each view independently.
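In practice this can be as simple as appending a view word to the prompt based on the camera azimuth; the thresholds below are illustrative, not the paper's exact values.

```python
def view_dependent_prompt(base_prompt: str, azimuth_deg: float) -> str:
    """Append a view descriptor to the text prompt based on camera azimuth."""
    azimuth = azimuth_deg % 360
    if azimuth < 45 or azimuth >= 315:
        view = "front view"
    elif azimuth < 135:
        view = "side view"
    elif azimuth < 225:
        view = "back view"
    else:
        view = "side view"
    return f"{base_prompt}, {view}"

# e.g. view_dependent_prompt("a corgi", 180.0) -> "a corgi, back view"
```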
RGB
Depth
RGB
Depth
Observation: Enabling view-dependent (VD) embedding effectively resolved the issue of multiple overlapping ears observed in the Q2.3 corgi model. The VD version shows more coherent geometry and a cleaner, more realistic silhouette as the viewing angle changes. The depth map also appears more consistent and uniform across different views.
RGB
Depth
RGB
Depth
Observation - Challenging Geometry:
The tuna fish exhibits the most severe multi-view inconsistency among all tested objects, with duplicate tail fins visible from opposite viewing angles. This occurs even with view-dependent conditioning and highlights a fundamental challenge for SDS-based 3D generation of elongated, highly asymmetric objects.
Why the Tuna Fish Is Particularly Difficult:
Comparison with Other Objects:
This example demonstrates the limits of SDS optimization for highly directional, asymmetric objects, even with view-dependent conditioning, and suggests that additional geometric constraints or progressive training strategies may be necessary in such cases.
Key Findings Across Different Object Types:
Future Directions: For challenging objects like the tuna fish, potential improvements could include (1) progressive training with directional constraints, (2) density regularization to prevent feature duplication, (3) explicit symmetry-breaking losses, or (4) incorporating shape priors from 3D model datasets.