Assignment 4
author: Yu Jin Goh (yujing)
Q1.1 3D Gaussian Rasterization
Rendered GIF:

Q1.2 Training 3D Gaussian Representations
Learning rates used:
| Learning Rate Variable | Value |
| :------- | :------: |
| pre_act_opacities | 0.0005 |
| pre_act_scales | 0.005 |
| colours | 0.005 |
| means | 0.0005 |
Number of training iterations: 1000
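The per-parameter learning rates above map naturally onto optimizer parameter groups. The sketch below is one way to build them; `build_param_groups` is a hypothetical helper, and the choice of optimizer is an assumption, since the report does not name one.

```python
# Learning rates copied from the table above.
LEARNING_RATES = {
    "pre_act_opacities": 0.0005,
    "pre_act_scales": 0.005,
    "colours": 0.005,
    "means": 0.0005,
}


def build_param_groups(params, learning_rates=LEARNING_RATES):
    """Return one optimizer parameter group per named tensor.

    `params` maps a parameter name to its tensor. The returned list has the
    shape PyTorch optimizers accept as their first argument; the extra "name"
    key is only kept for bookkeeping.
    """
    missing = set(params) - set(learning_rates)
    if missing:
        raise KeyError(f"no learning rate for: {sorted(missing)}")
    return [
        {"params": [tensor], "lr": learning_rates[name], "name": name}
        for name, tensor in params.items()
    ]
```

Assuming the Gaussian parameters are held in a dict of tensors, the groups could then be handed to an optimizer, for example `torch.optim.Adam(build_param_groups(gaussian_params))`. Adam is used here purely as an illustration.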
| Metric | Value |
|---|---|
| PSNR | 29.446 |
| SSIM | 0.937 |
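For reference, PSNR is a log-scaled mean squared error between the rendered and ground-truth pixels. The following is a minimal sketch of the standard definition, assuming pixel values in [0, 1] and flat lists of values; the assignment's own evaluation code may aggregate over images differently.

```python
import math


def psnr(rendered, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(rendered) != len(target) or not rendered:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Under these assumptions, a PSNR of 29.446 dB corresponds to a mean squared error of roughly 1.1e-3.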
Training progress:

Final Render:

Q1.3.1 Rendering Using Spherical Harmonics
Previous Render (Q1.1.5):

Now with view-dependent effects (Q1.3.1):

| No View Dependence | View Dependence | Explanation |
|---|---|---|
| ![]() | ![]() | Without view-dependent effects, the patterns on the chair look yellow and diffuse. With view-dependent effects, they look gold, because different parts of the pattern now emit different colors depending on the viewing angle. |
| ![]() | ![]() | In this view, the decorations on the chair gain highlights once view-dependent effects are included, and the shadows on parts of the cloth vary with the viewing angle. |
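The view dependence comes from evaluating each Gaussian's spherical harmonic coefficients along the viewing direction. The sketch below goes only up to degree 1 to keep it short. The sign convention and the 0.5 offset follow a convention commonly used in 3D Gaussian splatting code and are assumptions here; `sh_to_color` is a hypothetical name, not the assignment's function.

```python
import math

SH_C0 = 0.28209479177387814  # 1 / (2 * sqrt(pi)), degree-0 basis constant
SH_C1 = 0.4886025119029199   # sqrt(3 / (4 * pi)), degree-1 basis constant


def sh_to_color(sh_coeffs, view_dir):
    """Evaluate degree-0 and degree-1 spherical harmonics for one Gaussian.

    sh_coeffs: four RGB triples (the DC term followed by three degree-1 terms).
    view_dir:  3-vector from the camera towards the Gaussian (any length > 0).
    Returns an RGB list clamped to [0, 1].
    """
    norm = math.sqrt(sum(c * c for c in view_dir))
    x, y, z = (c / norm for c in view_dir)
    # Basis values: the first is constant, the rest depend on the direction.
    basis = (SH_C0, -SH_C1 * y, SH_C1 * z, -SH_C1 * x)
    color = []
    for channel in range(3):
        value = sum(b * coeff[channel] for b, coeff in zip(basis, sh_coeffs))
        color.append(min(max(value + 0.5, 0.0), 1.0))
    return color
```

With only the DC coefficient set, the color is the same from every direction, which matches the diffuse look in the left column. Non-zero higher-degree coefficients make the color change with the viewing direction.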
Q2.1 SDS Loss + Image Optimization
| Prompt | No Guidance | With Guidance |
|---|---|---|
| a hamburger | ![]() | ![]() |
| a standing corgi dog | ![]() | ![]() |
| a unicorn | ![]() | ![]() |
| a flying fire breathing dragon | ![]() | ![]() |
Q2.2 Texture Map Optimization for Mesh
| Prompt | Generated Texture |
|---|---|
| a white dairy cow with pink nose and black spots | ![]() |
| a fearsome orange and black striped tiger | ![]() |
Q2.3 NeRF Optimization
| Prompt | RGB | Depth |
|---|---|---|
| a standing corgi dog | ![]() | ![]() |
| a white unicorn | ![]() | ![]() |
| a tiger | ![]() | ![]() |
Q2.4.1 View-dependent text embedding
| Prompt | RGB | Depth |
|---|---|---|
| a standing corgi dog | ![]() | ![]() |
| a white unicorn | ![]() | ![]() |
It's clear that view-dependent text conditioning improves the generated volume. In the generated radiance fields, the corgi now has only 2 ears instead of 3, and the unicorn has one horn instead of two. This is because each rendered image is now scored against a prompt that describes the object from the corresponding view, which is a more likely observation for that camera pose.
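One common way to implement view-dependent conditioning is to append a view phrase to the prompt based on the sampled camera pose, then embed that prompt for scoring. The sketch below illustrates the idea. The function name, the angle thresholds, and the exact phrases are all assumptions for illustration, not values taken from this assignment.

```python
def view_dependent_prompt(prompt, azimuth_deg, elevation_deg,
                          overhead_threshold_deg=60.0,
                          front_half_width_deg=45.0):
    """Append a view phrase to `prompt` for the given camera angles.

    azimuth_deg:   camera azimuth in degrees, 0 = directly in front.
    elevation_deg: camera elevation above the horizon in degrees.
    The two thresholds are hypothetical defaults.
    """
    if elevation_deg > overhead_threshold_deg:
        return f"{prompt}, overhead view"
    azimuth = azimuth_deg % 360.0
    # Angular distance from the front direction, in [0, 180].
    offset = min(azimuth, 360.0 - azimuth)
    if offset <= front_half_width_deg:
        view = "front"
    elif offset >= 180.0 - front_half_width_deg:
        view = "back"
    else:
        view = "side"
    return f"{prompt}, {view} view"
```

For example, `view_dependent_prompt("a standing corgi dog", 180.0, 10.0)` yields a back-view prompt, so a render from behind is no longer pushed towards showing the dog's face.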