Prompt: “a hamburger”
CMU 16‑825 • Vaishnavi Khindkar (vkhindka@andrew.cmu.edu)
We render the pre‑trained chair.ply (original 3DGS export). The panel shows RGB, coloured depth (jet), and silhouette. Camera is placed on a circle (32 views), white background.
Output: q1_render.gif
python render.py --data_path data/chair.ply --out_path output --device cuda --gaussians_per_splat 2048 --img_dim 256
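The circular camera placement used for this render can be sketched as follows. This is a minimal illustration, not the code in render.py: the function name `circle_camera_positions`, the radius, the height, and the y-up axis convention are all assumptions chosen for the example. Only the view count (32) comes from the description above.

```python
import math

def circle_camera_positions(num_views=32, radius=6.0, height=0.0):
    """Return (x, y, z) camera centres evenly spaced on a horizontal circle.

    Assumes a y-up world with the object at the origin; radius and height
    are hypothetical values, not taken from render.py.
    """
    positions = []
    for i in range(num_views):
        azimuth = 2.0 * math.pi * i / num_views  # evenly spaced angles in [0, 2*pi)
        x = radius * math.sin(azimuth)
        z = radius * math.cos(azimuth)
        positions.append((x, height, z))
    return positions

cams = circle_camera_positions()
```

Each position would then be turned into a look-at pose pointing at the origin before rendering one frame of the GIF.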
We train isotropic Gaussians from an input point cloud for the toy‑truck multi‑view dataset.
viz_freq iters.

| Setting | Value |
|---|---|
| Learning rates | 1e-3, 5e-3, 1e-2, 5e-2 |
| Iterations | 2000 |
| Gaussians per splat | 2048 |
| PSNR | 31.455 |
| SSIM | 0.955 |

python Q1/train.py --device cuda --num_itrs 2000 --gaussians_per_splat 2048 --viz_freq 50
We enable SH‑based view‑dependent appearance (degree 3) by loading SH coefficients from chair.ply and evaluating colours per view. We compare the result against the DC‑only (0th‑order) rendering.
Output: output1.3/q1_render_with_sh.gif
python render.py --data_path data/chair.ply --out_path output1.3 --device cuda --gaussians_per_splat 2048 --img_dim 256
Ensure that the SH blocks in model.py and data_utils.py are enabled.
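To illustrate the difference between DC-only and view-dependent colour, here is a minimal sketch of SH colour evaluation truncated to degree 1 (the render above uses degree 3). It assumes the real SH constants and sign convention of the reference 3DGS code, plus a 0.5 offset before clamping. The function name `sh_to_rgb` and the list-based coefficient layout are hypothetical and do not mirror model.py.

```python
# Real spherical-harmonic constants for degrees 0 and 1.
C0 = 0.28209479177387814
C1 = 0.4886025119029199

def sh_to_rgb(sh, view_dir, degree=1):
    """Evaluate RGB from SH coefficients for one Gaussian.

    sh: list of (r, g, b) coefficient triples, one per basis function.
    view_dir: unit vector (x, y, z) from the camera to the Gaussian.
    degree: 0 gives the DC-only colour, 1 adds the first view-dependent band.
    """
    x, y, z = view_dir
    basis = [C0]
    if degree >= 1:
        basis += [-C1 * y, C1 * z, -C1 * x]
    rgb = []
    for ch in range(3):
        value = sum(b * sh[k][ch] for k, b in enumerate(basis))
        rgb.append(min(max(value + 0.5, 0.0), 1.0))  # offset, then clamp to [0, 1]
    return rgb

coeffs = [(0.2, 0.2, 0.2), (0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (0.0, 0.0, 0.0)]
dc_only = sh_to_rgb(coeffs, (0.0, 0.0, 1.0), degree=0)
front = sh_to_rgb(coeffs, (0.0, 0.0, 1.0), degree=1)
back = sh_to_rgb(coeffs, (0.0, 0.0, -1.0), degree=1)
```

With degree 0 the colour is the same from every direction; with degree 1 the `front` and `back` colours differ, which is the view-dependent effect visible in the SH render.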
Baseline: isotropic Gaussians with random initialization on NeRF‑Synthetic materials (128×128). No SH training; same renderer and loss as 1.2.

| Setting | Value |
|---|---|
| Iterations | 3000 (example run) |
| Gaussians per splat | -1 (one‑shot) or 2048 for lower memory |

python train_harder_scene.py --data_path ./data/materials --out_path ./output_harder --device cuda --gaussians_per_splat -1 --num_itrs 3000 --viz_freq 50
Baseline result: PSNR 18.412, SSIM 0.708.
We optimize 3D content from text via Score Distillation Sampling (SDS), first on a triangle mesh, then on a NeRF, and finally explore extensions (view-dependent text, other 3D representations, pixel-space variants). Guidance comes from Stable Diffusion 2.1; we apply simple regularizers (alpha entropy, orientation).
We implemented SDS in SDS.py, with and without classifier-free guidance. Below we show
results for two prompts (“hamburger”, “standing corgi dog”) under both settings.
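The two settings differ only in how the noise prediction is formed before it enters the SDS gradient. The sketch below shows that arithmetic on toy scalars. It assumes "without guidance" means using the text-conditional prediction alone; the function names, the weight of 1.0, and the guidance scale of 100 are illustrative and not taken from SDS.py. The same expressions apply elementwise to tensors.

```python
def cfg_noise(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the text-conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

def sds_gradient(eps_pred, eps_true, weight):
    """SDS gradient with respect to the latent: w(t) * (eps_pred - eps),
    with the diffusion model's Jacobian omitted as in standard SDS."""
    return weight * (eps_pred - eps_true)

# Toy values: injected noise, unconditional and conditional predictions.
eps, eps_u, eps_c = 0.10, 0.30, 0.50

grad_no_cfg = sds_gradient(eps_c, eps, weight=1.0)
grad_cfg = sds_gradient(cfg_noise(eps_u, eps_c, 100.0), eps, weight=1.0)
```

A guidance scale of 1 reduces `cfg_noise` to the conditional prediction, so larger scales amplify the text-dependent direction in the gradient.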
We fix the geometry and optimize a per-vertex color field (via PyTorch3D) using SDS under randomly sampled cameras and lights. Two example prompts are shown below (final turntables).
ColorField: (1, Nv, 3) RGB from vertex xyz; textures via TexturesVertex.
We optimize a NeRF from text using SDS (SD-2.1). Below we show a 360° RGB render and the corresponding depth video for three prompts.
| Component | Key Params |
|---|---|
| Optimizer | Adan (lr 5e-3 × base, wd 2e-5, max_grad_norm 5.0), AMP on |
| SDS | Guidance 50–100 (decay late), SD-2.1 |
| Regularizers | Entropy λ=1e-2, Orientation λ=1e-3 (when available) |
| Rendering | Res 256→384→512; random bg / shading after latent warm-up |
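
The alpha-entropy regularizer in the table can be sketched as the mean binary entropy of per-ray opacities, which pushes each ray toward fully transparent or fully opaque. This is a minimal pure-Python illustration under assumptions: the function name `alpha_entropy`, the clamping epsilon, and the natural-log base are choices made for this example (a different base only rescales the term). Only λ = 1e-2 comes from the table.

```python
import math

def alpha_entropy(alphas, eps=1e-6):
    """Mean binary entropy of accumulated opacities in [0, 1].

    The value is near 0 when every alpha is close to 0 or 1 and reaches
    its maximum, ln(2), when every alpha equals 0.5.
    """
    total = 0.0
    for a in alphas:
        a = min(max(a, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(a * math.log(a) + (1.0 - a) * math.log(1.0 - a))
    return total / len(alphas)

LAMBDA_ENTROPY = 1e-2  # weight from the table above

# Hypothetical per-ray opacities from one rendered batch.
reg_term = LAMBDA_ENTROPY * alpha_entropy([0.02, 0.5, 0.97])
```

The weighted term is added to the SDS objective, so semi-transparent rays are penalized more than rays that have committed to foreground or background.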