16-825: Learning for 3D Vision HW1
Daksh Adhar · Sep 2025

16-825 Assignment 1: Rendering Basics with PyTorch3D (Total: 100 Points + 10 Bonus)

1. Practicing with Cameras (15 Points)

1.1 360-degree Renders (5 points)

A 360-degree GIF showing a continuous sweep of views around the provided cow mesh.
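A 360-degree sweep amounts to placing the camera at evenly spaced azimuth angles on a circle around the object and rendering one frame per angle. The sketch below only computes the camera positions in plain Python; the function name `orbit_positions` and the default distance and elevation are assumptions for illustration, not values from the assignment. In PyTorch3D the corresponding extrinsics would typically come from `look_at_view_transform(dist, elev, azim)`.

```python
import math

def orbit_positions(num_views, dist=3.0, elev_deg=0.0):
    """Camera positions on a circle around the origin, one per frame.

    dist and elev_deg are illustrative defaults (assumptions).
    """
    elev = math.radians(elev_deg)
    positions = []
    for i in range(num_views):
        # Evenly spaced azimuths in [0, 360); the endpoint is excluded so
        # the looping GIF does not repeat its first frame.
        azim = 2.0 * math.pi * i / num_views
        x = dist * math.cos(elev) * math.sin(azim)
        y = dist * math.sin(elev)
        z = dist * math.cos(elev) * math.cos(azim)
        positions.append((x, y, z))
    return positions

positions = orbit_positions(num_views=36)
```

Every position lies at the same distance from the origin, so the object keeps a constant apparent size throughout the sweep.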

Cow being rendered from different angles

1.2 Re-creating the Dolly Zoom (10 points)

The Dolly Zoom is a famous camera effect, first used in the Alfred Hitchcock film Vertigo. The core idea is to change the focal length of the camera while moving the camera so that the subject stays the same size in the frame, producing a rather unsettling effect.
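The relationship between field of view and camera distance can be sketched as follows. A camera with field of view `fov` at distance `d` sees a width of `2 * d * tan(fov / 2)` at the subject's depth, so holding that width fixed gives `d = width / (2 * tan(fov / 2))`. The function name `dolly_distance`, the frame width of 5 and the FOV range are assumptions chosen for illustration.

```python
import math

def dolly_distance(fov_deg, frame_width=5.0):
    """Distance at which a camera with the given FOV sees frame_width units.

    frame_width is an assumed value; pick it to frame the subject.
    """
    half_fov = math.radians(fov_deg) / 2.0
    return frame_width / (2.0 * math.tan(half_fov))

# Sweep the field of view from wide (120 degrees) to narrow (5 degrees).
fovs = [120.0 - 5.0 * i for i in range(24)]
distances = [dolly_distance(f) for f in fovs]
```

As the field of view narrows, the camera has to move away from the subject, which is what makes the background appear to compress while the subject stays fixed.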

gif with dolly zoom effect

2. Practicing with Meshes (10 Points)

2.1 Constructing a Tetrahedron (5 points)

In this part, you will practice working with the geometry of 3D meshes. Construct a tetrahedron mesh and then render it from multiple viewpoints.

A 360-degree gif animation of a tetrahedron

A tetrahedron has four vertices, six edges and four triangular faces, so it maps directly onto a triangle mesh: its vertices become the mesh vertices, and each of its faces becomes one triangular face of the mesh. It can be defined as:

    vertices = torch.tensor([
        [0, 0, 0],
        [2, 0, 0],
        [1, 0, 2],
        [1, 2, 1]], dtype=torch.float32)
    faces = torch.tensor([
        [0, 1, 2],
        [2, 3, 0],
        [2, 1, 3],
        [1, 0, 3]], dtype=torch.int64)

2.2 Constructing a Cube (5 points)

Constructing a cube mesh is less direct than constructing a tetrahedron, because the square faces of a cube cannot be used as mesh faces as they are. A cube has 8 vertices and 6 square faces, and each square has to be split into two triangles, giving 12 triangular faces in total.

A 360-degree gif animation of a cube

This is defined as

    vertices = torch.tensor([
        [0, 0, 0],
        [1, 0, 0],
        [1, 0, 1],
        [0, 0, 1],
        [0, 1, 0],
        [1, 1, 0],
        [1, 1, 1],
        [0, 1, 1]], dtype=torch.float32)
    faces = torch.tensor([
        [0, 4, 5],
        [0, 1, 5],
        [2, 1, 5],
        [2, 6, 5],
        [3, 2, 6],
        [3, 7, 6],
        [3, 0, 4],
        [3, 7, 4],
        [7, 4, 5],
        [7, 6, 5],
        [3, 0, 1],
        [3, 2, 1]], dtype=torch.int64)
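One way to sanity-check a hand-written face list like the one above is to confirm that the mesh is closed: in a closed triangle mesh, every edge is shared by exactly two faces. The sketch below is a plain-Python check, not part of the assignment code; the helper name `edge_counts` is hypothetical.

```python
from collections import Counter

def edge_counts(faces):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so direction is ignored.
            counts[(min(u, v), max(u, v))] += 1
    return counts

# Same face indices as the cube definition above.
cube_faces = [
    [0, 4, 5], [0, 1, 5], [2, 1, 5], [2, 6, 5],
    [3, 2, 6], [3, 7, 6], [3, 0, 4], [3, 7, 4],
    [7, 4, 5], [7, 6, 5], [3, 0, 1], [3, 2, 1],
]

counts = edge_counts(cube_faces)
is_closed = all(n == 2 for n in counts.values())
```

For the cube this finds 18 distinct edges (the 12 cube edges plus one diagonal per square face), each used by two triangles. The check ignores winding order, so it says nothing about which way the face normals point.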

3. Re-texturing a mesh (10 points)

The front of the cow corresponds to the vertex with the smallest z-coordinate z_min, and the back of the cow corresponds to the vertex with the largest z-coordinate z_max. Then, we will assign the color of each vertex using linear interpolation based on the z-value of the vertex:

    alpha = (z - z_min) / (z_max - z_min)
    color = alpha * color2 + (1 - alpha) * color1

The final output looks something like this:

Retextured cow render

In this case, color1 = [1, 0, 0] i.e. RED and color2 = [0, 1, 0] i.e. GREEN.
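The interpolation can be sketched on a handful of z-values in plain Python. The function name `interpolate_colors` and the three sample z-values are assumptions for illustration; the actual render applies the same formula to every vertex of the cow mesh.

```python
def interpolate_colors(z_values, color1, color2):
    """Blend from color1 at z_min to color2 at z_max, one color per vertex.

    Assumes z_max > z_min, otherwise alpha would divide by zero.
    """
    z_min, z_max = min(z_values), max(z_values)
    colors = []
    for z in z_values:
        alpha = (z - z_min) / (z_max - z_min)
        colors.append([alpha * c2 + (1 - alpha) * c1
                       for c1, c2 in zip(color1, color2)])
    return colors

# Three illustrative vertices: front, middle and back of the mesh.
colors = interpolate_colors([-1.0, 0.0, 1.0], [1, 0, 0], [0, 1, 0])
# colors == [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
```

The vertex at z_min gets pure color1, the vertex at z_max gets pure color2, and everything in between is a proportional mix.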

4. Camera Transformations (10 points)

When working with 3D, finding a reasonable camera pose is often the first step to producing a useful visualization, and an important first step toward debugging.

Running python -m starter.camera_transforms produces the following image using the camera extrinsics rotation R_0 and translation T_0:

Cow render

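Since the relative transform is pre-multiplied onto the original extrinsics, the new camera coordinates of a point are R_relative @ x_cam + T_relative, where x_cam = R_0 @ x_world + T_0 are its coordinates in the original camera frame (treating points as column vectors). R_relative therefore rotates the scene about the original camera's axes; a rotation about the z-axis, for example, rolls the view around the optical axis. T_relative is an extra translation applied in the rotated camera frame. The resulting T accounts for both the new rotation and any extra translation; it is the translation component of the extrinsics rather than the world-space position of the camera centre.

Each of the following images is produced by a choice of (R_relative, T_relative), with the new camera extrinsics given by R = R_relative @ R_0 and T = R_relative @ T_0 + T_relative: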
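The composition rule can be tried out numerically with small 3x3 matrices. This is a plain-Python sketch using column-vector conventions; the helper names (`matmul`, `matvec`, `rot_z`, `compose_extrinsics`) and the values used for R_0, T_0 and the relative transforms are assumptions for illustration, not the values that produced the images.

```python
import math

def matmul(A, B):
    """3x3 matrix product A @ B on nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    """3x3 matrix times a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(deg):
    """Rotation by deg degrees about the z-axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def compose_extrinsics(R_rel, T_rel, R_0, T_0):
    """R = R_rel @ R_0 and T = R_rel @ T_0 + T_rel."""
    R = matmul(R_rel, R_0)
    T = [a + b for a, b in zip(matvec(R_rel, T_0), T_rel)]
    return R, T

# Assumed starting pose: identity rotation, camera 3 units from the origin.
R_0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T_0 = [0.0, 0.0, 3.0]

# Roll the view by 90 degrees about the optical (z) axis, no extra translation.
R_roll, T_roll = compose_extrinsics(rot_z(90.0), [0.0, 0.0, 0.0], R_0, T_0)

# Keep the rotation and push the scene 2 units further along z.
R_far, T_far = compose_extrinsics(R_0, [0.0, 0.0, 2.0], R_0, T_0)
```

A z-axis rotation leaves T unchanged when T_0 lies along z, which is why a pure roll keeps the object centred, while a z component in T_relative only changes how far away the object appears.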

Cow render
Cow render
Cow render
Cow render

5. Rendering Generic 3D Representations (45 Points)

5.1 Rendering Point Clouds from RGB-D Images (10 points)

In this part, we will practice rendering point clouds constructed from 2 RGB-D images from the Common Objects in 3D Dataset.

View 1
View 2
Union of both

5.2 Parametric Functions (10 + 5 points)

A parametric function generates a 3D point for each point in the source domain. For example, given an elevation theta and azimuth phi, we can parameterize the surface of a unit sphere as (sin(theta) * cos(phi), cos(theta), sin(theta) * sin(phi)).

By sampling values of theta and phi, we can generate a sphere point cloud.
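A minimal sketch of this sampling in plain Python is shown below. The function name `sample_sphere` and the grid resolution are assumptions; a real render would typically build the same grid with tensor operations and pass the points to a point cloud renderer.

```python
import math

def sample_sphere(n_theta=20, n_phi=40):
    """Sample the unit sphere on a regular (theta, phi) grid."""
    points = []
    for i in range(n_theta):
        theta = math.pi * i / (n_theta - 1)      # elevation in [0, pi]
        for j in range(n_phi):
            phi = 2.0 * math.pi * j / n_phi      # azimuth in [0, 2*pi)
            points.append((math.sin(theta) * math.cos(phi),
                           math.cos(theta),
                           math.sin(theta) * math.sin(phi)))
    return points

sphere_points = sample_sphere()
```

Every sampled point has unit length, which is a quick way to confirm the parameterization is right. A regular grid in (theta, phi) places points more densely near the poles than at the equator.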

Now we will render a torus point cloud by sampling its parametric function.

    x = (R + r*torch.sin(Theta)) * torch.cos(Phi)
    y = (R + r*torch.sin(Theta)) * torch.sin(Phi)
    z = r*torch.cos(Theta)

Here Theta and Phi each range from 0 to 2*pi, R is the distance from the centre of the torus to the centre of the tube, and r is the radius of the tube.
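The torus parameterization can be checked against the implicit torus equation, (sqrt(x^2 + y^2) - R)^2 + z^2 - r^2 = 0, which every sampled point should satisfy whenever R > r. The sketch below is plain Python; the function names and the values R = 1.0, r = 0.4 are assumptions for illustration.

```python
import math

def sample_torus(R=1.0, r=0.4, n=50):
    """Sample a torus with major radius R and minor radius r (assumed values)."""
    points = []
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        for j in range(n):
            phi = 2.0 * math.pi * j / n
            ring = R + r * math.sin(theta)   # distance from the z-axis
            points.append((ring * math.cos(phi),
                           ring * math.sin(phi),
                           r * math.cos(theta)))
    return points

def torus_residual(point, R=1.0, r=0.4):
    """Value of the implicit torus function; zero on the surface."""
    x, y, z = point
    return (math.sqrt(x * x + y * y) - R) ** 2 + z * z - r * r

torus_points = sample_torus()
max_residual = max(abs(torus_residual(p)) for p in torus_points)
```

The residual stays at floating-point noise level, which ties the parametric form to the implicit form of the same surface.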

Torus from parametric definition

An octahedron can be parameterized in a similar way, by scaling each direction on the unit sphere so that the resulting point satisfies |x| + |y| + |z| = 1:

    r_oct = 1.0 / (torch.abs(torch.sin(Theta) * torch.cos(Phi)) +
                   torch.abs(torch.sin(Theta) * torch.sin(Phi)) +
                   torch.abs(torch.cos(Theta)))

    x = r_oct * torch.sin(Theta) * torch.cos(Phi)
    y = r_oct * torch.sin(Theta) * torch.sin(Phi)
    z = r_oct * torch.cos(Theta)

Here Theta ranges from 0 to pi and Phi ranges from 0 to 2*pi.
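The scaling factor works because dividing a direction by the sum of the absolute values of its components places the point on the surface |x| + |y| + |z| = 1. A plain-Python sketch of that check follows; the function name `sample_octahedron` and the grid resolution are assumptions.

```python
import math

def sample_octahedron(n_theta=30, n_phi=60):
    """Project unit-sphere directions onto the surface |x| + |y| + |z| = 1."""
    points = []
    for i in range(n_theta):
        theta = math.pi * i / (n_theta - 1)
        for j in range(n_phi):
            phi = 2.0 * math.pi * j / n_phi
            dx = math.sin(theta) * math.cos(phi)
            dy = math.sin(theta) * math.sin(phi)
            dz = math.cos(theta)
            # The denominator is at least 1 for a unit direction, so this
            # division is always safe.
            scale = 1.0 / (abs(dx) + abs(dy) + abs(dz))
            points.append((scale * dx, scale * dy, scale * dz))
    return points

octa_points = sample_octahedron()
```

Each point ends up with an L1 norm of 1, so the samples lie on the eight triangular faces of the octahedron.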

Octahedron from parametric definition

5.3 Implicit Surfaces (15 + 5 points)

In this part, we will explore representing geometry as a function in the form of an implicit function. In general, given a function F(x, y, z), we can define the surface to be the zero level-set of F i.e. (x,y,z) such that F(x, y, z) = 0. To visualize such a representation, we can discretize the 3D space and evaluate the implicit function, storing the values in a voxel grid. Finally, to recover the mesh, we can run the marching cubes algorithm to extract the 0-level set.
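The discretization step can be illustrated without any mesh library. The sketch below evaluates an implicit function on a coarse grid and flags the cells whose corner values change sign; these are the cells that marching cubes would triangulate. It uses a unit sphere as the implicit function, and the function names, grid bounds and resolution are assumptions for illustration.

```python
import itertools

def sphere(x, y, z, radius=1.0):
    """Implicit function: negative inside, zero on the surface, positive outside."""
    return x * x + y * y + z * z - radius * radius

def sample_grid(f, lo, hi, n):
    """Evaluate f on an n x n x n grid spanning [lo, hi] on each axis."""
    step = (hi - lo) / (n - 1)
    coords = [lo + i * step for i in range(n)]
    return [[[f(x, y, z) for z in coords] for y in coords] for x in coords]

def surface_cells(values):
    """Indices of grid cells whose 8 corners straddle the zero level-set."""
    n = len(values)
    cells = []
    for i, j, k in itertools.product(range(n - 1), repeat=3):
        corners = [values[i + di][j + dj][k + dk]
                   for di, dj, dk in itertools.product((0, 1), repeat=3)]
        if min(corners) < 0.0 < max(corners):
            cells.append((i, j, k))
    return cells

values = sample_grid(sphere, -1.5, 1.5, 16)
cells = surface_cells(values)
```

Only a thin shell of cells is flagged: cells deep inside or far outside the sphere have corners of a single sign and contribute no triangles.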

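For a torus with major radius R and minor radius r:

    X, Y, Z = torch.meshgrid([torch.linspace(min_value, max_value, voxel_size)] * 3, indexing="ij")
    voxels = (torch.sqrt(X**2 + Y**2) - R)**2 + Z**2 - r**2

Torus as an implicit surface

For an octahedron, the implicit function is |x| + |y| + |z| - 1:

    X, Y, Z = torch.meshgrid(torch.linspace(-2, 2, 100),
                             torch.linspace(-2, 2, 100),
                             torch.linspace(-2, 2, 100), indexing="ij")
    # Signed values (not a boolean mask) so that the 0-level set is well defined.
    voxels = torch.abs(X) + torch.abs(Y) + torch.abs(Z) - 1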
          
Octahedron as an implicit surface

Comparison between rendering as a mesh vs. as a point cloud


6. Do Something Fun (10 points)

Now that you have learned to work with various 3D representations and render them, it is time to try something fun. Create your own 3D structures, or render something in an interesting way, or creatively texture, or anything else that appeals to you - the (3D) world is your oyster! If you wish to download additional meshes, Free3D is a good place to start.

6.1 Disco Cow Animation

For this section, I created a disco cow animation that combines multiple rendering techniques. The cow and the floor are illuminated by moving colored lights, producing a disco effect.
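One possible way to drive moving colored lights per frame is sketched below. This is not the implementation used for the animation; the function name `disco_lights`, the number of lights, the orbit radius and the height are all assumptions. Each light orbits the scene and cycles through hues as the animation progresses.

```python
import colorsys
import math

def disco_lights(frame, num_frames, num_lights=3, radius=3.0, height=2.0):
    """Positions and RGB colors of orbiting lights for one frame (hypothetical)."""
    lights = []
    for k in range(num_lights):
        # Offset each light by a fraction of a full turn so they stay spread out.
        phase = frame / num_frames + k / num_lights
        angle = 2.0 * math.pi * phase
        position = (radius * math.cos(angle), height, radius * math.sin(angle))
        # Full saturation and brightness; only the hue changes over time.
        color = colorsys.hsv_to_rgb(phase % 1.0, 1.0, 1.0)
        lights.append((position, color))
    return lights

first_frame = disco_lights(frame=0, num_frames=60)
```

The positions and colors returned for each frame could then be passed to whichever lighting model the renderer supports, with one render per frame.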

Disco cow animation