Course: 16-825 Learning For 3D Vision
Author: Karthik Pullalarevu
Andrew ID: kpullala

In this assignment, you will learn the basics of rendering with PyTorch3D, explore 3D representations, and practice constructing simple geometry.
To create a 360-degree render of an object, I kept the camera's elevation fixed at 10.0 degrees and its distance from the object at 2.7, then generated 36 distinct views by stepping the azimuth in 10-degree increments.
# Sweep the azimuth while elevation and distance stay fixed
all_images = []
for angle in range(0, 360, 10):
    R, T = pytorch3d.renderer.look_at_view_transform(dist=2.7, elev=10.0, azim=angle)
    rendered_image = render_cow(
        cow_path='/home/karthik/Depth-Anything-V2/lf3/assignment1/data/cow.obj',
        device='cuda', R=R, T=T,
    )
    all_images.append(rendered_image)
# Scale the float renders by 255 and store them as uint8 frames
my_images = [(img * 255).astype(np.uint8) for img in all_images]
duration = 1000 // 15  # Milliseconds per frame at 15 FPS
imageio.mimsave('cow_rotation.gif', my_images, duration=duration, loop=0)
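For intuition about what the azimuth sweep does, the sketch below computes where the camera sits for a given distance, elevation, and azimuth. It assumes the object is at the origin and follows the spherical convention documented for PyTorch3D (+Y up, azimuth measured from the +Z axis). `camera_position` is an illustrative helper written for this note, not a PyTorch3D function.

```python
import math

def camera_position(dist, elev_deg, azim_deg):
    """Camera center in world coordinates for an object at the origin (assumed convention)."""
    elev = math.radians(elev_deg)
    azim = math.radians(azim_deg)
    x = dist * math.cos(elev) * math.sin(azim)
    y = dist * math.sin(elev)
    z = dist * math.cos(elev) * math.cos(azim)
    return (x, y, z)

# 36 camera centers, one per 10-degree azimuth step
positions = [camera_position(2.7, 10.0, azim) for azim in range(0, 360, 10)]
```

Every position has the same height and the same distance from the origin, so the cameras lie on a horizontal circle around the object.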
The dolly zoom is a classic cinematic effect that changes the camera's field of view (FoV) while simultaneously moving the camera to keep the subject the same size in the frame. To recreate this, I increased the FoV over time while moving the camera closer to the object according to the formula $\text{distance} = \frac{1.8 \times 10^4}{\text{fov}^2}$, with the FoV in degrees.
fovs = torch.linspace(5, 120, num_frames)
for fov in tqdm(fovs):
    distance = (1.8*10000)/(fov ** 2)
    T = [[0, 0, distance]]
    cameras = pytorch3d.renderer.FoVPerspectiveCameras(fov=fov, T=T, device=device) 
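For comparison, under an ideal pinhole model a subject of width $w$ spans the full field of view at distance $w / (2 \tan(\text{fov}/2))$. The sketch below evaluates that relation. The helper name `dolly_distance` and the subject width of 2.0 are assumptions chosen for illustration, not values used for the renders in this report.

```python
import math

def dolly_distance(fov_deg, subject_width):
    """Distance at which a subject of the given width exactly spans the field of view."""
    half_fov = math.radians(fov_deg) / 2.0
    return subject_width / (2.0 * math.tan(half_fov))

# The required distance shrinks as the field of view widens
for fov in (5, 30, 60, 90, 120):
    print(f"fov={fov:3d} deg -> distance={dolly_distance(fov, 2.0):.3f}")
```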
A tetrahedron is a polyhedron with 4 vertices and 4 triangular faces. I constructed it using the following vertices and face indices. The camera is set to look at the center of the mesh.
vertices = torch.tensor([[1,2,1.5], [2,0,2], [-2, 0, 2], [0,0,0]], dtype = torch.float32) * 0.25
faces = torch.tensor([[0,1,2], [0,2,3], [0,1,3], [1,2,3]], dtype = torch.int64)
# Set the camera to look at the center of the tetrahedron
R, T = pytorch3d.renderer.look_at_view_transform(
    dist=2.7,
    elev=10.0,
    azim=angle,
    at=vertices.mean(0, keepdims=True)
) 
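A quick way to sanity-check a hand-written face list is to confirm that every edge is shared by exactly two triangles, which is what a closed surface requires. The sketch below does this for the tetrahedron's face indices using only the standard library. `edge_counts` is a hypothetical helper, and the check does not look at face orientation.

```python
from collections import Counter

def edge_counts(faces):
    """Count how many triangles share each undirected edge."""
    counts = Counter()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[tuple(sorted((u, v)))] += 1
    return counts

tetra_faces = [[0, 1, 2], [0, 2, 3], [0, 1, 3], [1, 2, 3]]
counts = edge_counts(tetra_faces)
# A closed triangle mesh has every edge shared by exactly two faces
is_closed = all(n == 2 for n in counts.values())
```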
A cube mesh can be constructed from 8 vertices and 12 triangular faces (where each of the 6 square sides is made of two triangles).
vertices = torch.tensor([[1,1,1], [3,1,1], [3,3,1], [1,3,1],
                         [1,1,3], [3,1,3], [3,3,3], [1,3,3]], dtype = torch.float32) * 0.25
faces = torch.tensor([[0,1,2], [0,2,3], [4,5,6], [4,6,7],
                      [0,1,5], [0,5,4], [2,3,7], [2,7,6],
                      [1,2,6], [1,6,5], [0,3,7], [0,7,4]], dtype = torch.int64)
# Set the camera to look at the center of the cube
R, T = pytorch3d.renderer.look_at_view_transform(
    dist=2.7,
    elev=10.0,
    azim=angle,
    at=vertices.mean(0, keepdims=True)
) 
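The triangle list can also be derived mechanically by listing each square side once and splitting it along a diagonal. The sketch below is one way to do that. `quads_to_triangles` and `cube_quads` are illustrative names, and the quad ordering was chosen so the output matches the face list above.

```python
def quads_to_triangles(quads):
    """Split each quad [a, b, c, d] into triangles [a, b, c] and [a, c, d]."""
    triangles = []
    for a, b, c, d in quads:
        triangles.append([a, b, c])
        triangles.append([a, c, d])
    return triangles

# One quad per side of the cube, indexing the 8 vertices defined above
cube_quads = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [0, 1, 5, 4],
    [2, 3, 7, 6],
    [1, 2, 6, 5],
    [0, 3, 7, 4],
]
cube_faces = quads_to_triangles(cube_quads)
```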
I re-textured the cow mesh by applying a color gradient based on the z-coordinate of each vertex. I assigned green (0, 1, 0) to vertices with the minimum z-value and blue (0, 0, 1) to vertices with the maximum z-value, with colors smoothly interpolated in between.
# Get the z-coordinates of the vertices
z = vertices[0, :, 2]
# Normalize the z-coordinates to create an alpha value for interpolation
alpha = (z - z.min()) / (z.max() - z.min())
alpha = alpha[:, None]
# Define the two colors for the gradient
color1 = torch.tensor([0., 1., 0.], device=z.device) # Green
color2 = torch.tensor([0., 0., 1.], device=z.device) # Blue
# Interpolate between the two colors
color  = alpha * color2 + (1 - alpha) * color1 
The same technique can be used to visualize the X and Y coordinates as well.
| X-Axis Visualization | Y-Axis Visualization | 
|---|---|
|   |   | 
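To make the per-axis variant concrete, here is a minimal standard-library sketch of the same linear interpolation with the axis as a parameter. `axis_gradient` and the three sample points are assumptions for illustration. The guard for a zero coordinate range is an addition that the tensor code above does not include.

```python
def axis_gradient(points, axis, color_lo, color_hi):
    """Linearly blend color_lo -> color_hi along one coordinate axis (0=x, 1=y, 2=z)."""
    values = [p[axis] for p in points]
    lo, hi = min(values), max(values)
    span = hi - lo
    colors = []
    for v in values:
        # Fall back to color_lo when all points share the same coordinate
        alpha = (v - lo) / span if span > 0 else 0.0
        colors.append(tuple((1 - alpha) * a + alpha * b for a, b in zip(color_lo, color_hi)))
    return colors

points = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (2.0, 4.0, 1.0)]
green, blue = (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
x_colors = axis_gradient(points, 0, green, blue)
```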
This task involves applying transformations to the camera to change the object's appearance in the rendered image. PyTorch3D uses a coordinate system where +X is left, +Y is up, and +Z is forward (pointing from the camera into the scene).
 
To rotate the cow 90 degrees about the camera's Z-axis, I applied the following relative rotation and left the translation unchanged.
R_relative = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]
T_relative = [0, 0, 0]
To make the cow appear farther away, I kept the rotation as the identity and added 3 units of translation along the camera's Z-axis, which is equivalent to moving the camera backward.
R_relative = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T_relative = [0, 0, 3] 
To move the cow toward the bottom-right of the frame, I kept the rotation as the identity and applied a relative translation of 0.4 along X and -0.6 along Y (the negative Y offset moves the cow down, since +Y is up).
R_relative = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T_relative = [0.4, -0.6, 0] 
To view the cow from the side, I rotated the camera 90 degrees about its Y-axis.
R_relative = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]
T_relative = [0, 0, 0]
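When writing relative rotations by hand, it is easy to produce a matrix that mirrors the scene instead of rotating it. A proper rotation is orthonormal and has determinant +1, while a reflection has determinant -1. The sketch below checks both conditions with plain Python lists. The helper names and the two example matrices are illustrative assumptions.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose3(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_proper_rotation(R, tol=1e-9):
    """True if R^T R is the identity and det(R) is +1."""
    RtR = matmul3(transpose3(R), R)
    orthonormal = all(
        abs(RtR[i][j] - (1.0 if i == j else 0.0)) < tol
        for i in range(3) for j in range(3)
    )
    return orthonormal and abs(det3(R) - 1.0) < tol

rot_z_90 = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]   # 90-degree rotation about Z
mirror = [[0, -1, 0], [-1, 0, 0], [0, 0, 1]]    # orthonormal, but det = -1 (a reflection)
```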
I constructed 3D point clouds by "unprojecting" pixels from two RGB-D images into 3D space. Each pixel's depth value, together with the camera parameters, determines its (X, Y, Z) coordinate, and its RGB value becomes the point's color.
points, rgb = unproject_depth_image(
    torch.from_numpy(data['rgb1']),
    torch.from_numpy(data['mask1']),
    torch.from_numpy(data['depth1']),
    data['cameras1']
)
The results below show the point cloud from the first view, the second view, and a combined view.
| View 1 | View 2 | Combined | 
|---|---|---|
|   |   |   | 
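The geometry behind unprojection can be sketched with a generic pinhole model. This is not the `unproject_depth_image` helper used above: it assumes pixel-space intrinsics (`fx`, `fy`, `cx`, `cy`, all hypothetical here) and image-style axes with x to the right and y down, which differ from the conventions PyTorch3D cameras use.

```python
def unproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with the given depth into camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def unproject_depth_map(depth_map, mask, fx, fy, cx, cy):
    """Unproject every masked pixel of a depth map stored as nested lists."""
    points = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            if mask[v][u]:
                points.append(unproject_pixel(u, v, d, fx, fy, cx, cy))
    return points

# Tiny 2x2 example with made-up intrinsics
depth_map = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1, 0], [0, 1]]
points = unproject_depth_map(depth_map, mask, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Pixels outside the mask are skipped, so only foreground pixels contribute points.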
I generated a point cloud of a torus using its parametric equations. The major radius ($R_{tor}$) is the distance from the center of the tube to the center of the torus, and the minor radius ($r_{tor}$) is the radius of the tube.
# Define angles for sampling
phi = torch.linspace(0, 2 * np.pi, num_samples)
theta = torch.linspace(0, 2 * np.pi, num_samples)
Phi, Theta = torch.meshgrid(phi, theta, indexing="ij")
# Torus parametric equations
R_tor = 1.0
r_tor = 0.5
x = torch.cos(Phi) * (R_tor + r_tor * torch.cos(Theta))
y = torch.sin(Phi) * (R_tor + r_tor * torch.cos(Theta))
z = r_tor * torch.sin(Theta)
points = torch.stack((x.flatten(), y.flatten(), z.flatten()), dim=1)
color = (points - points.min()) / (points.max() - points.min())
(The density of the point cloud increases with the number of samples.)
| 50 Samples | 100 Samples | 250 Samples | 500 Samples |
|---|---|---|---|
|   |   |   |   |
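As a consistency check on the parametric form, every sampled point should satisfy the torus's implicit equation $(\sqrt{x^2 + y^2} - R_{tor})^2 + z^2 - r_{tor}^2 = 0$. The sketch below verifies this numerically with the standard library. The function names are illustrative, and the angles are sampled without repeating the endpoint, unlike `torch.linspace(0, 2 * np.pi, n)`, which includes both 0 and $2\pi$.

```python
import math

def torus_point(phi, theta, R_tor=1.0, r_tor=0.5):
    """Point on the torus for angle phi around the axis and theta around the tube."""
    x = math.cos(phi) * (R_tor + r_tor * math.cos(theta))
    y = math.sin(phi) * (R_tor + r_tor * math.cos(theta))
    z = r_tor * math.sin(theta)
    return (x, y, z)

def torus_implicit(x, y, z, R_tor=1.0, r_tor=0.5):
    """Zero on the torus surface, negative inside the tube, positive outside."""
    return (math.sqrt(x * x + y * y) - R_tor) ** 2 + z * z - r_tor * r_tor

n = 20
angles = [2 * math.pi * k / n for k in range(n)]
max_residual = max(abs(torus_implicit(*torus_point(p, t))) for p in angles for t in angles)
```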
I created a mesh of a torus from an implicit function using the marching cubes algorithm. This method defines the surface as the set of points where a function equals a specific value (the isovalue).
# Create a grid of points (voxels)
min_value = -1.6
max_value = 1.6
X, Y, Z = torch.meshgrid([torch.linspace(min_value, max_value, voxel_size)] * 3, indexing="ij")
# Implicit function for a torus
R_tor = 1.0
r_tor = 0.5
voxels = ((X * X + Y * Y) ** 0.5 - R_tor) ** 2 + (Z * Z - r_tor * r_tor)
# Extract the mesh using marching cubes
vertices, faces = mcubes.marching_cubes(mcubes.smooth(voxels), isovalue=0)
Tradeoffs between point clouds and meshes:
(The mesh quality improves with a higher voxel grid resolution.)
| Voxel Size: 8 | Voxel Size: 16 | Voxel Size: 32 | Voxel Size: 64 |
|---|---|---|---|
|   |   |   |   |
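Two small calculations help explain the resolution dependence. The implicit function changes sign across the surface, which is what marching cubes looks for between neighboring grid samples, and the spacing between samples shrinks as the resolution grows. The sketch below uses the same radii and grid bounds as above. `grid_spacing` is a hypothetical helper.

```python
def torus_implicit(x, y, z, R_tor=1.0, r_tor=0.5):
    """Same implicit function as the voxel grid, evaluated at a single point."""
    return ((x * x + y * y) ** 0.5 - R_tor) ** 2 + z * z - r_tor * r_tor

def grid_spacing(min_value, max_value, resolution):
    """Distance between neighboring samples of a linspace-style grid."""
    return (max_value - min_value) / (resolution - 1)

inside = torus_implicit(1.0, 0.0, 0.0)    # center of the tube: negative
outside = torus_implicit(0.0, 0.0, 0.0)   # hole in the middle of the torus: positive
spacings = {n: grid_spacing(-1.6, 1.6, n) for n in (8, 16, 32, 64)}
```

With 8 samples per axis the spacing is roughly 0.46, which is close to the tube radius of 0.5, so only a couple of cells span the tube; at 64 samples the spacing is about 0.05.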
For this task, I constructed an airplane point cloud from several parametric shapes: an ellipsoid for the fuselage and flat surfaces for the wings and tail (the main wing is tapered). I colored each component differently and animated the result with a dolly zoom effect.
def render_airplane(image_size=256, num_samples=100, device=None, R=None, T=None):
    """
    Renders a simple airplane using parametric sampling.
    Components: fuselage (ellipsoid), wings (rectangular), tail surfaces.
    """
    
    if device is None:
        device = get_device()
    
    points_list = []
    
    # 1. Fuselage (elongated ellipsoid)
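    u = torch.linspace(0, 2 * np.pi, num_samples // 2)
    # Latitude spans [-pi/2, pi/2] so that sin(V) covers both halves of the ellipsoid
    v = torch.linspace(-np.pi / 2, np.pi / 2, num_samples // 4)
    U, V = torch.meshgrid(u, v, indexing="ij")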
    
    fuselage_length = 4.0
    fuselage_width = 0.8
    fuselage_height = 0.6
    
    x_fus = fuselage_length * torch.cos(V) * torch.cos(U)
    y_fus = fuselage_width * torch.cos(V) * torch.sin(U)
    z_fus = fuselage_height * torch.sin(V)
    
    fuselage_points = torch.stack((x_fus.flatten(), y_fus.flatten(), z_fus.flatten()), dim=1)
    points_list.append(fuselage_points)
    
    # 2. Main wings (rectangular surfaces with taper)
    wing_span = 8.0
    wing_chord_root = 1.5
    wing_chord_tip = 0.8
    wing_position_x = 0.5  # Position along fuselage
    
    # Wing surface parameterization
    wing_u = torch.linspace(-wing_span/2, wing_span/2, num_samples // 3)
    wing_v = torch.linspace(0, 1, num_samples // 8)
    Wing_U, Wing_V = torch.meshgrid(wing_u, wing_v)
    
    # Tapered wing chord
    chord_at_span = wing_chord_root + (wing_chord_tip - wing_chord_root) * (torch.abs(Wing_U) / (wing_span/2))
    
    x_wing = wing_position_x + Wing_V * chord_at_span
    y_wing = Wing_U
    z_wing = torch.zeros_like(Wing_U) + 0.1  # Flat wing at a small constant height
    
    wing_points = torch.stack((x_wing.flatten(), y_wing.flatten(), z_wing.flatten()), dim=1)
    points_list.append(wing_points)
    
    # 3. Horizontal tail
    tail_span = 2.0
    tail_chord = 0.8
    tail_position_x = -3.5
    
    tail_u = torch.linspace(-tail_span/2, tail_span/2, num_samples // 6)
    tail_v = torch.linspace(0, 1, num_samples // 12)
    Tail_U, Tail_V = torch.meshgrid(tail_u, tail_v)
    
    x_tail = tail_position_x + Tail_V * tail_chord
    y_tail = Tail_U
    z_tail = torch.zeros_like(Tail_U) + 0.8  # Elevated tail
    
    tail_points = torch.stack((x_tail.flatten(), y_tail.flatten(), z_tail.flatten()), dim=1)
    points_list.append(tail_points)
    
    # 4. Vertical tail
    vtail_height = 1.5
    vtail_chord = 0.6
    
    vtail_u = torch.linspace(0, vtail_height, num_samples // 8)
    vtail_v = torch.linspace(0, 1, num_samples // 12)
    VTail_U, VTail_V = torch.meshgrid(vtail_u, vtail_v)
    
    x_vtail = tail_position_x + VTail_V * vtail_chord
    y_vtail = torch.zeros_like(VTail_U)
    z_vtail = VTail_U + 0.3
    
    vtail_points = torch.stack((x_vtail.flatten(), y_vtail.flatten(), z_vtail.flatten()), dim=1)
    points_list.append(vtail_points)
    
    # Combine all points
    points = torch.cat(points_list, dim=0)
    
    # Color coding by component
    num_fuselage = fuselage_points.shape[0]
    num_wing = wing_points.shape[0]
    num_htail = tail_points.shape[0]
    num_vtail = vtail_points.shape[0]
    
    # Create color features (RGB for different components)
    colors = torch.zeros(points.shape[0], 3)
    colors[:num_fuselage] = torch.tensor([0.7, 0.7, 0.9])  # Light blue fuselage
    colors[num_fuselage:num_fuselage+num_wing] = torch.tensor([0.9, 0.7, 0.7])  # Light red wings
    colors[num_fuselage+num_wing:num_fuselage+num_wing+num_htail] = torch.tensor([0.7, 0.9, 0.7])  # Light green tail
    colors[num_fuselage+num_wing+num_htail:] = torch.tensor([0.9, 0.9, 0.7])  # Light yellow vertical tail
    
    airplane_point_cloud = pytorch3d.structures.Pointclouds(
        points=[points], features=[colors],
    ).to(device)
    
    # Render a dolly zoom: widen the FoV while moving the camera closer
    from tqdm import tqdm
    from PIL import Image, ImageDraw
    import imageio

    fovs = torch.linspace(5, 120, 10)
    renderer = get_points_renderer(image_size=image_size, device=device, background_color=(0, 0, 70))
    renders = []
    for fov in tqdm(fovs):
        distance = (8.0 * 10000) / (fov ** 2)
        T = [[0, 0, distance]]
        cameras = pytorch3d.renderer.FoVPerspectiveCameras(fov=fov, T=T, device=device)
        rend = renderer(airplane_point_cloud, cameras=cameras)
        rend = rend[0, ..., :3].cpu().numpy()  # (H, W, 3)
        renders.append(rend)

    # Annotate each frame with its FoV and save the animation
    images = []
    for i, r in enumerate(renders):
        image = Image.fromarray((r * 255).astype(np.uint8))
        draw = ImageDraw.Draw(image)
        draw.text((20, 20), f"fov: {fovs[i]:.2f}", fill=(255, 0, 0))
        images.append(np.array(image))
    imageio.mimsave('aeroplane_dolly_zoom.gif', images, duration=100, loop=0)  # 100 ms per frame (10 FPS)
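The per-component coloring relies on knowing where each component's points start and end in the concatenated tensor. A minimal sketch of that bookkeeping, using cumulative sums instead of hand-written offsets, is shown below. `component_slices` is a hypothetical helper, and the counts are the grid sizes that result from `num_samples=100` in the code above.

```python
from itertools import accumulate

def component_slices(counts):
    """Return (start, end) index pairs for components concatenated in order."""
    ends = list(accumulate(counts))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))

# Points per component for num_samples=100: fuselage, wings, horizontal tail, vertical tail
counts = [50 * 25, 33 * 12, 16 * 8, 12 * 8]
slices = component_slices(counts)
```

Each pair can then be used as `colors[start:end] = ...`, which avoids repeating sums such as `num_fuselage + num_wing + num_htail` by hand.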
