Assignment 2 - 3D Graphics

Question 1.1: Fitting a voxel grid

Ground Truth (Source)

Optimized (Target)

Question 1.2: Point cloud

Ground Truth (Source)

Optimized (Target)

Question 1.3: Mesh

Ground Truth (Source)

Optimized (Target)

Question 2.1: Image to voxel grid

Input RGB

Predicted Voxel Grid

Ground Truth Mesh

Input RGB

Predicted Voxel Grid

Ground Truth Mesh

Input RGB

Predicted Voxel Grid

Ground Truth Mesh

Question 2.2: Image to point cloud

Input RGB

Predicted Point Cloud

Ground Truth Mesh

Input RGB

Predicted Point Cloud

Ground Truth Mesh

Input RGB

Predicted Point Cloud

Ground Truth Mesh

Question 2.3: Image to mesh

Input RGB

Predicted Mesh

Ground Truth Mesh

Input RGB

Predicted Mesh

Ground Truth Mesh

Input RGB

Predicted Mesh

Ground Truth Mesh

Question 2.4: Quantitative comparisons

Voxel Grid F1 Score

Point Cloud F1 Score

Mesh F1 Score

The point cloud network achieves the highest F1 score. Intuitively, this is because its F1 score is computed directly on the set of points it predicts, whereas for the voxel and mesh networks, points must first be sampled from the predicted representation before the F1 score can be computed. With no intermediate representation between the prediction and the F1 computation, we would expect the point network to score highest. The mesh network must also satisfy the smoothness constraint, which may act as a good regularizer and could explain why it achieves a marginally higher F1 score than the voxel network.
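To make the comparison above concrete, here is a minimal sketch of how an F1 score between a predicted and a ground truth point set can be computed at a single distance threshold. The assignment's actual evaluation code is not shown in this report, so the function name `f1_score`, the brute-force nearest-neighbor search, and the strict `<` comparison are all assumptions made for illustration.

```python
import math


def f1_score(pred_points, gt_points, threshold):
    """F1 score between two 3D point sets at one distance threshold.

    Hypothetical helper: precision is the fraction of predicted points
    that lie within `threshold` of some ground truth point, and recall
    is the fraction of ground truth points that lie within `threshold`
    of some predicted point.
    """

    def fraction_within(src, dst):
        # Brute-force nearest neighbor search, O(len(src) * len(dst)).
        hits = sum(
            1 for p in src
            if min(math.dist(p, q) for q in dst) < threshold
        )
        return hits / len(src)

    precision = fraction_within(pred_points, gt_points)
    recall = fraction_within(gt_points, pred_points)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For the voxel and mesh networks, `pred_points` would have to come from an extra sampling step on the predicted representation, which is the intermediate step the paragraph above refers to.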

Question 2.5: Analyze effects of hyperparameter variations

Ground Truth

Predicted with w_smooth=0.1

Predicted with w_smooth=0.2

Ground Truth

Predicted with w_smooth=0.1

Predicted with w_smooth=0.2

Evaluation w_smooth=0.1

Evaluation w_smooth=0.2

The mesh reconstructions with w_smooth = 0.1 are qualitatively better than those with w_smooth = 0.2. The meshes produced in the latter case are blobbier, which is particularly noticeable in the second example, where the backrest with w_smooth = 0.2 is appreciably thicker. The F1 scores reinforce the point that over-smoothing the mesh predictions generally leads to poorer reconstructions.
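The role of w_smooth can be illustrated with a small sketch of a weighted training objective. The report does not state which smoothness term or fitting loss was used, so the uniform Laplacian term below, the helper names, and the `w_fit` parameter are assumptions; only `w_smooth` comes from the experiment above.

```python
import math


def uniform_laplacian_loss(vertices, edges):
    """Mean distance from each vertex to the centroid of its neighbors.

    Assumed smoothness term (uniform Laplacian); the loss actually used
    in the experiments may differ.
    """
    neighbors = {i: [] for i in range(len(vertices))}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    total = 0.0
    for i, v in enumerate(vertices):
        if not neighbors[i]:
            continue  # isolated vertices contribute nothing
        n = len(neighbors[i])
        centroid = [sum(vertices[j][k] for j in neighbors[i]) / n
                    for k in range(3)]
        total += math.dist(v, centroid)
    return total / len(vertices)


def total_loss(fit_loss, smooth_loss, w_smooth, w_fit=1.0):
    """Weighted sum of a fitting loss and the smoothness loss."""
    return w_fit * fit_loss + w_smooth * smooth_loss
```

Under this formulation, raising `w_smooth` makes it cheaper for the optimizer to pull vertices toward their neighbors than to match fine detail, which is consistent with the thicker, blobbier shapes seen at w_smooth = 0.2.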

Question 2.6: Interpret your model

Ground Truth

Predicted Mesh

Ordered color

Ground Truth

Predicted Mesh

Ordered color

Ground Truth

Predicted Mesh

Ordered color

I have chosen to visualize the order of the points predicted by the final layer of the mesh network. Specifically, for every predicted mesh, I assign red to the first point (index 0) and blue to the last point, where the indexing follows the outputs of the last layer of the mesh network. The color varies linearly from red to blue as we go from the first index of the output layer to the last.

My naive intuition was that the indexing of the output layer would be reflected in the geometry of the chair, for example with the color of the predicted mesh varying from red to blue from top to bottom, or something similar.

As is seen in the visualizations, the initial indices of the output layer almost form a spider web (or a skeleton) around the overall geometry of the chair (as seen in red) and the latter indices fill it up (as seen in blue).
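The index-based coloring used in these visualizations can be sketched as follows. This is a minimal illustration, assuming RGB colors in the range [0, 1]; the function name `index_colors` is hypothetical and not taken from the assignment code.

```python
def index_colors(num_points):
    """Return one RGB tuple per output index, fading from red to blue.

    Index 0 maps to pure red (1, 0, 0) and the last index to pure blue
    (0, 0, 1), with linear interpolation in between.
    """
    if num_points == 1:
        return [(1.0, 0.0, 0.0)]
    colors = []
    for i in range(num_points):
        t = i / (num_points - 1)  # 0.0 at the first index, 1.0 at the last
        colors.append((1.0 - t, 0.0, t))
    return colors
```

The resulting list can be used as per-vertex colors, in output-layer order, when rendering the predicted mesh.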

Question 3.3: Extended dataset for training

Ground Truth Point Cloud

Predicted Point Cloud

Predicted Point Cloud (Full Dataset)

Ground Truth Point Cloud

Predicted Point Cloud

Predicted Point Cloud (Full Dataset)

Evaluation (Limited Dataset)

Evaluation (Full Dataset)

Based on the F1 score curves, the network trained on the full dataset marginally outperforms the one trained on just the one class that both models were trained on.

The quantitative comparison fails to highlight the difference in performance on particularly hard examples within the class. The visualizations shown were handpicked and are among the harder examples within the "chairs" class. On these examples, the network trained on just the chair class seems to predict better point clouds than the network trained on the full dataset, despite the average F1 score telling a different story.
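One way to surface this kind of disagreement between the average and individual examples is to compare per-example scores alongside the means. The sketch below assumes per-example F1 scores are available as two equal-length lists; the function name and the returned dictionary keys are hypothetical.

```python
from statistics import mean


def compare_per_example(scores_a, scores_b):
    """Compare two models by mean score and by per-example wins.

    `scores_a` and `scores_b` hold one score per test example, in the
    same order. Returns both means and the indices where model A beats
    model B, which may be non-empty even when B has the higher mean.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must cover the same examples")
    a_wins = [i for i, (a, b) in enumerate(zip(scores_a, scores_b)) if a > b]
    return {
        "mean_a": mean(scores_a),
        "mean_b": mean(scores_b),
        "a_wins": a_wins,
    }
```

Inspecting the examples listed under `a_wins` would show where the single-class model is better even though the full-dataset model has the higher average.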