Learning for 3D Vision: HW5

1. Classification Model

Test accuracy: 96.75%
| Example ID | Ground Truth Class | Predicted Class | Rendered Example | Interpretation (only for misclassified examples) |
|---|---|---|---|---|
| 0 | Chair | Chair | | - |
| 1 | Chair | Chair | | - |
| 2 | Chair | Chair | | - |
| 406 | Chair | Lamp | | I think the model confused this example because the chair seems to be folded. This means that its points do not have a high depth range, as would be expected for a chair. |
| 617 | Vase | Vase | | - |
| 618 | Vase | Vase | | - |
| 619 | Vase | Lamp | | I think the model misclassified this example because the vase had flowers in it. Geometrically, the flowers resemble a lamp: a long thin tube ending in a cylindrical shape at the top. |
| 620 | Vase | Vase | | - |
| 719 | Lamp | Lamp | | - |
| 720 | Lamp | Lamp | | - |
| 721 | Lamp | Vase | | I think the model misclassified this example because of the cylindrical shape at the top, which makes the object resemble a vase. Moreover, because the model uses a max pool, it might have ignored the empty space in the bottom section (see the sketch after this table). |
| 722 | Lamp | Lamp | | - |
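
The interpretation for example 721 refers to the max-pool aggregation in the classification network. Below is a minimal sketch of a PointNet-style classifier, assuming a PyTorch implementation; the class name and layer sizes are illustrative rather than the exact homework architecture. Because only the maximum activation per feature channel survives the pooling step, points in sparsely covered or empty regions contribute little to the global feature.

```python
import torch
import torch.nn as nn

class PointNetClassifier(nn.Module):
    """Minimal PointNet-style classifier: shared per-point MLP + global max pool."""

    def __init__(self, num_classes: int = 3):  # e.g. chair / vase / lamp
        super().__init__()
        # Shared MLP applied independently to every point: (B, N, 3) -> (B, N, 1024)
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 1024), nn.ReLU(),
        )
        # Classification head on the pooled global feature
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:  # points: (B, N, 3)
        per_point = self.point_mlp(points)                    # (B, N, 1024)
        # Max pool over the point dimension: only the strongest activation per
        # channel survives, so regions that never "win" the max (e.g. empty
        # space, sparsely covered parts) barely influence the prediction.
        global_feat, _ = per_point.max(dim=1)                 # (B, 1024)
        return self.head(global_feat)                         # (B, num_classes)
```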

2. Segmentation Model

Test accuracy: 85.66%
| Example ID | Ground Truth Segmentation | Predicted Segmentation | Accuracy | Interpretation (only for low-accuracy examples) |
|---|---|---|---|---|
| 0 | | | 92.91% | - |
| 1 | | | 96.23% | - |
| 5 | | | 90.81% | - |
| 6 | | | 94.73% | - |
| 26 | | | 49.61% | I think the model had low accuracy on this example because the ground truth segmentation distinguishes between parts that are geometrically very close. For example, the yellow side of the sofa is separated from the red seat cushion. The model does not make this distinction because these parts have similar 3D positions; their ground truth segmentation is based more on a semantic understanding of the object. |
| 142 | | | 49.04% | Similarly, the ground truth segmentation of this object is also based more on the role each part of the chair plays than on geometric position, which is the only thing the PointNet model has access to (see the sketch after this table). Here, the back of the chair (in cyan) is separated from the armrests (in yellow), even though they are very close in 3D space. Meanwhile, the model mixes together part of the seat and the back of the chair, and part of the armrests and the back of the seat, because these parts are close in 3D space. |
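
The point made for example 142, that the model only sees each point's 3D position plus a pooled global summary of the shape, can be illustrated with a minimal PointNet-style segmentation head. This is a sketch assuming a PyTorch implementation; the layer sizes, names, and number of part labels are illustrative, not the homework code.

```python
import torch
import torch.nn as nn

class PointNetSegmenter(nn.Module):
    """Minimal PointNet-style segmentation head: per-point features + global max-pooled feature."""

    def __init__(self, num_seg_classes: int = 6):  # e.g. 6 chair part labels
        super().__init__()
        self.local_mlp = nn.Sequential(            # per-point features from xyz only
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        self.global_mlp = nn.Sequential(
            nn.Linear(128, 1024), nn.ReLU(),
        )
        self.seg_head = nn.Sequential(             # predicts one label per point
            nn.Linear(128 + 1024, 256), nn.ReLU(),
            nn.Linear(256, num_seg_classes),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:      # points: (B, N, 3)
        local = self.local_mlp(points)                             # (B, N, 128)
        global_feat, _ = self.global_mlp(local).max(dim=1)         # (B, 1024)
        global_feat = global_feat.unsqueeze(1).expand(-1, points.shape[1], -1)
        # Each point is labeled from its own xyz-derived feature plus one shared
        # global descriptor; there is no semantic prior, so parts that occupy
        # similar 3D positions tend to receive the same label.
        return self.seg_head(torch.cat([local, global_feat], dim=-1))  # (B, N, num_seg_classes)
```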

3. Robustness Analysis

Experiment 1: Random rotations

In this experiment, we apply random rotations to the input point clouds at test time and evaluate how this affects the performance of the classification and segmentation models. The rotations were generated by sampling an Euler angle for each axis from a uniform distribution between 0 and 2π.
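
As a reference for this setup, here is a minimal sketch of how such a random rotation could be applied to a test point cloud, assuming clouds stored as (N, 3) PyTorch tensors. The function names and the composition order of the three axis rotations are assumptions for illustration, not the exact evaluation code.

```python
import math
import random
import torch

def random_rotation_matrix() -> torch.Tensor:
    """Rotation composed from Euler angles drawn uniformly from [0, 2*pi) per axis."""
    ax, ay, az = [random.uniform(0.0, 2.0 * math.pi) for _ in range(3)]

    def rx(a):  # rotation about the x axis
        c, s = math.cos(a), math.sin(a)
        return torch.tensor([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

    def ry(a):  # rotation about the y axis
        c, s = math.cos(a), math.sin(a)
        return torch.tensor([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

    def rz(a):  # rotation about the z axis
        c, s = math.cos(a), math.sin(a)
        return torch.tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

    return rz(az) @ ry(ay) @ rx(ax)  # (3, 3)


def rotate_point_cloud(points: torch.Tensor) -> torch.Tensor:
    """Apply one random rotation to a point cloud of shape (N, 3)."""
    R = random_rotation_matrix().to(points.dtype)
    return points @ R.T
```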

Classification

Baseline test accuracy: 96.75%
Test accuracy under random rotations: 33.47%
| Example ID | Ground Truth Class | Predicted Class (without rotation) | Rendered Example (without rotation) | Predicted Class (with rotation) | Rendered Example (with rotation) | Interpretation |
|---|---|---|---|---|---|---|
| 0 | Chair | Chair | | Lamp | | In this example, the model possibly mistook the rotated cube formed by the legs of the chair and the beams between the legs for a spherical shape, similar to that of lamps. In the original example, many of the points in the legs and beams share similar 3D coordinates, which might have made it easier for the model to focus on the vertices of this cube. |
| 617 | Vase | Vase | | Vase | | I think this example was successfully classified despite the rotation because the rotation was mild, and even if the object were confused with a spherical shape, many vases have spherical shapes, so this mistake would not affect the final classification. |
| 719 | Lamp | Lamp | | Vase | | In this example, the model possibly mistook the rotated lamp for a spherical shape, similar to that of some vases. In the original example, many of the points in the long stick that comes out of the bottom share similar 3D coordinates, which might have made it easier for the model to focus on a few of these points and interpret them as a thin cylindrical shape. Meanwhile, in the rotated version, it might have focused on points at the tip of this shape and, together with the points on the wall-mounted piece, interpreted them as a spherical shape. |

Segmentation

Baseline test accuracy: 85.66%
Test accuracy under random rotations: 40.78%
| Example ID | Ground Truth Segmentation | Predicted Segmentation (without rotation) | Accuracy (without rotation) | Predicted Segmentation (with rotation) | Accuracy (with rotation) | Interpretation |
|---|---|---|---|---|---|---|
| 0 | | | 92.91% | | 20.30% | In the rotated example, the points on the back of the chair have very different 3D coordinates, which is why the model misclassified them as belonging to different segments of the chair. It seems like the model segments based on the difference between the height and depth of the points. |
| 1 | | | 96.23% | | 44.28% | In the rotated example, the legs of the chair now have very different 3D coordinates. Just like before, this causes the model to confuse them for different segments. Conversely, the armrests now have 3D coordinates more similar to other parts of the back, so the model no longer assigns them different segment labels. |
| 2 | | | 81.09% | | 38.40% | This example also shows how the model clusters together points that share at least one dimension. In the original object, the legs of the chair had very different x and z values but similar y values, so they are assigned the same label. Meanwhile, in the rotated example, the legs no longer share similar values in any dimension and are assigned several different labels. |

Experiment 2: Number of points

In this experiment, we subsample the input point clouds at test time. We use 100, 1,000, and 5,000 points and evaluate how this affects the performance of the classification and segmentation models. The models were trained using 10,000 points, which is referred to as the baseline in the tables below.
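
Below is a minimal sketch of this subsampling step, assuming point clouds stored as (N, 3) tensors. The function name and the use of uniform random sampling (rather than, say, farthest-point sampling) are assumptions about the setup, not a statement of the homework's exact procedure.

```python
import torch

def subsample_point_cloud(points: torch.Tensor, num_points: int) -> torch.Tensor:
    """Randomly keep `num_points` points from a (N, 3) point cloud at test time."""
    idx = torch.randperm(points.shape[0])[:num_points]
    return points[idx]

# Evaluate the same test cloud at several densities, e.g.:
# for n in (5_000, 1_000, 100):
#     sparse_cloud = subsample_point_cloud(full_cloud, n)
```

For the segmentation model, the per-point ground-truth labels would be indexed with the same `idx` so that predictions and labels stay aligned.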

Classification

| Example ID | Ground Truth Class | Rendered Example (10,000 points) | Predicted Class, Baseline (10,000 points) | Rendered Example (5,000 points) | Predicted Class (5,000 points) | Rendered Example (1,000 points) | Predicted Class (1,000 points) | Rendered Example (100 points) | Predicted Class (100 points) | Interpretation |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Chair | | Chair | | Chair | | Chair | | Chair | The model is very robust to the number of points: even with only 100 points, it correctly classifies the chair. |
| 617 | Vase | | Vase | | Vase | | Vase | | Vase | The model is very robust to the number of points: even with only 100 points, it correctly classifies the vase. |
| 746 | Lamp | | Lamp | | Lamp | | Lamp | | Vase | This is one of the few examples where reducing the number of points had an effect on the predicted class. When using only 100 points, the model confuses the lamp for a vase. This possibly happens because, with so few points, the shape starts to resemble a sphere, similar to that of some vases. |

Segmentation

| Example ID | Ground Truth Segmentation | Predicted Segmentation, Baseline (10,000 points) | Predicted Segmentation (5,000 points) | Predicted Segmentation (1,000 points) | Predicted Segmentation (100 points) | Interpretation |
|---|---|---|---|---|---|---|
| 0 | | | | | | The model is very robust to the number of points. Even with only 100 points, it still identifies the segments correctly. |
| 5 | | | | | | The model is very robust to the number of points. Even with only 100 points, it still identifies the segments correctly. |
| 6 | | | | | | The model is very robust to the number of points. Even with only 100 points, it still identifies the segments correctly. |