16-825 Assignment 5

Duc Doan

Q1. Classification Model

Test accuracy: 0.9790

| Image | Predicted class | Ground truth class |
|---|---|---|
| s0 | Chair | Chair |
| s1 | Vase | Vase |
| s2 | Lamp | Lamp |

Failure cases:

| Image | Predicted class | Ground truth class |
|---|---|---|
| f0 | Lamp | Chair |
| f1 | Lamp | Vase |
| f2 | Vase | Lamp |

Q2. Segmentation Model

Test accuracy: 0.9044

| ID | GT | Pred | Accuracy |
|---|---|---|---|
| 0 | gt0 | pred0 | 0.9601 |
| 1 | gt1 | pred1 | 0.9876 |
| 2 | gt2 | pred2 | 0.9102 |
| 20 | gt20 | pred20 | 0.9816 |
| 200 | gt200 | pred200 | 0.7496 |
| 300 | gt300 | pred300 | 0.9582 |
| 500 | gt500 | pred500 | 0.7669 |

Bad cases: IDs 200 and 500, with accuracies of 0.7496 and 0.7669, respectively.

Q3. Robustness Analysis

Rotation

Procedure: each point cloud is rotated by the same angle about all three axes, with the angle magnitude increasing across trials.
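The rotation can be sketched as follows (a minimal NumPy sketch, not the exact evaluation code; the function name `rotate_xyz` and the x→y→z composition order are my assumptions):

```python
import numpy as np

def rotate_xyz(points, angle):
    """Rotate an (N, 3) point cloud by the same angle (radians)
    around the x, y, and z axes in turn."""
    c, s = np.cos(angle), np.sin(angle)
    rx = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])  # rotation about x
    ry = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])  # rotation about y
    rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])  # rotation about z
    # Compose the three rotations and apply to row-vector points.
    return points @ (rz @ ry @ rx).T
```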

| Angle (rad) | Classification accuracy | Segmentation accuracy |
|---|---|---|
| 0.0 | 0.9790 | 0.9044 |
| 0.1 | 0.9654 | 0.8722 |
| 0.3 | 0.8342 | 0.7320 |
| 0.6 | 0.2455 | 0.4922 |

Classification visualization at 0.6 rad:

| Image | Predicted class | Ground truth class |
|---|---|---|
| s0 | Chair | Chair |
| s1 | Vase | Vase |
| s2 | Lamp | Lamp |
| f0 | Lamp | Chair |
| f1 | Lamp | Vase |
| f2 | Vase | Lamp |

Segmentation visualization at 0.6 rad:

| ID | GT (unrotated) | Pred (unrotated) | Pred (rotated) | Accuracy (unrotated) | Accuracy (rotated) |
|---|---|---|---|---|---|
| 0 | gt0 | pred0 | pr0 | 0.9601 | 0.5481 |
| 1 | gt1 | pred1 | pr1 | 0.9876 | 0.6355 |
| 2 | gt2 | pred2 | pr2 | 0.9102 | 0.3823 |
| 20 | gt20 | pred20 | pr20 | 0.9816 | 0.5734 |
| 200 | gt200 | pred200 | pr200 | 0.7496 | 0.4502 |
| 300 | gt300 | pred300 | pr300 | 0.9582 | 0.5124 |
| 500 | gt500 | pred500 | pr500 | 0.7669 | 0.6411 |

As expected, accuracy drops on both the classification and segmentation tasks as the rotation grows. The model has no built-in rotation invariance (the T-net was removed), while the training data contain only upright poses.

The segmentation visualizations reveal an interesting artifact under rotation: the predicted labels look as if the object had not been rotated at all. This suggests the model learned a direct mapping from upright-pose coordinates to part labels, rather than learning the true semantic parts of the objects.

Number of points

Procedure: I tested five point-cloud sizes, from 10,000 points (the original) down to 100.
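Reducing the point count can be sketched as drawing a random subset without replacement (an illustrative sketch; the actual evaluation code may sample differently, and the function name `subsample` is my own):

```python
import numpy as np

def subsample(points, n, seed=0):
    """Keep a random subset of n points from an (N, 3) cloud,
    drawn without replacement so no point is duplicated."""
    idx = np.random.default_rng(seed).choice(len(points), size=n, replace=False)
    return points[idx]
```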

| Number of points | Classification accuracy | Segmentation accuracy |
|---|---|---|
| 10000 | 0.9790 | 0.9044 |
| 5000 | 0.9769 | 0.9042 |
| 1000 | 0.9748 | 0.8943 |
| 500 | 0.9664 | 0.8731 |
| 100 | 0.8909 | 0.8145 |

Classification visualization at 100 points:

| Image | Predicted class | Ground truth class |
|---|---|---|
| s0 | Chair | Chair |
| s1 | Vase | Vase |
| s2 | Lamp | Lamp |
| f0 | Lamp | Chair |
| f1 | Lamp | Vase |
| f2 | Vase | Lamp |

Segmentation visualization at 100 points:

| ID | GT (10k) | Pred (10k) | Pred (100) | Accuracy (10k) | Accuracy (100) |
|---|---|---|---|---|---|
| 0 | gt0 | pred0 | pr0 | 0.9601 | 0.90 |
| 1 | gt1 | pred1 | pr1 | 0.9876 | 0.99 |
| 2 | gt2 | pred2 | pr2 | 0.9102 | 0.89 |
| 20 | gt20 | pred20 | pr20 | 0.9816 | 0.96 |
| 200 | gt200 | pred200 | pr200 | 0.7496 | 0.64 |
| 300 | gt300 | pred300 | pr300 | 0.9582 | 0.96 |
| 500 | gt500 | pred500 | pr500 | 0.7669 | 0.80 |

The results show that the model is quite robust to the number of points on both tasks, even though accuracy generally drops as points are removed. The deterioration comes from important shape details disappearing when fewer points remain; as long as enough detail survives, the model largely holds its quality. This robustness likely comes from the pooling operation, which aggregates per-point features into a global shape descriptor.
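One way to see why pooling helps: a PointNet-style global feature is a channel-wise max over per-point features, so it ignores point order and is driven by a handful of salient points rather than by every point individually. A toy NumPy sketch (the feature sizes here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 64))  # toy per-point features: (N points, C channels)

# Channel-wise max pooling -> one global feature vector.
global_feat = feats.max(axis=0)

# Permutation invariance: reordering the points changes nothing.
perm = rng.permutation(len(feats))
assert np.allclose(feats[perm].max(axis=0), global_feat)

# Dropping points can only lower each channel's max; when the surviving
# points still include the salient ones, the pooled feature barely
# changes -- one intuition for robustness to the point count.
subset = feats[rng.choice(len(feats), size=500, replace=False)]
print(np.abs(subset.max(axis=0) - global_feat).max())
```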