Assignment 5

Author: Yu Jin Goh (yujing)

1. Classification Model

Test Accuracy on best model: 97.06%

GT Class	Prediction	Interpretation
Chair	Chair	Correctly Predicted
Vase	Vase	Correctly Predicted
Lamp	Lamp	Correctly Predicted
Chair	Lamp	The object is quite tall relative to its width compared to other chairs, so the network predicted a lamp rather than a chair
Vase	Lamp	The object has a small base and a large head, so it looks more like a lamp than a vase
Lamp	Vase	The object has a box-like shape, so it could be mistaken for a vase rather than a lamp

2. Segmentation Model

Test Accuracy on best model: 90.78%

Visualize segmentation results of at least 5 objects (including 2 bad predictions) with corresponding ground truth, report the prediction accuracy for each object, and provide interpretation in a few sentences.

Object Index	Prediction accuracy	Interpretation
562	99.77%	The chair is close to a standard chair shape, so the cushion, legs and backrest are easy to segment
397	99.67%	The chair is close to a standard chair shape, so the cushion, legs and backrest are easy to segment
505	99.58%	The chair is close to a standard chair shape, so the cushion, legs and backrest are easy to segment
163	43.65%	The chair's shape differs from the standard chairs, and it also has a headrest and a leg rest, so it is harder for the model to segment correctly. The model also seems to associate flat horizontal planes with the cushion label, as even the leg rest is segmented as cushion
255	47.10%	The model correctly segmented some parts of the armrest, backrest and cushion but found it challenging to segment other parts of the legs, cushion and armrest. This is because the bottom half of the point cloud is essentially box-shaped, and it is hard to disambiguate between parts when the geometry has few discontinuities
96	50.03%	Similar to the above, the model correctly segmented some parts of the armrest, backrest and cushion but found it challenging to segment other parts of the legs, cushion and armrest. This is because the bottom half of the point cloud is essentially box-shaped, and it is hard to disambiguate between parts when the geometry has few discontinuities. The pillow is also hard to segment as pillows are not very common in the dataset.
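The per-object accuracies in the table can be read as the fraction of points in one object whose predicted part label matches the ground truth. The following is a minimal sketch of that computation, not the evaluation code used for this report. The function name `per_object_accuracy` and the part ids in the toy example are hypothetical.

```python
def per_object_accuracy(pred_labels, gt_labels):
    """Fraction of points whose predicted part label equals the ground truth."""
    if len(pred_labels) != len(gt_labels):
        raise ValueError("prediction and ground truth must have the same length")
    if not gt_labels:
        raise ValueError("empty point cloud")
    correct = sum(p == g for p, g in zip(pred_labels, gt_labels))
    return correct / len(gt_labels)


# Toy example with hypothetical part ids (0 = legs, 1 = cushion, 2 = backrest).
gt = [0, 0, 1, 1, 1, 2, 2, 2]
pred = [0, 1, 1, 1, 1, 2, 2, 0]
acc = per_object_accuracy(pred, gt)
print(f"{acc:.2%}")  # 6 of 8 points match
```

Under this definition, large mislabelled regions lower the score far more than a few stray points, which fits the failure cases above where whole parts are confused.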

3. Robustness Analysis

Rotation

Pitch Angle (degrees)	Classification Accuracy	Segmentation Accuracy
0	97.06% (baseline)	90.78% (baseline)
15	96.22% (-0.84%)	87.01% (-3.77%)
30	92.02% (-5.04%)	78.44% (-12.34%)
45	72.61% (-24.45%)	70.86% (-19.92%)
60	62.11% (-34.95%)	63.70% (-27.08%)
90	67.15% (-29.91%)	57.84% (-32.94%)
180	32.63% (-64.43%)	36.47% (-54.31%)

From the experiments, we can see that our network is not robust to yaw rotation. The rotated point clouds still represent the same object in a plausible pose, but performance drops sharply: for a 180 degree yaw rotation, classification accuracy falls by 64.43 percentage points and segmentation accuracy by 54.31 percentage points. This is likely because the dataset has been canonicalized and simplified for point cloud prediction, so the network has not learned to handle rotated inputs. Even though a TNet was included for segmentation, it seems that data augmentation is still needed for the network to learn features that are robust to such transformations.
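To make the rotation test concrete, here is a minimal sketch of rotating every point of a cloud about one coordinate axis before passing it to the network. It is an illustration, not the code used for these experiments. The function name `rotate_points` is hypothetical, and the `axis` argument is an assumption because the rotation axis depends on the dataset's coordinate convention. Plain Python tuples keep the sketch dependency-free; in practice this would be a single matrix multiply on a tensor.

```python
import math


def rotate_points(points, angle_deg, axis="x"):
    """Rotate (x, y, z) points about a coordinate axis by angle_deg degrees."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    rotated = []
    for x, y, z in points:
        if axis == "x":
            rotated.append((x, c * y - s * z, s * y + c * z))
        elif axis == "y":
            rotated.append((c * x + s * z, y, -s * x + c * z))
        elif axis == "z":
            rotated.append((c * x - s * y, s * x + c * y, z))
        else:
            raise ValueError("axis must be 'x', 'y' or 'z'")
    return rotated


# A 180 degree rotation about x negates the y and z coordinates
# (up to floating point error).
cloud = [(0.0, 1.0, 0.0), (1.0, 0.0, 2.0)]
flipped = rotate_points(cloud, 180, axis="x")
```

The same function could also serve as a training-time augmentation by sampling `angle_deg` at random for each object, which is the kind of data augmentation suggested above.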

Number of Points

Number of Points	Classification Accuracy	Segmentation Accuracy
10000	97.06% (baseline)	90.78% (baseline)
5000	96.53% (-0.53%)	90.75% (-0.03%)
2000	96.53% (-0.53%)	90.57% (-0.21%)
1000	96.22% (-0.84%)	89.67% (-1.11%)
500	95.91% (-1.15%)	88.40% (-2.38%)
100	93.39% (-3.67%)	81.73% (-9.05%)

The networks are generally robust to a decrease in the number of sampled points: when the number of points is reduced 10x to 1000, classification accuracy drops by 0.84 percentage points and segmentation accuracy by 1.11 percentage points. The number of points affects segmentation more than classification in the more extreme cases. When the number of points is reduced 100x to 100, classification accuracy drops by 3.67 percentage points and segmentation accuracy by 9.05 percentage points. This is likely because local point features matter more for segmentation than for classification.
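One way to run this kind of experiment is to draw a random subset of each object's points, keeping the per-point labels aligned for the segmentation task. The sketch below assumes uniform sampling without replacement; the sampling scheme actually used for the table is not stated here. The function name `subsample_points` and the `seed` parameter are hypothetical.

```python
import random


def subsample_points(points, labels, num_points, seed=0):
    """Pick num_points points uniformly at random, keeping labels aligned."""
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    if num_points > len(points):
        raise ValueError("cannot sample more points than are available")
    rng = random.Random(seed)  # fixed seed so the evaluation is repeatable
    idx = rng.sample(range(len(points)), num_points)
    return [points[i] for i in idx], [labels[i] for i in idx]


# Toy cloud of 10 points where each label encodes the point's index,
# so alignment between points and labels is easy to check.
toy_points = [(float(i), 0.0, 0.0) for i in range(10)]
toy_labels = list(range(10))
sub_points, sub_labels = subsample_points(toy_points, toy_labels, 4)
```

Fixing the seed keeps the comparison across point counts repeatable, at the cost of evaluating only one random subset per setting.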