Assignment 5

Author: Yu Jin Goh (yujing)

1. Classification Model

Test Accuracy on best model: 97.06%

| GT Class | Prediction | Visual | Interpretation |
| --- | --- | --- | --- |
| Chair | Chair | (image) | Correctly predicted. |
| Vase | Vase | (image) | Correctly predicted. |
| Lamp | Lamp | (image) | Correctly predicted. |
| Chair | Lamp | (image) | The object is quite tall relative to its width compared to other chairs, so the network predicted a lamp rather than a chair. |
| Vase | Lamp | (image) | The object has a small base and a large head, which looks more like a lamp than a vase. |
| Lamp | Vase | (image) | The object has a box-like shape, so it could plausibly be predicted as a vase rather than a lamp. |

2. Segmentation Model

Test Accuracy on best model: 90.78%

Visualize segmentation results of at least 5 objects (including 2 bad predictions) with corresponding ground truth, report the prediction accuracy for each object, and provide interpretation in a few sentences.

| Object Index | Prediction Accuracy | Prediction | Ground Truth | Interpretation |
| --- | --- | --- | --- | --- |
| 562 | 99.77% | (image) | (image) | The shape is close to a standard chair, so it is easy to segment the cushion, legs and backrest. |
| 397 | 99.67% | (image) | (image) | The shape is close to a standard chair, so it is easy to segment the cushion, legs and backrest. |
| 505 | 99.58% | (image) | (image) | The shape is close to a standard chair, so it is easy to segment the cushion, legs and backrest. |
| 163 | 43.65% | (image) | (image) | The chair differs from the standard chairs and also has a head rest and a leg rest, so it is harder for the model to segment correctly. The model also seems to associate flat horizontal planes with the cushion label, since even the leg rest is segmented as cushion. |
| 255 | 47.10% | (image) | (image) | The model correctly segmented some parts of the arm rest, back rest and cushion but struggled with other parts of the legs, cushion and arm rest. The bottom half of the point cloud is essentially a box, and it is hard to disambiguate between parts when the geometry has minimal discontinuities. |
| 96 | 50.03% | (image) | (image) | Similar to object 255, the model correctly segmented some parts of the arm rest, back rest and cushion but struggled with other parts of the legs, cushion and arm rest. The bottom half of the point cloud is essentially a box, and it is hard to disambiguate between parts when the geometry has minimal discontinuities. The pillow is also hard to segment because pillows are not common in the dataset. |
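The per-object prediction accuracy reported above can be read as the fraction of points whose predicted part label matches the ground truth. The following is a minimal sketch of that calculation, not the code used for this assignment; the function name `per_object_accuracy` and the toy labels are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def per_object_accuracy(pred_labels, gt_labels):
    """Fraction of points whose predicted part label equals the ground truth.

    Both arguments are sequences of integer part labels, one per point.
    (Hypothetical helper for illustration, not the assignment's own code.)
    """
    if len(pred_labels) != len(gt_labels):
        raise ValueError("prediction and ground truth must cover the same points")
    if len(gt_labels) == 0:
        raise ValueError("empty point cloud")
    correct = sum(p == g for p, g in zip(pred_labels, gt_labels))
    return correct / len(gt_labels)


# Toy example: 8 points, the last two are mislabelled.
pred = [0, 0, 1, 1, 2, 2, 2, 0]
gt = [0, 0, 1, 1, 2, 2, 1, 1]
print(f"{per_object_accuracy(pred, gt):.2%}")  # prints 75.00%
```

Sorting objects by this score is one simple way to pick the best and worst cases for visualization.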

3. Robustness Analysis

Rotation

| Pitch Angle (degrees) | Classification Accuracy | Segmentation Accuracy | Prediction | Ground Truth |
| --- | --- | --- | --- | --- |
| 0 | 97.06% (baseline) | 90.78% (baseline) | (image) | (image) |
| 15 | 96.22% (-0.84%) | 87.01% (-3.77%) | (image) | (image) |
| 30 | 92.02% (-5.04%) | 78.44% (-12.34%) | (image) | (image) |
| 45 | 72.61% (-24.45%) | 70.86% (-19.92%) | (image) | (image) |
| 60 | 62.11% (-34.95%) | 63.70% (-27.08%) | (image) | (image) |
| 90 | 67.15% (-29.91%) | 57.84% (-32.94%) | (image) | (image) |
| 180 | 32.63% (-64.43%) | 36.47% (-54.31%) | (image) | (image) |

From the experiments, we can see that our network is not robust to yaw rotation. The rotated point clouds still represent the same object in a plausible pose, but performance drops sharply: for a 180 degree rotation, classification accuracy falls by 64.43 percentage points (to 32.63%) and segmentation accuracy by 54.31 percentage points (to 36.47%). This is likely because the dataset has been canonicalized and simplified for point cloud prediction, so the network only sees objects in a single orientation during training. Even though a TNet was included for segmentation, it seems that data augmentation is still needed for the network to learn features that are robust to such transformations.
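The rotation experiment can be sketched as applying a fixed rotation matrix to every point before running the network. The snippet below is an illustrative sketch under stated assumptions rather than the code behind the table: `rotation_matrix` and `rotate_points` are hypothetical helper names, and which axis corresponds to pitch or yaw depends on the dataset's coordinate convention, so the `axis` argument is left to the caller. Plain Python is used to stay dependency-free; the same arithmetic vectorizes directly in an array library.

```python
import math


def rotation_matrix(angle_deg, axis="y"):
    """3x3 right-handed rotation matrix about a coordinate axis.

    Assumption: whether "x", "y" or "z" means pitch or yaw depends on the
    dataset's convention, which is not specified here.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    if axis == "z":
        return [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    raise ValueError(f"unknown axis: {axis!r}")


def rotate_points(points, angle_deg, axis="y"):
    """Rotate a list of (x, y, z) points; returns a new list of tuples."""
    R = rotation_matrix(angle_deg, axis)
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) for i in range(3))
        for p in points
    ]


# Sweep the angles used in the table over a toy two-point cloud.
cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
for angle in [0, 15, 30, 45, 60, 90, 180]:
    rotated = rotate_points(cloud, angle, axis="y")
    # In a real evaluation, `rotated` would be fed to the trained model here.
```

The same helper could serve as a training-time augmentation by drawing `angle_deg` at random for each sample, which is the kind of augmentation the paragraph above suggests.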

Number of Points

| Number of Points | Classification Accuracy | Segmentation Accuracy | Prediction | Ground Truth |
| --- | --- | --- | --- | --- |
| 10000 | 97.06% (baseline) | 90.78% (baseline) | (image) | (image) |
| 5000 | 96.53% (-0.53%) | 90.75% (-0.03%) | (image) | (image) |
| 2000 | 96.53% (-0.53%) | 90.57% (-0.21%) | (image) | (image) |
| 1000 | 96.22% (-0.84%) | 89.67% (-1.11%) | (image) | (image) |
| 500 | 95.91% (-1.15%) | 88.40% (-2.38%) | (image) | (image) |
| 100 | 93.39% (-3.67%) | 81.73% (-9.05%) | (image) | (image) |

The networks are generally robust to a decrease in the number of point samples: when the number of points is reduced 10x to 1000, accuracy drops by only 0.84 percentage points for classification and 1.11 percentage points for segmentation. The number of points affects segmentation more than classification, especially in the more extreme cases with fewer samples. When the number of points is reduced 100x to 100, classification accuracy drops by 3.67 percentage points and segmentation accuracy by 9.05 percentage points. This is likely because local point features matter more for segmentation than for classification.
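One way to run this kind of experiment is to draw a random subset of point indices and apply it to both the points and their per-point labels, so that the segmentation ground truth stays aligned. The sketch below assumes uniform random sampling without replacement; the sampling scheme actually used for the table is not stated, and `subsample` is a hypothetical helper name.

```python
import random


def subsample(points, labels, num_points, seed=0):
    """Uniformly sample `num_points` points (without replacement).

    The same indices are applied to `labels` so that each kept point
    retains its ground-truth part label. Assumption: uniform sampling.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    if num_points > len(points):
        raise ValueError("cannot sample more points than are available")
    rng = random.Random(seed)  # seeded for repeatable evaluation
    idx = rng.sample(range(len(points)), num_points)
    return [points[i] for i in idx], [labels[i] for i in idx]


# Toy cloud of 10000 points where point i carries label i % 3.
cloud = [(float(i), 0.0, 0.0) for i in range(10000)]
parts = [i % 3 for i in range(10000)]
for n in [10000, 5000, 2000, 1000, 500, 100]:
    sub_points, sub_labels = subsample(cloud, parts, n)
    # In a real evaluation, `sub_points` would be fed to the trained model here.
```

Fixing the seed keeps the comparison across point counts repeatable; averaging over several seeds would give a less noisy estimate at small counts such as 100.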