Assignment 5
Author: Yu Jin Goh (yujing)
1. Classification Model
Test Accuracy on best model: 97.06%
| GT Class | Prediction | Visual | Interpretation |
|---|---|---|---|
| Chair | Chair | ![]() | Correctly predicted |
| Vase | Vase | ![]() | Correctly predicted |
| Lamp | Lamp | ![]() | Correctly predicted |
| Chair | Lamp | ![]() | The object is quite tall relative to its width compared to other chairs, so the network predicted a lamp rather than a chair |
| Vase | Lamp | ![]() | The object has a small base and a big head, which looks more like a lamp than a vase |
| Lamp | Vase | ![]() | The object has a box-like shape, so it could be mistaken for a vase rather than a lamp |
2. Segmentation Model
Test Accuracy on best model: 90.78%
Visualize segmentation results of at least 5 objects (including 2 bad predictions) with corresponding ground truth, report the prediction accuracy for each object, and provide interpretation in a few sentences.
| Object Index | Prediction accuracy | Prediction | Ground Truth | Interpretation |
|---|---|---|---|---|
| 562 | 99.77% | ![]() | ![]() | The shape of the chair is close to a standard chair, so it is easy to segment out the cushion, legs and backrest |
| 397 | 99.67% | ![]() | ![]() | The shape of the chair is close to a standard chair, so it is easy to segment out the cushion, legs and backrest |
| 505 | 99.58% | ![]() | ![]() | The shape of the chair is close to a standard chair, so it is easy to segment out the cushion, legs and backrest |
| 163 | 43.65% | ![]() | ![]() | The shape of the chair differs from the standard chairs, and it also has a head rest and a leg rest, so it is more difficult for the model to predict the segmentation correctly. The model also seems to associate flat horizontal planes with the cushion label, since even the leg rest is segmented as cushion |
| 255 | 47.10% | ![]() | ![]() | The model correctly segmented some parts of the arm rest, back rest and cushion but found it challenging to segment out other parts of the legs, cushion and arm rest. This is because the bottom half of the point cloud is essentially a box, and it is hard to disambiguate between parts when the geometry has minimal discontinuities |
| 96 | 50.03% | ![]() | ![]() | Similar to the above, the model correctly segmented some parts of the arm rest, back rest and cushion but found it challenging to segment out other parts of the legs, cushion and arm rest. This is because the bottom half of the point cloud is essentially a box, and it is hard to disambiguate between parts when the geometry has minimal discontinuities. The pillow is also hard to segment out because pillows are not very common in the dataset |
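The per-object prediction accuracy in the table can be read as the fraction of points in one object whose predicted part label matches the ground truth. The report does not state the exact definition, so the sketch below assumes that reading; `per_object_accuracy` is a hypothetical helper name, not something taken from the assignment code.

```python
def per_object_accuracy(pred_labels, gt_labels):
    """Fraction of points whose predicted part label equals the ground truth.

    Assumption: accuracy is computed per point over a single object.
    """
    if len(pred_labels) != len(gt_labels):
        raise ValueError("prediction and ground truth must have the same length")
    if not gt_labels:
        raise ValueError("empty point cloud")
    correct = sum(p == g for p, g in zip(pred_labels, gt_labels))
    return correct / len(gt_labels)


# Toy example: 4 points, 3 of them labelled correctly.
acc = per_object_accuracy([0, 1, 1, 2], [0, 1, 2, 2])
print(f"{acc:.2%}")  # prints 75.00%
```

Under this definition, an object such as index 163 at 43.65% has fewer than half of its points assigned to the correct part.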
3. Robustness Analysis
Rotation
| Pitch Angle | Classification Accuracy | Segmentation Accuracy | Prediction | Ground Truth |
|---|---|---|---|---|
| 0 | 97.06% (baseline) | 90.78% (baseline) | ![]() | ![]() |
| 15 | 96.22% (-0.84%) | 87.01% (-3.77%) | ![]() | ![]() |
| 30 | 92.02% (-5.04%) | 78.44% (-12.34%) | ![]() | ![]() |
| 45 | 72.61% (-24.45%) | 70.86% (-19.92%) | ![]() | ![]() |
| 60 | 62.11% (-34.95%) | 63.70% (-27.08%) | ![]() | ![]() |
| 90 | 67.15% (-29.91%) | 57.84% (-32.94%) | ![]() | ![]() |
| 180 | 32.63% (-64.43%) | 36.47% (-54.31%) | ![]() | ![]() |
From the experiments, we can see that our network is not robust to yaw rotation. The rotated point clouds still represent the same object in a plausible pose, but performance drops sharply: for a 180 degree yaw rotation, classification accuracy falls by 64.43 percentage points and segmentation accuracy by 54.31 percentage points. This is likely because the dataset has been canonicalized and simplified for point cloud prediction, so the network does not learn features that are robust to rotations. Even though a TNet was included for segmentation, it seems that data augmentation is still needed for the network to learn features that can handle such transformations.
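The rotation test can be sketched as rotating every point of a test cloud about one coordinate axis before passing it to the network. This is a minimal standard-library sketch, not the code used for the experiments: `rotate_points` is a hypothetical helper, and which axis corresponds to pitch or yaw depends on the dataset's coordinate convention, which the report does not specify.

```python
import math


def rotate_points(points, degrees, axis="z"):
    """Rotate (x, y, z) points about one coordinate axis through the origin.

    Uses the standard right-handed rotation matrices. The mapping from
    "pitch"/"yaw" to an axis is an assumption left to the caller.
    """
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y, z in points:
        if axis == "x":
            rotated.append((x, c * y - s * z, s * y + c * z))
        elif axis == "y":
            rotated.append((c * x + s * z, y, -s * x + c * z))
        elif axis == "z":
            rotated.append((c * x - s * y, s * x + c * y, z))
        else:
            raise ValueError("axis must be 'x', 'y' or 'z'")
    return rotated


# Sweep over the angles used in the table above.
cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
for angle in (0, 15, 30, 45, 60, 90, 180):
    rotated_cloud = rotate_points(cloud, angle, axis="z")
    # rotated_cloud would then be fed to the trained model for evaluation.
```

The same helper could serve as a training-time augmentation by drawing the angle at random for each sample, which is one way to implement the data augmentation suggested above.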
Number of Points
| Number of Points | Classification Accuracy | Segmentation Accuracy | Prediction | Ground Truth |
|---|---|---|---|---|
| 10000 | 97.06% (baseline) | 90.78% (baseline) | ![]() | ![]() |
| 5000 | 96.53% (-0.53%) | 90.75% (-0.03%) | ![]() | ![]() |
| 2000 | 96.53% (-0.53%) | 90.57% (-0.21%) | ![]() | ![]() |
| 1000 | 96.22% (-0.84%) | 89.67% (-1.11%) | ![]() | ![]() |
| 500 | 95.91% (-1.15%) | 88.40% (-2.38%) | ![]() | ![]() |
| 100 | 93.39% (-3.67%) | 81.73% (-9.05%) | ![]() | ![]() |
The networks are generally robust to a decrease in the number of sampled points: when the number of points is reduced 10x to 1000, accuracy degrades by only 0.84 percentage points for classification and 1.11 percentage points for segmentation. The number of points affects segmentation more than classification in the more extreme cases: when the number of points is reduced 100x to 100, classification accuracy drops by 3.67 percentage points while segmentation accuracy drops by 9.05 percentage points. This is likely because local point features matter more for segmentation than for classification.
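The point-count experiment can be sketched as drawing a random subset of point indices and applying the same indices to the per-point labels, so that segmentation ground truth stays aligned with the subsampled cloud. This is a sketch under assumptions: `subsample` is a hypothetical helper, and the report does not say whether the points were sampled uniformly at random or by another scheme.

```python
import random


def subsample(points, labels, num_points, seed=0):
    """Pick num_points distinct points uniformly at random, with their labels.

    Assumption: uniform sampling without replacement; the seed keeps the
    subset reproducible across runs.
    """
    if num_points > len(points):
        raise ValueError("cannot sample more points than the cloud contains")
    rng = random.Random(seed)
    idx = rng.sample(range(len(points)), num_points)
    return [points[i] for i in idx], [labels[i] for i in idx]


# Toy cloud of 10000 points, reduced to the sizes used in the table above.
cloud = [(float(i), 0.0, 0.0) for i in range(10000)]
part_labels = [i % 6 for i in range(10000)]
for n in (10000, 5000, 2000, 1000, 500, 100):
    sub_points, sub_labels = subsample(cloud, part_labels, n)
    # sub_points and sub_labels would then be passed to the evaluation loop.
```

Sampling indices once and reusing them for both arrays avoids the easy mistake of shuffling points and labels independently, which would make the segmentation accuracy meaningless.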