## Q1. Classification Model (40 points)

Test Accuracy: 97.8%

Max Epochs: 60
| Chair | Vase | Lamp |
|---|---|---|
| ![]() | ![]() | ![]() |
As we can see, the model correctly classifies point clouds that show the distinctive features of their class. This is also reflected in the accuracy score.
Failure Cases
| Chair | Vase | Lamp |
|---|---|---|
| Pred: Vase. Conf = 0.84 | Pred: Lamp. Conf = 0.615 | Pred: Chair. Conf = 0.584 |
The failure cases show a slight distribution shift away from the characteristic features of each class. For example, chairs generally have four legs, but the misclassified chair shows only two. Vases are generally cylindrical, so the misclassification of this atypical vase is somewhat understandable. The misclassified lamp likewise differs from a standard lamp shape.
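The "Conf" values above are the softmax probabilities of the predicted class. A minimal sketch of how a prediction and its confidence can be read off the class logits (the class ordering and helper name here are illustrative, not the actual training code):

```python
import numpy as np

def predict_with_confidence(logits, classes=("chair", "vase", "lamp")):
    """Numerically stable softmax over class logits; returns the top
    class and its probability (the reported 'Conf' value)."""
    z = logits - logits.max()
    probs = np.exp(z) / np.exp(z).sum()
    i = int(probs.argmax())
    return classes[i], float(probs[i])
```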
## Q2. Segmentation Model (40 points)

Test Accuracy: 88.57%

Max Epochs: 60
| Accuracy | Ground Truth | Predicted |
|---|---|---|
| 99.47% | ![]() | ![]() |
| 98.77% | ![]() | ![]() |
| 98.55% | ![]() | ![]() |
| 42.43% | ![]() | ![]() |
| 45.43% | ![]() | ![]() |
| 47.18% | ![]() | ![]() |
From the above results, we can see that for chairs whose parts (legs, backrest, seat) are clearly distinguishable, the model segments them very well. In the failure cases, it is unclear what the chair exactly looks like: in some, the model predicts a large blue base (legs) whose extent differs from the ground truth. The ground-truth point clouds also contain some noise, which lowers the reported accuracy.
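The per-sample accuracy in the table is simply the fraction of points whose predicted part label matches the ground truth; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def per_point_accuracy(pred_labels, gt_labels):
    """Segmentation accuracy for one sample: fraction of points whose
    predicted part label equals the ground-truth part label."""
    pred_labels = np.asarray(pred_labels)
    gt_labels = np.asarray(gt_labels)
    return float((pred_labels == gt_labels).mean())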
## Q3. Robustness Analysis (20 points)

### CLS

#### Robustness analysis to rotation
| Degree | Ground Truth (chair) | Pred |
|---|---|---|
| 0 | ![]() | chair |
| 30 | ![]() | chair |
| 60 | ![]() | chair |
| 90 | ![]() | vase |
| 120 | ![]() | vase |
| 150 | ![]() | lamp |
| 180 | ![]() | lamp |

From the above table and visualizations, we can see that rotating the object around the z-axis degrades classification accuracy as the rotation angle increases: the chair is still recognized up to 60°, but from 90° onwards it is misclassified as a vase or a lamp, since the rotated coordinates fall outside the distribution the model was trained on.
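The rotated inputs for this experiment can be produced with a standard z-axis rotation matrix; a minimal sketch, assuming point clouds are stored as (N, 3) arrays:

```python
import numpy as np

def rotate_z(points, degrees):
    """Rotate an (N, 3) point cloud about the z-axis by `degrees`."""
    theta = np.deg2rad(degrees)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T
```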
#### Robustness analysis to number of points
| Number of Points | Point Cloud (Pred for all: chair) |
|---|---|
| 10000 | ![]() |
| 5000 | ![]() |
| 2000 | ![]() |
| 1000 | ![]() |
| 500 | ![]() |
| 100 | ![]() |

From the above table and visualizations, we can see that varying the number of points in the point cloud changes the prediction very little. This shows the robustness of the PointNet architecture to the number of input points, thanks to the global max-pooling operation.
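The reason for this robustness is that the global feature is a per-channel max over all points, so it only changes when a subsample drops the particular points that attained a channel's maximum. A toy illustration with a random single-layer "point MLP" (not the trained network):

```python
import numpy as np

rng = np.random.default_rng(0)

def global_feature(points, weights):
    """Per-point linear layer + ReLU, then channel-wise max over points,
    mimicking PointNet's symmetric aggregation."""
    h = np.maximum(points @ weights, 0.0)  # (N, D) per-point features
    return h.max(axis=0)                   # (D,) global feature

points = rng.normal(size=(10000, 3))
weights = rng.normal(size=(3, 64))
full = global_feature(points, weights)
subset = points[rng.choice(10000, 1000, replace=False)]
sub = global_feature(subset, weights)
# A subsample can only lower each channel's max, and most channels
# keep (or nearly keep) their value, so the classifier output is stable.
```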
### SEG

#### Robustness analysis to rotation
| Degree | Ground Truth | Pred |
|---|---|---|
| 0 | ![]() | ![]() |
| 30 | ![]() | ![]() |
| 60 | ![]() | ![]() |
| 90 | ![]() | ![]() |
| 120 | ![]() | ![]() |
| 150 | ![]() | ![]() |
| 180 | ![]() | ![]() |

From the above table and visualizations, we can see that rotating the object around the z-axis also degrades segmentation accuracy as the rotation angle increases.
#### Robustness analysis to number of points
| Points | Ground Truth | Pred |
|---|---|---|
| 10000 | ![]() | ![]() |
| 5000 | ![]() | ![]() |
| 2000 | ![]() | ![]() |
| 1000 | ![]() | ![]() |
| 500 | ![]() | ![]() |
| 100 | ![]() | ![]() |

From the above table and visualizations, we can see that varying the number of points changes the segmentation accuracy very little, similar to the classification case. This again shows the robustness of the PointNet architecture to the number of input points due to the max-pooling operation.
## Q4. Bonus Question - Locality (20 points)
I implemented PointNet++ with the sampling and grouping layers realized as SetAbstraction layers, plus FeaturePropagation layers that upsample the points downsampled by SetAbstraction for the segmentation task. MSG and MRG haven't been added, since the dataset contains simple objects. Local neighborhood selection is done using ball query.
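As a sketch of the grouping step, ball query collects up to `k` neighbors of each centroid within a fixed radius. This is a NumPy illustration of the idea, not the actual CUDA/PyTorch implementation; it assumes each centroid is itself a point of the cloud (as with farthest-point sampling), so every query has at least one hit:

```python
import numpy as np

def ball_query(points, centroids, radius, k):
    """For each (M, 3) centroid, return indices of up to k points of the
    (N, 3) cloud within `radius`; empty slots are padded with the first
    neighbor's index, as in the PointNet++ reference code."""
    d2 = ((centroids[:, None, :] - points[None, :, :]) ** 2).sum(-1)  # (M, N)
    groups = np.zeros((len(centroids), k), dtype=int)
    for i, row in enumerate(d2):
        hits = np.flatnonzero(row <= radius ** 2)[:k]
        groups[i, :] = hits[0]          # pad with the first neighbor
        groups[i, :len(hits)] = hits
    return groups
```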
### CLS

Accuracy: 97.9%

This is a 0.1% improvement over the vanilla PointNet architecture. The PointNet++ model also converged faster (in 20 epochs compared to 60 epochs for vanilla).
Here are the visualizations for the random and worst cases:
| Chair | Vase | Lamp |
|---|---|---|
| ![]() | pred: lamp | ![]() |
We can see that there is a ground-truth labeling error.
Failure cases:
| Chair | Vase | Lamp |
|---|---|---|
| Pred: Vase. Conf = 0.87 | Pred: Lamp. Conf = 0.66 | Pred: Vase. Conf = 0.83 |
### SEG
Overall Test Accuracy: 90.54%, a 2% improvement over the vanilla PointNet segmentation model.
Samples: Best and worst
| Accuracy | Ground Truth | Predicted |
|---|---|---|
| 99.61% | ![]() | ![]() |
| 98.60% | ![]() | ![]() |
| 99.60% | ![]() | ![]() |
| 29.88% | ![]() | ![]() |
| 41.51% | ![]() | ![]() |
| 45.46% | ![]() | ![]() |
The failure cases are hard samples of the respective classes.