Name: Ishita Gupta
Andrew ID: ishitag
Test accuracy: 97.38%

Sample 1 - Chair (Correct)

Sample 2 - Chair (Correct)

Sample 3 - Vase (Correct)

Sample 4 - Vase (Correct)

Sample 5 - Lamp (Correct)

Sample 6 - Lamp (Correct)

Failure 1 - Chair misclassified as Lamp

Failure 2 - Vase misclassified as Lamp

Failure 3 - Lamp misclassified as Vase

The model classifies chairs almost perfectly (99.84%), possibly because of distinctive features such as four legs and a backrest. Vases and lamps are harder because they share similar cylindrical geometries, and the failure cases show confusion primarily between these two classes, which can both have elongated vertical structures. The chair misclassified as a lamp likely has an unusual (collapsed) design with less prominent legs than typical chair examples. Random point sampling may also miss distinctive features, especially for objects with complex or sparse geometry.
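
The vase/lamp confusion described above is easiest to read off a confusion matrix. The following is a minimal sketch, assuming integer labels in the order chair = 0, vase = 1, lamp = 2 (the label order and the toy labels are assumptions for illustration, not the report's data):

```python
import numpy as np

CLASS_NAMES = ["chair", "vase", "lamp"]  # assumed label order 0, 1, 2


def confusion_matrix(true_labels, pred_labels, num_classes=3):
    """Rows index the true class, columns index the predicted class."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly when the same (true, pred) pair repeats
    np.add.at(matrix, (true_labels, pred_labels), 1)
    return matrix


def per_class_accuracy(matrix):
    """Diagonal count divided by the number of true samples in each class."""
    return np.diag(matrix) / matrix.sum(axis=1)


# Toy labels for illustration only
toy_true = [0, 0, 0, 1, 1, 2, 2, 2]
toy_pred = [0, 0, 2, 1, 2, 2, 1, 2]
toy_matrix = confusion_matrix(toy_true, toy_pred)
toy_accuracy = per_class_accuracy(toy_matrix)
```

Off-diagonal entries in the vase and lamp rows would show how often each class is mistaken for the other.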
Test accuracy: 90.05%
Visualize segmentation results of at least 5 objects (including 2 bad predictions) with corresponding ground truth, report the prediction accuracy for each object, and provide interpretation in a few sentences.
| Object | Ground Truth | Prediction | Accuracy |
|---|---|---|---|
| Object 0 (Good) | ![]() | ![]() | 94.25% |
| Object 4 (Good) | ![]() | ![]() | 73.80% |
| Object 57 (Good) | ![]() | ![]() | 99.07% |
| Object 616 (Good) | ![]() | ![]() | 99.35% |
| Object 351 (Bad) | ![]() | ![]() | 51.66% |
| Object 40 (Bad) | ![]() | ![]() | 53.37% |
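
The per-object accuracy in the table above is the fraction of points whose predicted part label matches the ground truth. A minimal sketch, assuming predictions and labels are stored as integer arrays of shape `(num_objects, num_points)` (the array layout and the toy values are assumptions):

```python
import numpy as np


def per_object_accuracy(pred_labels, gt_labels):
    """Fraction of correctly labeled points for each object."""
    pred_labels = np.asarray(pred_labels)
    gt_labels = np.asarray(gt_labels)
    return (pred_labels == gt_labels).mean(axis=1)


# Toy data for illustration only: 2 objects with 4 points each
toy_gt = np.array([[0, 0, 1, 1], [2, 2, 2, 3]])
toy_pred = np.array([[0, 0, 1, 0], [2, 2, 2, 3]])
toy_acc = per_object_accuracy(toy_pred, toy_gt)

# Sorting by accuracy is one way to pick good and bad examples to visualize
worst_first = np.argsort(toy_acc)
```
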
Robustness to rotation was evaluated by rotating the test point clouds around the z-axis by 15, 30, 45, 90, and 180 degrees. Rotations of 45 degrees around the x- and y-axes were also tested to understand axis-specific sensitivity. A PointNet architecture without T-Net transformation blocks is expected to be sensitive to rotation because it processes raw point coordinates directly.
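
The rotation applied to the test point clouds can be sketched as follows. This is a minimal NumPy version assuming points are stored as row vectors with shape `(..., 3)`; the function names are hypothetical and not taken from the assignment code:

```python
import numpy as np


def rotation_matrix(axis, angle_deg):
    """3x3 rotation matrix for a right-handed rotation about a coordinate axis."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    if axis == "x":
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    if axis == "y":
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    if axis == "z":
        return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
    raise ValueError(f"unknown axis: {axis!r}")


def rotate_point_cloud(points, axis, angle_deg):
    """Rotate points of shape (..., 3); row vectors are multiplied by R transposed."""
    rotation = rotation_matrix(axis, angle_deg)
    return np.asarray(points) @ rotation.T
```

Because the same matrix is applied to every point, the shape of each object is unchanged; only its orientation relative to the training data differs.
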
Classification:
| Rotation Angle | Axis | Test Accuracy | Accuracy Drop |
|---|---|---|---|
| 0deg (baseline) | z | 97.38% | - |
| 15deg | z | 91.92% | -5.46% |
| 30deg | z | 56.24% | -41.14% |
| 45deg | z | 24.87% | -72.51% |
| 90deg | z | 24.24% | -73.14% |
| 180deg | z | 53.31% | -44.07% |
| 45deg | x | 49.84% | -47.54% |
| 45deg | y | 63.06% | -34.32% |

Segmentation:
| Rotation Angle | Axis | Test Accuracy | Accuracy Drop |
|---|---|---|---|
| 0deg (baseline) | z | 90.05% | - |
| 15deg | z | 83.11% | -6.94% |
| 30deg | z | 70.31% | -19.74% |
| 45deg | z | 59.36% | -30.69% |
| 90deg | z | 43.02% | -47.03% |
Visualization:
| Rotation Angle (degrees) | GT | Pred |
|---|---|---|
| 0 | ![]() | ![]() |
| 45 | ![]() | ![]() |
| 90 | ![]() | ![]() |
Approach:
| Number of Points | Test Accuracy | vs. Baseline (97.38%) |
|---|---|---|
| 50 | 65.37% | -32.01% |
| 100 | 89.72% | -7.66% |
| 500 | 97.69% | +0.31% |
| 1000 | 97.80% | +0.42% |
| 2500 | 98.11% | +0.73% |
| 5000 | 98.11% | +0.73% |
| 10000 (baseline) | 97.38% | 0% |
| Number of Points | Test Accuracy | vs. Baseline (90.05%) |
|---|---|---|
| 50 | 79.28% | -10.77% |
| 100 | 83.23% | -6.82% |
| 500 | 89.28% | -0.77% |
| 1000 | 90.26% | +0.21% |
| 2500 | 90.48% | +0.43% |
| 5000 | 90.46% | +0.41% |
| 10000 (baseline) | 90.05% | 0% |
| Num points | GT | Pred |
|---|---|---|
| 50 | ![]() | ![]() |
| 100 | ![]() | ![]() |
| 500 | ![]() | ![]() |
| 1000 | ![]() | ![]() |
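
The point-count experiment in the tables above amounts to evaluating on a random subset of each object's points. A minimal sketch, assuming the test set is an array of shape `(num_objects, num_total_points, 3)` and that one set of random indices is shared across all objects (both are assumptions; the original sampling scheme is not stated):

```python
import numpy as np


def subsample_points(points, num_points, labels=None, seed=0):
    """Keep `num_points` randomly chosen points (without replacement) per object.

    points: (B, N, 3); labels: optional (B, N) per-point segmentation labels.
    The same indices are used for points and labels so they stay aligned.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(points.shape[1], size=num_points, replace=False)
    if labels is None:
        return points[:, idx]
    return points[:, idx], labels[:, idx]
```

For segmentation, the per-point labels must be indexed with the same `idx`, otherwise accuracy is computed against mismatched ground truth.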
DGCNN (Dynamic Graph CNN) - A locality-aware architecture that builds dynamic k-NN graphs and applies edge convolutions to capture local geometric features.
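
The k-NN graph and edge convolution mentioned above can be sketched in NumPy for a single point cloud. This is an illustrative version, not the trained model: the single linear layer with weight `weight` stands in for the shared MLP, and excluding each point from its own neighbor list is a choice made here (implementations differ on this).

```python
import numpy as np


def knn_indices(points, k):
    """points: (N, C) -> (N, k) indices of the k nearest other points."""
    points = np.asarray(points, dtype=np.float64)
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)      # dense (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)       # exclude the point itself (a choice)
    return np.argsort(dist2, axis=1)[:, :k]


def edge_features(features, neighbor_idx):
    """features: (N, C), neighbor_idx: (N, k) -> (N, k, 2C) as [x_i, x_j - x_i]."""
    features = np.asarray(features, dtype=np.float64)
    neighbors = features[neighbor_idx]    # (N, k, C)
    center = np.broadcast_to(features[:, None, :], neighbors.shape)
    return np.concatenate([center, neighbors - center], axis=-1)


def edge_conv(features, neighbor_idx, weight):
    """Shared linear layer + ReLU on each edge, then max over the k neighbors."""
    edges = edge_features(features, neighbor_idx)   # (N, k, 2C)
    activated = np.maximum(edges @ weight, 0.0)     # (N, k, C_out)
    return activated.max(axis=1)                    # (N, C_out)
```

The graph is "dynamic" in that `knn_indices` is recomputed on the output features of each layer, so later layers group points that are close in feature space rather than in xyz space.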
Key differences from PointNet:
| Model | Test Accuracy | Improvement |
|---|---|---|
| PointNet (Q1) | 97.38% | Baseline |
| DGCNN (Q4) | 97.69% | +0.31% |
| Class | PointNet (Q1) | DGCNN (Q4) | Improvement |
|---|---|---|---|
| Chair | 616/617 (99.84%) | 616/617 (99.84%) | 0% |
| Vase | 94/102 (92.16%) | 90/102 (88.24%) | -3.92% |
| Lamp | 218/234 (93.16%) | 225/234 (96.15%) | +2.99% |
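
As a consistency check, the overall accuracies follow from the per-class counts in the table above. The short sketch below only re-adds numbers already reported:

```python
# (correct, total) per class, copied from the per-class table
per_class_counts = {
    "PointNet": {"chair": (616, 617), "vase": (94, 102), "lamp": (218, 234)},
    "DGCNN": {"chair": (616, 617), "vase": (90, 102), "lamp": (225, 234)},
}


def overall_accuracy(counts):
    """Total correct over total samples, as a percentage."""
    correct = sum(c for c, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return 100.0 * correct / total


overall = {
    model: round(overall_accuracy(counts), 2)
    for model, counts in per_class_counts.items()
}
# PointNet: 928/953 correct, DGCNN: 931/953 correct
```

The +0.31 point gap therefore corresponds to a net change of three objects: seven more lamps and four fewer vases classified correctly.
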
Sample 1 - Chair (Correct)

Sample 2 - Chair (Correct)

Sample 3 - Vase (Correct)

Sample 4 - Vase (Correct)

Sample 5 - Lamp (Correct)

Sample 6 - Lamp (Correct)

Failure 1 - Chair misclassified as Lamp

Failure 2 - Vase misclassified as Lamp

Failure 3 - Lamp misclassified as Vase

Segmentation model training for DGCNN was not completed due to computational constraints. The DGCNN architecture requires significantly more memory due to k-NN graph computation, making it challenging to train the segmentation model with the available resources.
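
A rough estimate shows why the k-NN step is memory-heavy. The sketch below assumes a dense pairwise-distance matrix in 32-bit floats and takes the 10,000-point baseline from the tables as the point count; the values k = 20 and 64 channels are illustrative assumptions, not settings from this report.

```python
def pairwise_matrix_bytes(num_points, batch_size=1, bytes_per_value=4):
    """Memory for a dense (B, N, N) distance matrix."""
    return batch_size * num_points * num_points * bytes_per_value


def edge_feature_bytes(num_points, k, channels, batch_size=1, bytes_per_value=4):
    """Memory for a (B, N, k, 2C) edge-feature tensor of one EdgeConv layer."""
    return batch_size * num_points * k * 2 * channels * bytes_per_value


# One 10,000-point cloud: 10,000 * 10,000 * 4 bytes
distance_bytes = pairwise_matrix_bytes(10_000)
# Assumed k = 20 neighbors and 64 input channels
edge_bytes = edge_feature_bytes(10_000, k=20, channels=64)
```

Both terms scale linearly with batch size and are incurred at every EdgeConv layer, with activations kept for backpropagation during training.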
DGCNN shows a modest improvement in overall classification accuracy (+0.31 percentage points) over PointNet. The most notable gain is in lamp classification, where DGCNN achieves 96.15% accuracy compared to PointNet's 93.16% (+2.99 points). This suggests that the local neighborhood features captured by the EdgeConv layers help distinguish lamp structures, which often have complex local geometric patterns.
However, DGCNN shows a slight decrease in vase classification accuracy (88.24%, -3.92 points, or 4 fewer correct out of 102 vases). This may indicate that for simpler geometric shapes like vases, the additional complexity of k-NN graph construction does not provide significant benefits and may even introduce noise.