Q1. Classification Model (40 points)¶

Best Accuracy Score: 98.22%

For clearer visualization, I color each point cloud by its predicted class using the following mapping:

  • Red = Chair
  • Blue = Lamp
  • Green = Vase
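
The mapping rule above can be sketched as a small lookup. This is a minimal illustration only: the class indices in `CLASS_NAMES` and the helper `colors_for_prediction` are hypothetical, since the report does not state how labels are encoded.

```python
# Hypothetical class indices; the report does not state the label encoding.
CLASS_NAMES = {0: "Chair", 1: "Lamp", 2: "Vase"}

# RGB colors in [0, 1], following the mapping rule stated in the report.
CLASS_COLORS = {
    "Chair": (1.0, 0.0, 0.0),  # red
    "Lamp": (0.0, 0.0, 1.0),   # blue
    "Vase": (0.0, 1.0, 0.0),   # green
}


def colors_for_prediction(pred_idx, num_points):
    """Return one RGB tuple per point, all set to the predicted class color."""
    color = CLASS_COLORS[CLASS_NAMES[pred_idx]]
    return [color] * num_points
```

Coloring by the predicted class rather than the ground truth makes misclassifications visible at a glance.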

Success Cases¶

Chair | Lamp | Vase
[Image grid: five correctly classified examples per class; images not reproduced here]

Failure Cases¶

  • Chair interpretation: The trained model did not fail on any chair instance.
Visualization | Ground Truth | Prediction
[image] | Lamp | Vase
[image] | Lamp | Vase
[image] | Lamp | Vase
  • Lamp interpretation: Many of the lamps misclassified as vases have a smooth, cylindrical shape with few protrusions, so they closely resemble vases and confuse the model.
Visualization | Ground Truth | Prediction
[image] | Vase | Lamp
[image] | Vase | Lamp
[image] | Vase | Chair
  • Vase interpretation: These vases have ambiguous shapes, for example because of flower stalks. The third one in particular resembles a chair. Even a human would find them hard to identify.

Q2. Segmentation Model (40 points)¶

Best Accuracy Score: 90.24%
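
As a reference for how a per-point segmentation accuracy like the one above could be computed, here is a minimal sketch. It assumes accuracy means the fraction of points whose predicted part label equals the ground truth; the report does not state the exact definition, and the function names are hypothetical.

```python
def point_accuracy(pred_labels, gt_labels):
    """Fraction of points in one object whose predicted label matches the GT."""
    if len(pred_labels) != len(gt_labels) or not gt_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(pred_labels, gt_labels))
    return correct / len(gt_labels)


def overall_accuracy(pred_batch, gt_batch):
    """Accuracy pooled over all points of all objects (assumed definition)."""
    total = sum(len(gts) for gts in gt_batch)
    correct = sum(
        p == g
        for preds, gts in zip(pred_batch, gt_batch)
        for p, g in zip(preds, gts)
    )
    return correct / total
```

The per-object values in the tables below would correspond to `point_accuracy` under this assumption, while the headline score would correspond to the pooled variant.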

Good Cases¶

Ground truth segmentation | Predicted segmentation | Accuracy
[image] | [image] | 99%
[image] | [image] | 99%
[image] | [image] | 99%

Bad Cases¶

Ground truth segmentation | Predicted segmentation | Accuracy | Interpretation
[image] | [image] | 51% | The model confuses neighboring regions: the yellow armrest and side-panel areas are incorrectly split into red and blue. Backrest and seat predictions are partially correct but have noisy boundaries, which reduces accuracy. These artifacts suggest the model has difficulty with overlapping structures.
[image] | [image] | 56% | The model captures the major regions but produces poorly defined boundaries. Side and back panels are confused, with labels bleeding across segments, which indicates difficulty handling spatial discontinuities and geometrically similar parts.

Q3. Robustness Analysis (20 points)¶

Rotation Around Z Axis¶

Procedure: We evaluated classification and segmentation performance after rotating each input point cloud around the z-axis by 0°, 5°, 45° and 90°. This tests the model's sensitivity to orientation, which matters because PointNet is not inherently rotation-invariant.
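
The rotation step can be sketched as follows. This is a minimal pure-Python illustration rather than the evaluation code used for the report; `rotate_z` is a hypothetical helper, and a real pipeline would more likely apply the same rotation matrix to a whole tensor at once.

```python
import math


def rotate_z(points, degrees):
    """Rotate (x, y, z) points counter-clockwise about the z-axis."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    # Standard z-axis rotation: x and y mix, z is left unchanged.
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


cloud = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0)]
for angle in (0, 5, 45, 90):
    rotated = rotate_z(cloud, angle)
    print(angle, rotated)
```

Because the rotation is about the z-axis only, each point's height is preserved while its footprint in the xy-plane changes.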

Comparison: Classification accuracy drops from 98.22% at 0° to 22.25% at 90°. Segmentation accuracy drops from 90.24% at 0° to 38.61% at 90°.

Interpretation: For the classification task, the model tends to misclassify all objects as the dominant class under heavy rotation, suggesting limited rotational generalization. For the segmentation task, the model overfits to canonical orientations and lacks robust feature extraction for rotated geometries.

Classification¶

Rotation (°) | Accuracy | Ground Truth Chair (RED) | Ground Truth Lamp (BLUE) | Ground Truth Vase (GREEN) | Observation
0 | 98.22% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
5 | 97.27% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
45 | 52.36% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Vase; GT: Vase, Pred: Lamp
90 | 22.25% | [image] | [image] | [image] | GT: Chair, Pred: Vase; GT: Lamp, Pred: Vase; GT: Vase, Pred: Lamp

Segmentation¶

object_id=0

Rotation (°) | Accuracy | Original GT | Original Pred | Transformed Pred
0 | 94.29% | [image] | [image] | [image]
5 | 94.83% | [image] | [image] | [image]
45 | 75.25% | [image] | [image] | [image]
90 | 40.28% | [image] | [image] | [image]

object_id=50

Rotation (°) | Accuracy | Original GT | Original Pred | Transformed Pred
0 | 90.22% | [image] | [image] | [image]
5 | 90.65% | [image] | [image] | [image]
45 | 69.24% | [image] | [image] | [image]
90 | 35.86% | [image] | [image] | [image]

object_id=100

Rotation (°) | Accuracy | Original GT | Original Pred | Transformed Pred
0 | 90.22% | [image] | [image] | [image]
5 | 90.65% | [image] | [image] | [image]
45 | 62.38% | [image] | [image] | [image]
90 | 34.32% | [image] | [image] | [image]

Number of Points¶

Procedure: We varied the number of input points at evaluation time using the --num_points flag, testing 10000, 1000, 100 and 10 points. Point clouds were downsampled to simulate sparse input conditions.
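
A minimal sketch of the downsampling step is shown below. It assumes uniform random subsampling without replacement, which the report does not confirm; `subsample` and its `seed` parameter are hypothetical. For segmentation, the same indices must be applied to the per-point labels so that points and labels stay aligned.

```python
import random


def subsample(points, num_points, labels=None, seed=0):
    """Randomly keep `num_points` points (and matching labels, if given)."""
    if not 0 < num_points <= len(points):
        raise ValueError("num_points must be between 1 and len(points)")
    rng = random.Random(seed)  # fixed seed keeps the evaluation repeatable
    idx = rng.sample(range(len(points)), num_points)
    sub_points = [points[i] for i in idx]
    sub_labels = None if labels is None else [labels[i] for i in idx]
    return sub_points, sub_labels


cloud = [(float(i), 0.0, 0.0) for i in range(10000)]
for n in (10000, 1000, 100, 10):
    pts, _ = subsample(cloud, n)
    print(n, len(pts))
```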

Comparison: Classification accuracy drops from 98.22% with 10,000 points to 26.02% with 10 points. Segmentation accuracy drops from 90.24% with 10,000 points to 68.33% with 10 points.

Interpretation: Both classification and segmentation are relatively robust as long as about 1000 or more points are present. Classification still reaches 92.03% with 100 points and collapses only at 10 points (26.02%).

Classification¶

Number of Points | Accuracy | Ground Truth Chair (RED) | Ground Truth Lamp (BLUE) | Ground Truth Vase (GREEN) | Observation
10000 | 98.22% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
1000 | 97.59% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
100 | 92.03% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
10 | 26.02% | [image] | [image] | [image] | GT: Chair, Pred: Lamp; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Lamp

Segmentation¶

object_id=0

Number of Points | Accuracy | Original GT | Original Pred | Transformed Pred
10000 | 94.29% | [image] | [image] | [image]
1000 | 95.10% | [image] | [image] | [image]
100 | 98.00% | [image] | [image] | [image]
10 | 70.00% | [image] | [image] | [image]

object_id=50

Number of Points | Accuracy | Original GT | Original Pred | Transformed Pred
10000 | 94.29% | [image] | [image] | [image]
1000 | 92.00% | [image] | [image] | [image]
100 | 91.00% | [image] | [image] | [image]
10 | 90.00% | [image] | [image] | [image]

object_id=100

Number of Points | Accuracy | Original GT | Original Pred | Transformed Pred
10000 | 94.29% | [image] | [image] | [image]
1000 | 94.60% | [image] | [image] | [image]
100 | 72.00% | [image] | [image] | [image]
10 | 60.00% | [image] | [image] | [image]

Q4. Bonus Question - Locality (20 points)¶

Classification¶

Best Accuracy Score¶

  • PointNet: 98.22%
  • PointNet++: 98.95%

Color mapping for the predicted class:
  • Red = Chair
  • Blue = Lamp
  • Green = Vase
GT Class | PointNet Prediction | PointNet++ Prediction
Vase | [image] | [image]
Vase | [image] | [image]
Lamp | [image] | [image]
Lamp | [image] | [image]

Rotation

Network Type | Rotation (°) | Accuracy | Ground Truth Chair (RED) | Ground Truth Lamp (BLUE) | Ground Truth Vase (GREEN) | Observation
PointNet | 45 | 52.36% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Vase; GT: Vase, Pred: Lamp
PointNet++ | 45 | 80.06% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
PointNet | 90 | 22.25% | [image] | [image] | [image] | GT: Chair, Pred: Vase; GT: Lamp, Pred: Vase; GT: Vase, Pred: Lamp
PointNet++ | 90 | 50.05% | [image] | [image] | [image] | GT: Chair, Pred: Lamp; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase

Number of Points

Network Type | Number of Points | Accuracy | Ground Truth Chair (RED) | Ground Truth Lamp (BLUE) | Ground Truth Vase (GREEN) | Observation
PointNet | 1000 | 97.59% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase
PointNet++ | 1000 | 91.40% | [image] | [image] | [image] | GT: Chair, Pred: Chair; GT: Lamp, Pred: Lamp; GT: Vase, Pred: Vase

Interpretation:

PointNet++ demonstrates significant improvements over PointNet in several key aspects, while also revealing some trade-offs:

Overall Accuracy: PointNet++ achieves a slightly higher accuracy (98.95% vs 98.22%) on the standard test set, which suggests that hierarchical feature extraction with locality awareness helps capture more discriminative features. The hierarchical sampling and grouping mechanism allows the model to learn multi-scale features, which can better distinguish subtle differences between classes.

Rotation Robustness: PointNet++ performs much better under rotation. At 45° it maintains 80.06% accuracy compared to PointNet's 52.36%, and at 90° it achieves 50.05% versus PointNet's 22.25%. A plausible explanation is PointNet++'s local neighborhood processing: by aggregating features within local regions at multiple scales, the model may learn representations that depend less on absolute orientation. PointNet, in contrast, computes per-point features from absolute coordinates before its global max-pooling, so its global feature changes directly with the point cloud's orientation.
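
The orientation dependence of a PointNet-style global feature can be illustrated with a toy example. This is not the trained model: the weights `W` are hypothetical fixed values, and the network is reduced to one shared per-point layer with ReLU followed by max-pooling. It shows that max-pooling makes the feature independent of point order, but not of rotation.

```python
import math

# Hypothetical fixed weights for a tiny shared per-point layer (3 -> 2 features).
W = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]


def global_feature(points):
    """Shared linear layer + ReLU per point, then max-pool over all points."""
    per_point = [
        [max(0.0, sum(w * v for w, v in zip(row, p))) for row in W]
        for p in points
    ]
    return [max(col) for col in zip(*per_point)]


def rotate_z(points, degrees):
    """Rotate (x, y, z) points counter-clockwise about the z-axis."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


cloud = [(1.0, 0.0, 0.0), (3.0, 0.5, 1.0), (2.0, 0.2, -1.0)]
original = global_feature(cloud)
shuffled = global_feature(list(reversed(cloud)))  # same points, new order
rotated = global_feature(rotate_z(cloud, 90))     # same shape, new orientation

print(original == shuffled)  # point order does not matter
print(original == rotated)   # orientation does
```

A trained network has many more features and layers, but the same mechanism applies: per-point features are functions of absolute coordinates, so a rotated input produces different activations unless the model has seen such rotations during training.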

Point Density Sensitivity: Interestingly, with 1000 points PointNet++ performs noticeably worse than PointNet (91.40% vs 97.59%). This suggests that PointNet++'s hierarchical sampling strategy requires sufficient point density to form local neighborhoods effectively. With fewer points, the farthest point sampling and ball query operations may not capture representative local structures, leading to degraded performance. PointNet's simpler architecture, which directly processes all points, is more robust to moderate reductions in point density.
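
To make the density argument concrete, here is a minimal pure-Python sketch of farthest point sampling and ball query. It is an illustration under simplifying assumptions, not the implementation used for these experiments; the function names, the `start` index, and the toy 1D-like cloud are all hypothetical.

```python
import math


def farthest_point_sampling(points, num_samples, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already chosen."""
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < num_samples:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def ball_query(points, center, radius, max_neighbors):
    """Indices of up to `max_neighbors` points within `radius` of `center`."""
    hits = [i for i, p in enumerate(points) if math.dist(p, center) <= radius]
    return hits[:max_neighbors]


# Toy cloud along a line: a dense version and a sparse version of the same shape.
dense = [(i * 0.1, 0.0, 0.0) for i in range(11)]
sparse = dense[::5]
center = (0.5, 0.0, 0.0)

print(farthest_point_sampling(dense, 3))
print(len(ball_query(dense, center, 0.25, 32)))   # neighbors in the dense cloud
print(len(ball_query(sparse, center, 0.25, 32)))  # neighbors in the sparse cloud
```

With the radius held fixed, the sparse cloud yields far fewer neighbors per centroid, which is one way the local feature extraction could lose information when the input is downsampled.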

Conclusion: PointNet++'s hierarchical architecture with locality provides substantial benefits for rotation robustness in classification and a small gain in overall accuracy, making it more suitable for real-world applications where objects may appear in various orientations. However, it requires sufficient point density to use its hierarchical sampling effectively, which is an important consideration for sparse point cloud scenarios.

Segmentation¶

Best Accuracy Score¶

  • PointNet: 90.24%
  • PointNet++: 91.97%
Network Type | Accuracy | Ground Truth | Prediction
PointNet | 94.29% | [image] | [image]
PointNet++ | 95.61% | [image] | [image]

Rotation

Network Type | Rotation (°) | Accuracy | Ground Truth | Prediction
PointNet | 45 | 75.25% | [image] | [image]
PointNet++ | 45 | 72.86% | [image] | [image]
PointNet | 90 | 40.28% | [image] | [image]
PointNet++ | 90 | 42.16% | [image] | [image]

Number of Points

Network Type | Number of Points | Accuracy | Ground Truth | Prediction
PointNet | 1000 | 95.10% | [image] | [image]
PointNet++ | 1000 | 95.70% | [image] | [image]

Segmentation Rotation Robustness Interpretation:

Unlike classification, PointNet++ segmentation does not show significant improvement over PointNet under large rotations (45°: 72.86% vs 75.25%, 90°: 42.16% vs 40.28%).

Conclusion: The hierarchical architecture that gives PointNet++ an advantage for classification (where a single global representation is needed) does not yield a comparable benefit for segmentation under rotation, where precise per-point spatial alignment is critical. Architectural choices should therefore be evaluated against the specific task and robustness requirements.