Assignment 5

PointNet for Point Cloud Classification and Segmentation

Q1: Classification Model (40 points)

Test Accuracy: 97.90%

Random Test Point Clouds with Predictions

Below are visualizations of random test point clouds with their predicted classes.
Classification Sample 1

Sample 1 - Predicted: Chair, Actual: Chair ✓

Classification Sample 2

Sample 2 - Predicted: Vase, Actual: Vase ✓

Classification Sample 3

Sample 3 - Predicted: Lamp, Actual: Lamp ✓

Classification Sample 4

Sample 4 - Predicted: Vase, Actual: Vase ✓

Overall Performance: The model achieves 97.90% test accuracy, demonstrating strong classification capability across all three object categories (chairs, vases, and lamps). The successful predictions shown above represent typical cases where the model correctly identifies the distinctive features of each class.

Failure Cases Analysis

Chair Failure (Predicted: Vase)

Chair Failure

Predicted: Vase ✗, Actual: Chair

Interpretation: The model misclassified this chair as a vase, likely due to the chair having a rounded or cylindrical backrest that resembles the profile of a vase. The chair may have an unconventional design with smooth curved surfaces and minimal angular features typical of chairs. This suggests the model may be relying heavily on overall silhouette and curvature patterns rather than discriminative structural details like distinct legs, armrests, or a flat seat surface. The failure highlights that when geometric ambiguity exists, PointNet's global feature aggregation may not capture sufficient local structural context to distinguish between similar overall shapes.

Vase Failure (Predicted: Lamp)

Vase Failure

Predicted: Lamp ✗, Actual: Vase

Interpretation: This vase was incorrectly classified as a lamp. The elongated vertical shape with a wider base may have resembled a lamp stand, particularly if the vase has decorative elements at the top that could be confused with a lampshade or has a tall narrow neck similar to a lamp post. Both vases and lamps can share similar vertical, axially-symmetric structures, making them challenging to distinguish based solely on global geometric features. This error indicates that the model may need more discriminative features to differentiate between objects with similar aspect ratios and vertical alignment but different functional semantics.

Lamp Failure (Predicted: Vase)

Lamp Failure

Predicted: Vase ✗, Actual: Lamp

Interpretation: The model confused this lamp with a vase, possibly because the lamp has a simple columnar design without a prominent lampshade, or the lampshade's shape closely resembles a vase's opening or body. Minimalist lamp designs with smooth surfaces and symmetric profiles can exhibit geometric properties nearly identical to decorative vases. This misclassification reveals the model's limitation in handling objects with ambiguous geometric features, where functional context cannot be inferred from shape alone. The error suggests that pure geometry-based classification may be insufficient for objects with overlapping form factors.
Summary of Failure Patterns: The failure cases reveal a consistent pattern: the model struggles most with objects that share geometric similarity across categories, particularly smooth, curved, or axially-symmetric shapes. The confusion between vases and lamps, and between chairs and vases, indicates that PointNet's global max-pooling captures high-level shape descriptors but misses the fine-grained structural details that would provide class-specific discriminative power. These errors account for the ~2% of test samples where the model fails, suggesting that data augmentation with rotation and scaling, or architectural improvements incorporating local context, could improve generalization to geometrically ambiguous cases.
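The max-pooling bottleneck behind these failures can be sketched in a few lines. This is a purely illustrative toy (random weights, not the trained assignment model): every point passes through the same feature map, then a max over the point dimension collapses the cloud into one global descriptor, which is exactly where fine-grained local structure can be lost.

```python
import numpy as np

rng = np.random.default_rng(0)

def point_features(points, W, b):
    """Shared per-point map (here a single linear layer + ReLU): (N, 3) -> (N, D)."""
    return np.maximum(points @ W + b, 0.0)

def global_feature(points, W, b):
    """Symmetric max-pool over the point dimension -> one global shape descriptor."""
    return point_features(points, W, b).max(axis=0)

# Illustrative random weights and a random cloud (not the trained model).
W = rng.standard_normal((3, 64))
b = rng.standard_normal(64)
cloud = rng.standard_normal((1000, 3))

g = global_feature(cloud, W, b)
g_shuffled = global_feature(cloud[rng.permutation(len(cloud))], W, b)
assert np.allclose(g, g_shuffled)  # max-pooling ignores point order entirely
```

The same order-invariance that makes PointNet robust to point permutations also means only one "winning" point per feature channel influences the descriptor, so local structural detail around the other points never reaches the classifier.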

Q2: Segmentation Model (40 points)

Test Accuracy: 90.40%

Segmentation Results

Below are segmentation results for 7 objects, including 2 bad predictions. Each visualization shows the ground truth segmentation (left) compared with the model's prediction (right); different colors represent different chair parts.

Object 0 - Good Segmentation

Object 0 Ground Truth

Ground Truth

Object 0 Prediction

Prediction

Analysis: Highly accurate segmentation with clear part boundaries. The model successfully identifies all major chair components including the seat, backrest, and legs. The part boundaries are well-defined, and there is minimal confusion between adjacent parts. This represents an ideal case where the chair geometry is canonical and part transitions are geometrically distinct.

Object 1 - Good Segmentation

Object 1 Ground Truth

Ground Truth

Object 1 Prediction

Prediction

Analysis: Accurate part segmentation with minor boundary errors. The model correctly segments the major structural components, though there may be slight misclassifications at part junctions where geometric features blend together. These boundary effects are common in point cloud segmentation due to the discrete nature of point sampling and local geometric ambiguity.

Object 2 - Good Segmentation

Object 2 Ground Truth

Ground Truth

Object 2 Prediction

Prediction

Analysis: Well-segmented chair parts with clear distinction between seat, back, and legs. The model demonstrates its ability to leverage both local geometric features (from the shared point feature extraction) and global context to produce the correct part labels. The consistency across symmetric parts (e.g., left and right legs) indicates the model has learned generalizable part features.

Object 3 - Good Segmentation

Object 3 Ground Truth

Ground Truth

Object 3 Prediction

Prediction

Analysis: Consistent segmentation with good part recognition across all chair components. The model maintains coherent labeling even for thin structures like chair legs, which can be challenging due to sparse point sampling. This demonstrates the effectiveness of the point-wise feature learning approach combined with global shape context.

Object 4 - Bad Prediction

Object 4 Ground Truth

Ground Truth

Object 4 Prediction

Prediction

Analysis: Poor segmentation with significant errors. The model struggles to identify part boundaries correctly, particularly in regions where parts connect or transition (e.g., the seat-to-leg junction or the seat-to-back connection). This likely stems from ambiguous geometry, where different parts share similar local normal directions, or from insufficient local context in the point neighborhood to disambiguate them. It points to PointNet's limitation in capturing fine-grained local geometric relationships without explicit hierarchical feature aggregation.

Object 5 - Moderate Quality

Object 5 Ground Truth

Ground Truth

Object 5 Prediction

Prediction

Analysis: Reasonable segmentation quality with some minor classification errors at part boundaries. While major parts are correctly identified, there are scattered misclassifications that likely occur in transition regions or areas with geometric complexity. These errors are tolerable and represent the typical performance level of the model on moderately challenging examples.

Object 9 - Bad Prediction

Object 9 Ground Truth

Ground Truth

Object 9 Prediction

Prediction

Analysis: Significant segmentation errors with multiple misclassified parts. The model fails to capture the correct part structure, with entire structural components potentially mislabeled (e.g., backrest points classified as seat, or leg points confused with armrests). This failure could be caused by: (1) unusual or complex chair geometry that is underrepresented in the training data, (2) occlusion-like effects in the point cloud sampling where certain parts have very sparse point coverage, or (3) geometric ambiguity where parts merge smoothly without clear boundaries. The widespread nature of the errors (not just boundary confusion) suggests the global context feature may be misleading for this shape, causing the model to apply an incorrect part labeling scheme. This highlights the fundamental challenge in part segmentation: balancing local geometric detail with global structural understanding, which the basic PointNet architecture struggles with for atypical or geometrically complex shapes.

Q3: Robustness Analysis (20 points)

Experiment 1: Rotation Robustness

Procedure: I tested the model's robustness to rotation by rotating input point clouds around the X-axis at various angles (0°, 15°, 30°, 45°, 60°, 75°, 90°). Each test point cloud was transformed using a rotation matrix before being fed to the model. This tests whether the model has learned rotation-invariant features or if it relies on canonical orientations seen during training.
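The transform described above can be sketched as follows (the helper name and the random test cloud are my own, shown only to make the rotation explicit): each cloud is multiplied by a standard X-axis rotation matrix before inference.

```python
import numpy as np

def rotate_x(points, angle_deg):
    """Rotate an (N, 3) point cloud about the X-axis by angle_deg degrees."""
    t = np.deg2rad(angle_deg)
    R = np.array([[1.0, 0.0,        0.0],
                  [0.0, np.cos(t), -np.sin(t)],
                  [0.0, np.sin(t),  np.cos(t)]])
    return points @ R.T  # points are row vectors, so multiply by R transpose

# Sweep the same angles used in the experiment.
angles = [0, 15, 30, 45, 60, 75, 90]
cloud = np.random.default_rng(1).standard_normal((1000, 3))
rotated_clouds = {a: rotate_x(cloud, a) for a in angles}
```

Since rotation is rigid, it preserves all pairwise distances and point norms; any accuracy drop therefore reflects the model's dependence on orientation, not any loss of geometric information.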

Classification Task Results

Baseline Test Accuracy (0°): 97.90%
Test Accuracy at 90°: 27.70%
Comparison with Q1: 70.20% accuracy drop at 90° rotation
Detailed Results:
Rotation   Accuracy   Change
0°         97.90%     +0.00%
15°        94.54%     -3.36%
30°        82.79%     -15.11%
45°        54.88%     -43.02%
60°        33.26%     -64.64%
75°        28.12%     -69.78%
90°        27.70%     -70.20%

Classification Visualizations

Comparison of classification performance at 0° (baseline) and 90° rotation:
Q1 Original 0°

Q1 Original - 0° (97.90%)

90° Rotated

Rotated 90° (27.70%)

Segmentation Task Results

Baseline Test Accuracy (0°): 90.40%
Test Accuracy at 90°: 24.13%
Comparison with Q2: 66.27% accuracy drop at 90° rotation
Detailed Results:
Rotation   Accuracy   Change
0°         90.40%     +0.00%
15°        83.00%     -7.40%
30°        72.62%     -17.78%
45°        63.99%     -26.41%
60°        45.00%     -45.40%
75°        30.84%     -59.56%
90°        24.13%     -66.27%

Segmentation Visualizations - Baseline (0°)

Q2 Original GT

Q2 Original - Ground Truth

Q2 Original Prediction

Q2 Original - Prediction (90.40%)

Segmentation Visualizations - 0° Rotation

0° GT

0° - Ground Truth

0° Prediction

0° - Prediction (90.40%)

Segmentation Visualizations - 45° Rotation

45° GT

45° - Ground Truth

45° Prediction

45° - Prediction (63.99%)

Segmentation Visualizations - 90° Rotation

90° GT

90° - Ground Truth

90° Prediction

90° - Prediction (24.13%)

Interpretation: Both models show HIGH sensitivity to rotation, with severe accuracy degradation at larger rotation angles. Classification accuracy drops from 97.90% to 27.70% (70.20% drop), while segmentation drops from 90.40% to 24.13% (66.27% drop). Both tasks exhibit similar vulnerability to rotation, indicating that neither the classification nor segmentation network has learned rotation-invariant features. Even small rotations cause noticeable degradation (15°: -3.36% for classification, -7.40% for segmentation), with performance collapsing beyond 45°. This reveals that the models have memorized canonical orientations from training rather than learning geometric properties invariant to rotation. The similar degradation patterns across both tasks suggest this is a fundamental limitation of the basic PointNet architecture without T-Net or data augmentation. The visualizations clearly show how the segmentation quality deteriorates as rotation increases, with part boundaries becoming increasingly confused at 45° and nearly random at 90°.

Experiment 2: Point Density Robustness

Procedure: I tested the model's robustness to varying point density by randomly sampling different numbers of points from each object (100, 500, 1000, 2500, 5000, 7500, 10000 points). This evaluates whether the model can maintain performance with sparse point clouds and how much geometric information is truly needed for accurate classification.
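The sampling step described above amounts to uniform random subsampling without replacement (the helper name and the random placeholder cloud are my own):

```python
import numpy as np

def subsample(points, n, rng):
    """Draw n points uniformly without replacement from an (N, 3) cloud."""
    idx = rng.choice(points.shape[0], size=n, replace=False)
    return points[idx]

rng = np.random.default_rng(0)
cloud = rng.standard_normal((10000, 3))  # placeholder for a test object

# Sweep the same densities used in the experiment.
densities = [100, 500, 1000, 2500, 5000, 7500, 10000]
sparse_clouds = {n: subsample(cloud, n, rng) for n in densities}
```

Sampling without replacement guarantees each sparse cloud is a true subset of the original, so the experiment isolates the effect of density alone.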

Classification Task Results

Baseline Test Accuracy (10000 points): 97.90%
Test Accuracy at 100 points: 92.76%
Comparison with Q1: only a 5.14% accuracy drop with 100x fewer points
Detailed Results:
# Points   Accuracy   Change
100        92.76%     -5.14%
500        96.85%     -1.05%
1000       97.80%     -0.10%
2500       98.01%     +0.11%
5000       98.01%     +0.11%
7500       97.80%     -0.10%
10000      97.90%     +0.00%

Classification Visualizations

Q1 Original

Q1 Original (10000 points - 97.90%)

100 Points

100 Points (92.76%)

Segmentation Task Results

Baseline Test Accuracy (10000 points): 90.40%
Test Accuracy at 100 points: 83.92%
Comparison with Q2: only a 6.48% accuracy drop with 100x fewer points
Detailed Results:
# Points   Accuracy   Change
100        83.92%     -6.48%
500        88.81%     -1.59%
1000       89.83%     -0.57%
2500       90.31%     -0.09%
5000       90.37%     -0.03%
7500       90.43%     +0.03%
10000      90.40%     +0.00%

Segmentation Visualizations - 10000 Points (Baseline)

10000 Points GT

10000 Points - Ground Truth

10000 Points Prediction

10000 Points - Prediction (90.40%)

Segmentation Visualizations - 5000 Points

5000 Points GT

5000 Points - Ground Truth

5000 Points Prediction

5000 Points - Prediction (90.37%)

Segmentation Visualizations - 1000 Points

1000 Points GT

1000 Points - Ground Truth

1000 Points Prediction

1000 Points - Prediction (89.83%)

Segmentation Visualizations - 100 Points

100 Points GT

100 Points - Ground Truth

100 Points Prediction

100 Points - Prediction (83.92%)

Interpretation: Both models demonstrate excellent robustness to reduced point density, in stark contrast to their rotation sensitivity. Classification maintains 92.76% accuracy with just 100 points (5.14% drop), while segmentation achieves 83.92% (6.48% drop). The slightly larger degradation in segmentation is expected since it requires more local geometric detail for per-point labeling. Performance plateaus around 500 points for classification (96.85%) and 1000 points for segmentation (89.83%), showing that PointNet efficiently extracts salient geometric features without requiring dense sampling. This robustness stems from the symmetric max-pooling aggregation function, which is invariant to point set size and focuses on the most discriminative features. The visualizations demonstrate that even with dramatically reduced point density (100 points vs 10000), the model maintains coherent segmentation with only minor quality degradation at part boundaries. This confirms that PointNet successfully learns compact shape representations that generalize across different sampling densities, making it practical for real-world applications with varying sensor quality and point cloud resolution.
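The max-pooling argument above can be made concrete with a toy check (random features, not the trained model): the global descriptor of a subsampled cloud is bounded elementwise by the full cloud's descriptor, and a coordinate only changes when the "critical point" that achieved that channel's max happens to be dropped.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 64))

# Per-point features for a random 10000-point cloud (illustrative weights).
feats = np.maximum(rng.standard_normal((10000, 3)) @ W, 0.0)

g_full = feats.max(axis=0)                        # global feature, all points
idx = rng.choice(10000, size=100, replace=False)  # 100x fewer points
g_sparse = feats[idx].max(axis=0)

# A subset's max can never exceed the full set's max in any channel; most
# channels stay close because max-pooling depends only on a few critical points.
assert np.all(g_sparse <= g_full)
```

This is why classification degrades so gracefully with density: the global feature is determined by a small set of critical points, and even a 100-point sample is likely to land near most of them.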