16-825: Learning for 3D Vision — Assignment 5

Manyung Emma Hon · mehon · Fall 2025

Q1. Classification Model

Overall Test Accuracy: 98.32%

Per-Class Performance

Class | Accuracy
Chair | 99.84%
Vase | 91.18%
Lamp | 97.44%
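
For reference, per-class accuracy restricts the prediction/label comparison to each class; a minimal sketch of the metric (tensor names and class ordering are my assumptions, not the assignment's starter code):

```python
import torch

def per_class_accuracy(preds, labels, class_names=("chair", "vase", "lamp")):
    # preds, labels: 1D integer tensors of predicted / ground-truth class ids
    results = {}
    for cls_idx, name in enumerate(class_names):
        mask = labels == cls_idx                       # objects of this class
        correct = (preds[mask] == labels[mask]).sum()  # correctly classified
        results[name] = correct.item() / mask.sum().item()
    return results
```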

Correct Predictions

[Figure: chair correctly predicted as chair]
[Figure: vase correctly predicted as vase]
[Figure: lamp correctly predicted as lamp]

Failure Cases

Visualization | True Class | Predicted Class | Analysis
[figure] | Chair | Lamp | This chair has a vertical structure with thin legs and a tall back, which may resemble lamp-like features to the model.
[figure] | Vase | Lamp | The narrow shape of this vase creates ambiguity with lamp structures, particularly given the vertical symmetry.
[figure] | Lamp | Vase | This lamp has a wide base that shares geometric similarities with vase shapes, confusing the classifier.

Interpretation

The PointNet classification model achieves excellent overall performance (98.32%), with particularly strong results on chairs (99.84%). The model struggles most with vases (91.18%), likely because of their high shape variability and geometric similarity to lamps.

Common failure modes include:

- Chairs with thin legs and tall backs that present lamp-like vertical structures
- Narrow, vertically symmetric vases that are confused with lamps
- Wide-based lamps that share geometry with vases

Q2. Segmentation Model

Overall Test Accuracy: 90.25%
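
Segmentation is scored point-wise rather than object-wise, so each object's accuracy is the fraction of its points labeled correctly; a minimal sketch of this metric, assuming integer part labels per point:

```python
import torch

def seg_accuracy(pred, gt):
    # pred, gt: integer part labels, shape (num_points,) for one object
    # or (num_objects, num_points) for the whole test set; accuracy is
    # simply the fraction of points whose predicted label matches.
    return (pred == gt).float().mean().item()
```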

Challenging Cases

[Figures: Object 26, ground truth vs. prediction]
Accuracy: 44.40%

Analysis: This chair has complex, difficult-to-distinguish structures, and the model struggles to maintain consistent part boundaries.

[Figures: Object 351, ground truth vs. prediction]
Accuracy: 45.67%

Analysis: Ambiguous part boundaries and uneven point density contribute to segmentation errors.

High-Quality Segmentations

[Figures: Object 397, ground truth vs. prediction] Accuracy: 99.37%
[Figures: Object 600, ground truth vs. prediction] Accuracy: 99.43%
[Figures: Object 471, ground truth vs. prediction] Accuracy: 99.62%

Interpretation

The segmentation model achieves strong overall performance (90.25%), with a clear distinction between easy and challenging cases. Best predictions (99%+ accuracy) occur on chairs with clear geometric boundaries between parts. The model successfully segments well-separated components.

Challenging predictions (44-46% accuracy) reveal common difficulties:

- Complex structures whose part boundaries are hard to distinguish, producing inconsistent labels across parts
- Ambiguous part boundaries combined with uneven point density

Q3. Robustness Analysis

Experiment 1: Rotation Robustness

Procedure: Rotated point clouds around the z-axis by varying angles (0°, 45°, 90°) and evaluated segmentation accuracy.
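
The rotation itself is a standard z-axis rotation matrix applied to every point; a minimal sketch of the perturbation (function and variable names are mine):

```python
import numpy as np

def rotate_z(points, angle_deg):
    # points: (N, 3) array; rotate every point about the z-axis by angle_deg
    theta = np.deg2rad(angle_deg)
    rot = np.array([
        [np.cos(theta), -np.sin(theta), 0.0],
        [np.sin(theta),  np.cos(theta), 0.0],
        [0.0,            0.0,           1.0],
    ])
    return points @ rot.T  # (R @ p)^T for each row p, i.e. p @ R^T

# e.g. evaluate the trained model on rotate_z(test_points, 45)
```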

Results - Segmentation Task

Rotation Angle | Accuracy | Change from 0°
0° (Baseline) | 90.25% | -
45° | 63.19% | -27.06%
90° | 38.12% | -52.13%

Visual Comparison

[Figure: 0° rotation, Acc: 90.25%]
[Figure: 45° rotation, Acc: 63.19%]
[Figure: 90° rotation, Acc: 38.12%]

Interpretation

The model shows significant degradation with rotation, losing over 50 percentage points of accuracy at 90° rotation. This indicates the model has learned orientation-dependent features rather than rotation-invariant representations.

Potential improvements: adding data augmentation with random rotations during training, or using rotation-invariant features (e.g., local reference frames, DGCNN's edge convolutions).

Experiment 2: Point Density Robustness

Procedure: Varied the number of points per object (500, 2,500, 10,000) to test how point cloud sparsity affects segmentation performance.
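
Sparser inputs can be produced by randomly subsampling each test cloud; a minimal sketch, assuming sampling without replacement:

```python
import numpy as np

def subsample(points, num_points, seed=0):
    # points: (N, 3) array with N >= num_points; keep a random subset
    rng = np.random.default_rng(seed)
    idx = rng.choice(points.shape[0], size=num_points, replace=False)
    return points[idx]

# e.g. evaluate on subsample(test_points, 500) vs. the full 10,000 points
```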

Results - Segmentation Task

Number of Points | Accuracy | Change from 10,000
10,000 (Baseline) | 90.25% | -
2,500 | 90.22% | -0.03% (negligible)
500 | 89.21% | -1.04%

Visual Comparison

[Figure: 500 points, Acc: 89.21%]
[Figure: 2,500 points, Acc: 90.22%]
[Figure: 10,000 points, Acc: 90.25%]

Interpretation

Key Findings:

- Accuracy is essentially unchanged when reducing from 10,000 to 2,500 points (-0.03%)
- Even at 500 points (a 20x reduction), accuracy drops by only about 1 percentage point
- Segmentation performance is therefore highly robust to point cloud sparsity

Contrast with Experiment 1: Unlike rotation (a 52 percentage point accuracy drop at 90°), point density reduction has minimal impact. This highlights that PointNet's learned features are far more dependent on orientation than on dense sampling.

Q4. Bonus Question - Locality with DGCNN

Classification Results: DGCNN vs PointNet

Model | Overall Accuracy | Chair | Vase | Lamp
PointNet (Q1) | 98.32% | 99.84% | 91.18% | 97.44%
DGCNN (Q4) | 97.59% | 99.84% | 82.35% | 98.29%

Segmentation Results: DGCNN vs PointNet

Model | Overall Accuracy | Best Case | Worst Case
PointNet (Q2) | 90.25% | 99.62% | 44.40%
DGCNN (Q4) | 91.43% | 100.00% | 41.02%

Visual Comparisons

Segmentation Examples

High-quality case: Both models perform well, but DGCNN achieves near-perfect segmentation. Note that I had to reduce the number of points for DGCNN due to memory constraints.

[Figures: Object 397: PointNet 99.37% vs. DGCNN 99.80%]

Challenging case: Both models struggle with complex geometry.

[Figures: Object 351: PointNet 45.67% vs. DGCNN 41.02%]

Classification Examples

Correct predictions: DGCNN successfully classifies objects across all categories.

[Figures: DGCNN correct predictions: chair, vase, lamp]

Analysis and Interpretation

Classification Performance

Unexpectedly, DGCNN performs slightly worse than PointNet (97.59% vs. 98.32%). The gap is driven almost entirely by vases (91.18% → 82.35%), while chair accuracy is identical and lamp accuracy slightly improves.

Segmentation Performance

DGCNN shows improvement: a 1.18 percentage point gain (90.25% → 91.43%) demonstrates the value of local geometric features for part-level tasks.
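
The gain is plausibly explained by DGCNN's EdgeConv, which builds a k-nearest-neighbor graph over the points and learns features from each point's local neighborhood instead of treating points independently. A simplified sketch of one EdgeConv layer (hyperparameters and layer details are illustrative, not the exact architecture used here):

```python
import torch
import torch.nn as nn

class EdgeConv(nn.Module):
    def __init__(self, in_dim, out_dim, k=20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_dim, out_dim, kernel_size=1),
            nn.BatchNorm2d(out_dim),
            nn.LeakyReLU(0.2),
        )

    def forward(self, x):
        # x: (B, N, C) point features
        B, N, C = x.shape
        dist = torch.cdist(x, x)                        # (B, N, N) pairwise distances
        idx = dist.topk(self.k, largest=False).indices  # (B, N, k) nearest neighbors
                                                        # (includes the point itself)
        neighbors = torch.gather(
            x.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C)
        )                                               # (B, N, k, C)
        center = x.unsqueeze(2).expand_as(neighbors)
        # Edge feature: concatenate the center point with relative offsets
        edge = torch.cat([center, neighbors - center], dim=-1)  # (B, N, k, 2C)
        edge = edge.permute(0, 3, 1, 2)                 # (B, 2C, N, k) for Conv2d
        # Max-pool over the k neighbors to get one feature per point
        return self.mlp(edge).max(dim=-1).values.permute(0, 2, 1)  # (B, N, out_dim)
```

The (N, N) distance matrix is also why memory became a constraint in this experiment: kNN graph construction scales quadratically with the number of points, which is what forced the reduced point count for DGCNN noted above.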