Abhishek Mathur

Andrew ID: armathur

Assignment 5: PointNet Classification and Segmentation

Q1. Classification Model

Test Accuracy

Test Accuracy: 97.80%

Visualizations

Random test point cloud samples with predictions:

Sample 1

Ground Truth: Chair
Predicted: Chair

Sample 2

Ground Truth: Chair
Predicted: Chair

Sample 3

Ground Truth: Chair
Predicted: Chair

Sample 4

Ground Truth: Chair
Predicted: Chair

Sample 5

Ground Truth: Chair
Predicted: Chair

Failure Cases

Here are some examples from the rotation experiment (Experiment 1 in Q3) where the model misclassifies rotated chairs:

Chair → Vase (45° rotation)

Ground Truth: Chair
Predicted: Vase

Interpretation: At 45° rotation, the model misclassifies this chair as a vase. The vertical structure of the rotated chair probably resembles a vase to the network since it mostly saw upright chairs during training.

Chair → Vase (90° rotation)

Ground Truth: Chair
Predicted: Vase

Interpretation: Rotating a chair 90° makes its silhouette look roughly cylindrical to the model, hence the vase prediction. This failure occurs frequently at this angle.

Chair → Lamp (180° rotation)

Ground Truth: Chair
Predicted: Lamp

Interpretation: Flipping a chair upside down leads to lamp predictions. The network evidently learned orientation-specific features rather than rotation-invariant 3D geometry.

Q2. Segmentation Model

Test Accuracy

Overall Test Accuracy: 90.22%

Segmentation Results

Visualizations of seven test objects alongside their corresponding ground-truth segmentations:

Object 0 (Good Prediction)

Ground Truth

Prediction

Object 1 (Good Prediction)

Ground Truth

Prediction

Object 2 (Good Prediction)

Ground Truth

Prediction

Object 3 (Bad Prediction)

Ground Truth

Prediction

Object 4 (Bad Prediction)

Ground Truth

Prediction

Object 5

Ground Truth

Prediction

Object 6

Ground Truth

Prediction

Interpretation: The segmentation model identifies the main chair parts (back, seat, legs, armrests) reliably. It works well on simple chairs but struggles with more complex or unusual designs. Errors concentrate at the boundaries where components meet, such as where the legs connect to the seat. A sketch of how the per-point accuracy above can be computed follows.
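
A minimal sketch of the per-point accuracy metric, assuming the model outputs per-point class logits; `seg_model`, `points`, and `labels` are placeholder names, not the actual variables from my pipeline:

```python
import torch

@torch.no_grad()
def seg_accuracy(seg_model, points: torch.Tensor, labels: torch.Tensor) -> float:
    """points: (B, N, 3) clouds; labels: (B, N) ground-truth part indices."""
    logits = seg_model(points)      # assumed output shape: (B, N, num_parts)
    preds = logits.argmax(dim=-1)   # per-point part prediction, (B, N)
    return (preds == labels).float().mean().item()
```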

Q3. Robustness Analysis

Experiment 1: Rotation Invariance Test

Procedure

I rotated all the test point clouds about the z-axis by 0°, 45°, 90°, and 180° and re-evaluated classification accuracy at each angle. This tests whether the network actually learned rotation-invariant features or just memorized what upright objects look like; a minimal sketch of the procedure follows.
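
A minimal sketch of the rotation test, assuming (B, N, 3) point tensors; `model`, `test_points`, and `test_labels` are placeholders for the assignment's actual data loading and evaluation code:

```python
import numpy as np
import torch

def rotate_z(points: torch.Tensor, angle_deg: float) -> torch.Tensor:
    """Rotate a batch of point clouds (B, N, 3) about the z-axis."""
    theta = float(np.deg2rad(angle_deg))
    c, s = np.cos(theta), np.sin(theta)
    R = torch.tensor([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]], dtype=points.dtype, device=points.device)
    return points @ R.T  # row-vector convention: p' = p R^T

@torch.no_grad()
def accuracy_at_angle(model, test_points, test_labels, angle_deg) -> float:
    preds = model(rotate_z(test_points, angle_deg)).argmax(dim=-1)
    return (preds == test_labels).float().mean().item()

# for angle in (0, 45, 90, 180):
#     print(angle, accuracy_at_angle(model, test_points, test_labels, angle))
```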

Classification Results

| Rotation Angle | Test Accuracy | Accuracy Change (pp) |
|----------------|---------------|----------------------|
| 0° (Original)  | 98.11%        | -                    |
| 45°            | 30.33%        | -67.78               |
| 90°            | 23.08%        | -75.03               |
| 180°           | 41.66%        | -56.45               |

Sample Visualizations

0° Rotation
GT: Chair, Pred: Chair ✓

45° Rotation (Rare Success)
GT: Chair, Pred: Chair ✓

45° Rotation (Typical Failure)
GT: Chair, Pred: Vase ✗

90° Rotation
GT: Chair, Pred: Vase ✗

180° Rotation
GT: Chair, Pred: Lamp ✗

Segmentation Results

0° Rotation (Baseline)

Ground Truth

Prediction

Result: Good segmentation accuracy

45° Rotation

Ground Truth

Prediction

Result: Segmentation quality degrades

90° Rotation

Ground Truth

Prediction

Result: Significant accuracy loss

180° Rotation

Ground Truth

Prediction

Result: Poor segmentation performance

Interpretation: The model exhibits a severe lack of rotation invariance. PointNet's max-pooling guarantees invariance to the ordering of points (permutation invariance), not to rotation, so the network is free to memorize the canonical orientations seen during training, and evidently did. At 90° rotation, accuracy falls to 23.08%, below the ~33% chance level of a balanced three-class problem. Rotation augmentation during training, sketched below, would be necessary before real-world deployment.
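
A hedged sketch of the kind of train-time augmentation I mean (not part of my current training code): apply an independent random z-rotation to every cloud in each batch, so no canonical orientation survives.

```python
import torch

def random_z_rotation(points: torch.Tensor) -> torch.Tensor:
    """points: (B, N, 3). Applies an independent random z-rotation per cloud."""
    B = points.shape[0]
    theta = torch.rand(B, device=points.device) * 2 * torch.pi
    c, s = torch.cos(theta), torch.sin(theta)
    zeros, ones = torch.zeros_like(c), torch.ones_like(c)
    # Build a (B, 3, 3) batch of rotation matrices about the z-axis.
    R = torch.stack([torch.stack([c, -s, zeros], dim=-1),
                     torch.stack([s,  c, zeros], dim=-1),
                     torch.stack([zeros, zeros, ones], dim=-1)], dim=-2)
    return torch.bmm(points, R.transpose(1, 2))

# In the training loop, before the forward pass:
# points = random_z_rotation(points)
```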

Experiment 2: Number of Points Robustness

Procedure

I tested the model at different point cloud densities by randomly sampling 1000, 2500, 5000, 7000, and 10000 points from each test object (sketch below). This simulates what would happen with different sensors or sampling budgets in practice.
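
A minimal sketch of the subsampling step, again with placeholder names (`test_points` as a (B, N, 3) tensor, `evaluate` standing in for the evaluation loop):

```python
import torch

def subsample(points: torch.Tensor, num_points: int) -> torch.Tensor:
    """Randomly keep num_points points from each (N, 3) cloud in the batch."""
    B, N, _ = points.shape
    idx = torch.stack([torch.randperm(N, device=points.device)[:num_points]
                       for _ in range(B)])                     # (B, num_points)
    return torch.gather(points, 1, idx.unsqueeze(-1).expand(-1, -1, 3))

# for n in (10000, 7000, 5000, 2500, 1000):
#     evaluate(model, subsample(test_points, n))
```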

Classification Results

| Number of Points | Test Accuracy | Accuracy Change (pp) |
|------------------|---------------|----------------------|
| 10000 (Original) | 97.80%        | -                    |
| 7000             | 98.11%        | +0.31                |
| 5000             | 98.01%        | +0.21                |
| 2500             | 97.90%        | +0.10                |
| 1000             | 97.59%        | -0.21                |

Sample Visualizations

10000 Points
Accuracy: 97.80%

7000 Points
Accuracy: 98.11%

5000 Points
Accuracy: 98.01%

2500 Points
Accuracy: 97.90%

1000 Points
Accuracy: 97.59%

Segmentation Results

1000 Points

Ground Truth

Prediction

Result: Sparse sampling, but the overall structure is maintained

2500 Points

Ground Truth

Prediction

Result: Good balance of density and segmentation quality

5000 Points

Ground Truth

Prediction

Result: Dense representation

7000 Points

Ground Truth

Prediction

Result: Very dense

Interpretation: The model demonstrates strong robustness to varying point cloud density: accuracy stays between 97.59% and 98.11% across a 10x range of point counts. Interestingly, performance improves slightly with fewer points (7000 appears optimal), possibly because subsampling discards some noisy points. This robustness is expected from the architecture: the max-pooled global feature is determined by a small set of critical points, so most points can be removed without changing it (toy illustration below). This makes the model suitable for applications with varying sensor quality or sampling budgets.
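
A toy illustration of the critical-point argument, using random stand-in features rather than real PointNet activations: the (D,)-dimensional global feature after max-pooling is determined by at most D of the N points, so the vast majority of points can be dropped without affecting it.

```python
import torch

feats = torch.randn(10000, 1024)          # stand-in per-point features (N, D)
global_feat = feats.max(dim=0).values     # (D,) global feature after max-pool
critical = feats.argmax(dim=0).unique()   # points that realize some maximum
print(f"{critical.numel()} of 10000 points determine the global feature")
```

Random subsampling can still drop a critical point, but neighboring points then take over with similar feature values, which is consistent with the small accuracy changes in the table above.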