Assignment 5 — Q1: Classification

Author: Minghao Xu — Andrew ID: mxu3

Model: PointNet-like classifier (custom lightweight implementation)

Training: 30 epochs, batch size 16, Adam optimizer (lr=0.001)
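For reference, here is a minimal sketch of the kind of PointNet-style classifier described above (shared per-point MLP, order-invariant max pooling, fully connected head). The three-class label set (chair / vase / lamp) matches the failure examples below; the layer widths are assumptions, not the exact implementation:

```python
import torch
import torch.nn as nn

class PointNetClassifier(nn.Module):
    """Shared per-point MLP -> global max pool -> classification head."""
    def __init__(self, num_classes: int = 3):  # chair / vase / lamp
        super().__init__()
        # Shared MLP applied independently to every point, input shape (B, 3, N)
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (B, N, 3) -> (B, 3, N) for Conv1d
        feats = self.point_mlp(points.transpose(1, 2))
        global_feat = feats.max(dim=2).values  # order-invariant pooling over points
        return self.head(global_feat)          # (B, num_classes) logits
```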

Test Accuracy

Best test accuracy: 0.9738 (97.38%) — checkpoint: ./checkpoints/cls/best_model.pt

Sample Predictions

Sample 0 — predicted / true labels shown on the image
Sample 1 — predicted / true labels shown on the image
Sample 2 — predicted / true labels shown on the image

Failure Examples

Chair — FAIL idx=543, true=chair, pred=lamp
Vase — FAIL idx=619, true=vase, pred=lamp
Lamp — FAIL idx=750, true=lamp, pred=vase

Short Analysis

The implemented PointNet-like model reaches high accuracy (~97.4%) after 30 epochs. Visual inspection of the failure examples points to common causes: silhouettes shared across classes (vase vs. lamp), sparse sampling in some views, and intra-class shape variability. Robustness could be improved with data augmentation (rotations, jitter), model ensembling, or architectures with stronger local features (PointNet++ / DGCNN); a minimal augmentation sketch follows.
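A minimal NumPy sketch of the rotation/jitter augmentation suggested above, assuming an (N, 3) point cloud; the jitter scale and clipping range are arbitrary choices, not values used in training:

```python
import numpy as np

def augment_point_cloud(points: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Random z-rotation plus clipped Gaussian jitter for an (N, 3) cloud."""
    theta = rng.uniform(0.0, 2.0 * np.pi)  # random heading about the vertical axis
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]])
    rotated = points @ rot_z.T
    # Small per-point noise; clip so outliers don't distort the shape
    jitter = rng.normal(scale=0.01, size=rotated.shape).clip(-0.05, 0.05)
    return rotated + jitter
```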


Q2: Segmentation

Model: PointNet-style segmentation network (shared MLP + global feature concatenation)

Training: 30 epochs, batch size 8, Adam optimizer (lr=0.001)
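A minimal sketch of the "shared MLP + global feature concatenation" design named above: per-point features are concatenated with a replicated global max-pooled feature before the per-point prediction head. Channel widths and the part-label count are assumptions:

```python
import torch
import torch.nn as nn

class PointNetSeg(nn.Module):
    """Per-point features + replicated global feature -> per-point labels."""
    def __init__(self, num_seg_classes: int = 6):  # assumed part count
        super().__init__()
        self.local_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
        )
        self.global_mlp = nn.Sequential(
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        # 128 local + 1024 global channels after concatenation
        self.seg_head = nn.Sequential(
            nn.Conv1d(128 + 1024, 512, 1), nn.BatchNorm1d(512), nn.ReLU(),
            nn.Conv1d(512, 256, 1), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, num_seg_classes, 1),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        x = points.transpose(1, 2)                  # (B, 3, N)
        local = self.local_mlp(x)                   # (B, 128, N)
        # Max pool to one global descriptor, then broadcast it back to every point
        global_feat = self.global_mlp(local).max(dim=2, keepdim=True).values
        fused = torch.cat([local, global_feat.expand(-1, -1, local.size(2))], dim=1)
        return self.seg_head(fused)                 # (B, num_seg_classes, N) logits
```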

Segmentation Test Accuracy

Overall test accuracy: 0.8830 (88.30%) — checkpoint: ./checkpoints/seg/best_model.pt

Per-object Visualizations

Below are 5 example objects (the first two are low-accuracy failures, the remaining three are high-accuracy examples). For each object we show the ground-truth segmentation and the model prediction (animated GIFs), with per-object accuracy reported under each pair.

object idx=163 — accuracy = 0.3955 (failure)
object idx=351 — accuracy = 0.4298 (failure)
object idx=404 — accuracy = 0.9721
object idx=56 — accuracy = 0.9282
object idx=477 — accuracy = 0.9317

Segmentation Analysis

The segmentation model achieves reasonably high overall point-wise accuracy (~88%). The two low-accuracy examples illustrate common failure modes: heavy class imbalance across parts (small semantic parts are easily missed) and ambiguous local geometry where the model confuses adjacent part labels. The high-accuracy examples show that the model captures both global shape and local part boundaries in many cases.
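The per-object accuracies reported above are plain point-wise label matches. A short sketch of the metric, assuming (B, C, N) logits and (B, N) integer part labels:

```python
import torch

def per_object_accuracy(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Point-wise accuracy per object.

    logits: (B, C, N) raw per-point scores; labels: (B, N) ground-truth part ids.
    Returns a (B,) tensor with one accuracy value per object.
    """
    preds = logits.argmax(dim=1)                 # (B, N) predicted part ids
    return (preds == labels).float().mean(dim=1)
```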


Q3: Robustness Analysis

I ran two experiments to probe robustness on both tasks: (A) varying the number of input points; (B) rotating point clouds around the vertical (z) axis. Below I present compact, side-by-side GIF comparisons arranged in tables so each variant is directly comparable to the original.

Experiment A — Number of Points (table)

Procedure: for each test object we subsampled the point cloud to 10000, 2048, 1024, or 512 points (with a fixed random seed) and measured test accuracy. The table below shows classification and segmentation accuracy, and for segmentation one example object (idx=0) with ground-truth and model-prediction GIFs side by side; a subsampling sketch follows the table.

| Num Points | Classification Acc | Segmentation Acc | Seg GT (idx=0) | Seg Pred (idx=0) |
|-----------:|-------------------:|-----------------:|:--------------:|:----------------:|
| 10000      | 0.9738             | 0.8830           | (GIF)          | (GIF)            |
| 2048       | 0.9685             | 0.8818           | (GIF)          | (GIF)            |
| 1024       | 0.9706             | 0.8791           | (GIF)          | (GIF)            |
| 512        | 0.9601             | 0.8719           | (GIF)          | (GIF)            |
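A minimal sketch of the seeded subsampling step (function name and signature are illustrative); a fixed seed keeps the drawn subset reproducible across runs:

```python
import numpy as np

def subsample(points: np.ndarray, labels: np.ndarray, n: int, seed: int = 0):
    """Draw n points (and their labels) without replacement, reproducibly.

    points: (N, 3) coordinates; labels: (N,) per-point ids; requires n <= N.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(points.shape[0], size=n, replace=False)
    return points[idx], labels[idx]
```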

Interpretation

Reducing point count from 10k down to a few hundred slightly degrades performance for both tasks (numbers shown above). The GIFs illustrate that small local details can be missed at lower sampling densities, while global shape cues are still often preserved.

Experiment B — Rotation around Z (table)

Procedure: we rotate test point clouds around z by 0°, 15°, 30°, 45°, and 60° and measure accuracy. Each row below gives the numeric accuracies and one example object (idx=0) with rotated GT and prediction GIFs side by side; the GIFs are aligned on the same canvas size so rotation-only effects are visible without apparent shape-scaling artifacts. A rotation sketch follows the table.

| Angle | Classification Acc | Segmentation Acc | Seg GT (idx=0) | Seg Pred (idx=0) |
|------:|-------------------:|-----------------:|:--------------:|:----------------:|
| 0°    | 0.9748             | 0.8826           | (GIF)          | (GIF)            |
| 15°   | 0.9622             | 0.8701           | (GIF)          | (GIF)            |
| 30°   | 0.9202             | 0.7844           | (GIF)          | (GIF)            |
| 45°   | 0.7261             | 0.7086           | (GIF)          | (GIF)            |
| 60°   | 0.6211             | 0.6370           | (GIF)          | (GIF)            |
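A sketch of the z-rotation applied before evaluation; `evaluate` in the usage comment is a hypothetical stand-in for the test loop, not a function from the codebase:

```python
import numpy as np

def rotate_z(points: np.ndarray, degrees: float) -> np.ndarray:
    """Rotate an (N, 3) or (B, N, 3) point cloud about the z axis."""
    t = np.radians(degrees)
    c, s = np.cos(t), np.sin(t)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T  # broadcasting handles the optional batch dimension

# Sweep used in the table above:
# for angle in [0, 15, 30, 45, 60]:
#     acc = evaluate(model, rotate_z(test_points, angle), test_labels)
```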

Rotation notes

The GIFs above are rendered on fixed-size canvases so rotation shows orientation changes only; axis limits are kept consistent across variants to avoid perceived shape scaling. The numeric accuracy drop is reported alongside each row.
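A minimal matplotlib sketch of the fixed-canvas rendering described above; the figure size and axis limit are arbitrary choices:

```python
import matplotlib.pyplot as plt

def render_frame(points, colors, path, lim=1.0):
    """Scatter one view on a fixed canvas so rotations don't look like scaling."""
    fig = plt.figure(figsize=(4, 4))
    ax = fig.add_subplot(projection="3d")
    ax.scatter(points[:, 0], points[:, 1], points[:, 2], c=colors, s=2)
    # Identical limits for every variant: orientation changes, apparent size doesn't
    ax.set_xlim(-lim, lim)
    ax.set_ylim(-lim, lim)
    ax.set_zlim(-lim, lim)
    ax.set_axis_off()
    fig.savefig(path, dpi=100)
    plt.close(fig)
```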

Q4: Locality (DGCNN)

Model: DGCNN-style EdgeConv (local neighborhood feature aggregation via KNN)

Training (classification): 40 epochs, batch size 16, subsampled to 2048 points for graph construction. Best test accuracy: 0.9832 — checkpoint: ./checkpoints/cls/dgcnn/best_model.pt

Training (segmentation): 30 epochs, batch size 8, subsampled to 2048 points. Segmentation test accuracy (dgcnn): 0.9135 — checkpoint: ./checkpoints/seg/dgcnn/best_model.pt
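A sketch of the EdgeConv building block named above: a kNN graph over the points, an MLP over concatenated center and relative neighbor features, and a max over neighbors. The neighbor count k, channel widths, and the use of raw coordinates for graph construction are assumptions:

```python
import torch
import torch.nn as nn

def knn(x: torch.Tensor, k: int) -> torch.Tensor:
    """Indices of the k nearest neighbors per point. x: (B, N, 3)."""
    dists = torch.cdist(x, x)  # (B, N, N) pairwise distances
    # Take k+1 smallest and drop the self-match (assumes no duplicate points)
    return dists.topk(k + 1, largest=False).indices[..., 1:]

class EdgeConv(nn.Module):
    """DGCNN-style edge convolution: MLP over [x_i, x_j - x_i], max over neighbors."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 20):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_ch, out_ch, 1), nn.BatchNorm2d(out_ch), nn.ReLU(),
        )

    def forward(self, feats: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # feats: (B, N, C) per-point features; coords: (B, N, 3) for the graph
        B, N, C = feats.shape
        idx = knn(coords, self.k)                               # (B, N, k)
        neighbors = torch.gather(
            feats.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C))          # (B, N, k, C)
        center = feats.unsqueeze(2).expand_as(neighbors)
        edge = torch.cat([center, neighbors - center], dim=-1)  # (B, N, k, 2C)
        edge = edge.permute(0, 3, 1, 2)                         # (B, 2C, N, k)
        return self.mlp(edge).max(dim=-1).values                # (B, out_ch, N)
```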

Segmentation Visualizations (DGCNN)

Below are five example objects from the test set. Each row shows the ground-truth animated view (top) and the DGCNN prediction (bottom).

object (DGCNN) — per-object accuracy reported during evaluation
object (DGCNN) — low-accuracy example
object (DGCNN) — high-accuracy example
object (DGCNN)
object (DGCNN)

Q4 Analysis

DGCNN improves both tasks: classification rose from ~97.38% to ~98.32%, and segmentation improved from ~88.30% to ~91.35% (point-wise). Visual comparisons show DGCNN is better at resolving local part boundaries in many cases, thanks to EdgeConv-style local aggregation. However, DGCNN requires subsampling (2048 points) to keep graph construction feasible; this is a trade-off between locality and memory/time.