Q1: Classification Model (40 points)
Test Accuracy: 97.90%
Random Test Point Clouds with Predictions
Below are visualizations of random test point clouds with their predicted classes.
Sample 1 - Predicted: Chair, Actual: Chair ✓
Sample 2 - Predicted: Vase, Actual: Vase ✓
Sample 3 - Predicted: Lamp, Actual: Lamp ✓
Sample 4 - Predicted: Vase, Actual: Vase ✓
Overall Performance: The model achieves 97.90% test accuracy, demonstrating strong
classification capability across all three object categories (chairs, vases, and lamps). The successful
predictions shown above represent typical cases where the model correctly identifies distinctive
features of each class.
Failure Case Analysis
Chair Failure (Predicted: Vase)
Predicted: Vase ✗, Actual: Chair
Interpretation: The model misclassified this chair as a vase, likely due to the chair having a
rounded or cylindrical backrest that resembles the profile of a vase. The chair may have an unconventional
design with smooth curved surfaces and few of the angular features typical of chairs. This suggests the model may
be relying heavily on overall silhouette and curvature patterns rather than discriminative structural details
like distinct legs, armrests, or a flat seat surface. The failure highlights that when geometric ambiguity
exists, PointNet's global feature aggregation may not capture sufficient local structural context to distinguish
between similar overall shapes.
Vase Failure (Predicted: Lamp)
Predicted: Lamp ✗, Actual: Vase
Interpretation: This vase was incorrectly classified as a lamp. The elongated vertical shape
with a wider base may have resembled a lamp stand, particularly if the vase has decorative elements at the top
that could be confused with a lampshade, or a tall narrow neck similar to a lamp's post. Both vases and lamps
can share similar vertical, axially symmetric structures, making them challenging to distinguish based solely on
global geometric features. This error indicates that the model may need more discriminative features to
differentiate between objects with similar aspect ratios and vertical alignment but different functional
semantics.
Lamp Failure (Predicted: Vase)
Predicted: Vase ✗, Actual: Lamp
Interpretation: The model confused this lamp with a vase, possibly because the lamp has a
simple columnar design without a prominent lampshade, or the lampshade's shape closely resembles a vase's
opening or body. Minimalist lamp designs with smooth surfaces and symmetric profiles can exhibit geometric
properties nearly identical to decorative vases. This misclassification reveals the model's limitation in
handling objects with ambiguous geometric features, where functional context cannot be inferred from shape alone. The error suggests that pure
geometry-based classification may be insufficient for objects with overlapping form factors.
Summary of Failure Patterns: The failure cases reveal a consistent pattern: the model struggles
most with objects that share geometric similarity across categories, particularly smooth, curved, or
axially symmetric shapes. The confusion between vases and lamps, and between chairs and vases,
indicates that PointNet's global max-pooling captures high-level shape descriptors but misses
fine-grained structural details that would provide class-specific discriminative power. These errors
account for the ~2% of test samples where the model fails, suggesting that data augmentation with
rotation and scaling, or architectural improvements incorporating local context, could improve
generalization to geometrically ambiguous cases.
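To make the augmentation remedy concrete, here is a minimal sketch of per-cloud rotation-and-scaling augmentation that could be applied during training. The choice of the vertical (y) axis and the 0.8-1.25x scale range are assumptions for illustration, not values from these experiments:

```python
import numpy as np

def augment(points: np.ndarray) -> np.ndarray:
    """Apply a random rotation about the vertical (y) axis plus isotropic scaling
    to an (N, 3) point cloud. Axis and scale range are assumed, not prescribed."""
    theta = np.random.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_y = np.array([[ c,  0.0, s  ],
                      [0.0, 1.0, 0.0],
                      [-s,  0.0, c  ]], dtype=points.dtype)
    scale = np.random.uniform(0.8, 1.25)  # assumed scale range
    return (points @ rot_y.T) * scale
```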
Q2: Segmentation Model (40 points)
Test Accuracy: 90.40%
Segmentation Results
Below are segmentation results for 7 objects, including 2 bad predictions. Each visualization shows the ground
truth segmentation (left) compared with the model's prediction (right), where different colors represent
different chair parts.
Object 0 - Good Segmentation
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Accurate segmentation with clear part boundaries. The model successfully identifies
all major chair components including the seat, backrest, and legs. The part boundaries are well-defined, and
there is minimal confusion between adjacent parts. This represents an ideal case where the chair geometry is
canonical and part transitions are geometrically distinct.
Object 1 - Good Segmentation
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Accurate part segmentation with minor boundary errors. The model correctly segments
the major structural components, though there may be slight misclassifications at part junctions where geometric
features blend together. These boundary effects are common in point cloud segmentation due to the discrete
nature of point sampling and local geometric ambiguity.
Object 2 - Good Segmentation
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Well-segmented chair parts with clear distinction between seat, back, and legs. The
model demonstrates its ability to leverage both local geometric features (from the shared point feature
extraction) and global context to produce the correct part labels. The
consistency across symmetric parts (e.g., left and right legs) indicates the model has learned generalizable
part features.
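As a sketch of the feature pathway described above, per-point labels come from concatenating each point's local feature with a globally max-pooled shape feature. This is a simplified PyTorch version, not the exact network used here; the layer widths and the number of parts are assumptions:

```python
import torch
import torch.nn as nn

class SegHeadSketch(nn.Module):
    """Simplified PointNet-style segmentation head: per-point local features
    are concatenated with a max-pooled global feature before classification."""
    def __init__(self, num_parts: int = 6):  # num_parts is an assumption
        super().__init__()
        self.local = nn.Sequential(nn.Conv1d(3, 64, 1), nn.ReLU())
        self.lift = nn.Sequential(nn.Conv1d(64, 1024, 1), nn.ReLU())
        self.classify = nn.Sequential(
            nn.Conv1d(64 + 1024, 256, 1), nn.ReLU(),
            nn.Conv1d(256, num_parts, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, 3, N)
        local = self.local(x)                              # (B, 64, N) per-point features
        global_feat = self.lift(local).max(dim=2).values   # (B, 1024) global shape context
        global_feat = global_feat.unsqueeze(2).expand(-1, -1, local.size(2))
        return self.classify(torch.cat([local, global_feat], dim=1))  # (B, num_parts, N) logits
```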
Object 3 - Good Segmentation
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Consistent segmentation with good part recognition across all chair components. The
model maintains coherent labeling even for thin structures like chair legs, which can be challenging due to
sparse point sampling. This demonstrates the effectiveness of the point-wise feature learning approach combined
with global shape context.
Object 4 - Bad Prediction
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Poor segmentation with significant errors. The model struggles to correctly identify
part boundaries, particularly in regions where parts connect or transition (e.g., the seat-to-leg junction and
seat-to-back connection). This may stem from ambiguous geometric features, where different parts share similar
local normal directions, or from insufficient local context in the point neighborhood to disambiguate parts.
This points to PointNet's limitation in capturing fine-grained local geometric relationships without explicit
hierarchical feature aggregation.
Object 5 - Moderate Quality
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Reasonable segmentation quality with some minor classification errors at part
boundaries. While major parts are correctly identified, there are scattered misclassifications that likely occur
in transition regions or areas with geometric complexity. These errors are tolerable and represent the typical
performance level of the model on moderately challenging examples.
Object 9 - Bad Prediction
[Figure: ground truth (left) vs. prediction (right)]
Analysis: Significant segmentation errors with multiple misclassified parts. The model fails to
capture the correct part structure, with entire structural components potentially mislabeled (e.g., backrest
points classified as seat, or leg points confused with armrests). This failure could be caused by: (1) unusual
or complex chair geometry that is underrepresented in the training data, (2) occlusion-like effects in the point
cloud sampling where certain parts have very sparse point coverage, or (3) geometric ambiguity where parts merge
smoothly without clear boundaries. The widespread nature of the errors (not just boundary confusion) suggests
the global context feature may be misleading for this shape, causing the model to apply an incorrect part
labeling scheme. This highlights the fundamental challenge in part segmentation: the need to balance local
geometric detail with global structural understanding, which the basic PointNet architecture struggles with for
atypical or geometrically ambiguous shapes.
Q3: Robustness Analysis (20 points)
Experiment 1: Rotation Robustness
Procedure: I tested the model's robustness to rotation by rotating input point clouds around
the X-axis at various angles (0°, 15°, 30°, 45°, 60°, 75°, 90°). Each test point cloud was transformed using a
rotation matrix before being fed to the model. This tests whether the model has learned rotation-invariant
features or if it relies on canonical orientations seen during training.
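A minimal sketch of the transform, assuming each test cloud is an (N, 3) NumPy array; the `evaluate` helper in the loop is hypothetical:

```python
import numpy as np

def rotate_x(points: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate an (N, 3) point cloud about the X-axis by angle_deg degrees."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0,   c,  -s],
                      [0.0,   s,   c]], dtype=points.dtype)
    return points @ rot_x.T

# for angle in [0, 15, 30, 45, 60, 75, 90]:
#     acc = evaluate(model, rotate_x(test_points, angle))  # hypothetical helper
```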
Classification Task Results
Baseline Test Accuracy (0°): 97.90%
Test Accuracy at 90°: 27.70%
Comparison with Q1: a 70.20% drop in accuracy at 90° rotation
Detailed Results:
| Rotation | Accuracy | Change  |
|----------|----------|---------|
| 0°       | 97.90%   | +0.00%  |
| 15°      | 94.54%   | -3.36%  |
| 30°      | 82.79%   | -15.11% |
| 45°      | 54.88%   | -43.02% |
| 60°      | 33.26%   | -64.64% |
| 75°      | 28.12%   | -69.78% |
| 90°      | 27.70%   | -70.20% |
Classification Visualizations
Comparison of classification performance at 0° (baseline) and 90° rotation:
[Figure: Q1 baseline at 0° (97.90%) vs. rotated 90° (27.70%)]
Segmentation Task Results
Baseline Test Accuracy (0°): 90.40%
Test Accuracy at 90°: 24.13%
Comparison with Q2: a 66.27% drop in accuracy at 90° rotation
Detailed Results:
| Rotation | Accuracy | Change  |
|----------|----------|---------|
| 0°       | 90.40%   | +0.00%  |
| 15°      | 83.00%   | -7.40%  |
| 30°      | 72.62%   | -17.78% |
| 45°      | 63.99%   | -26.41% |
| 60°      | 45.00%   | -45.40% |
| 75°      | 30.84%   | -59.55% |
| 90°      | 24.13%   | -66.27% |
Segmentation Visualizations - Baseline (0°)
[Figure: ground truth vs. prediction (90.40%); identical to the Q2 baseline result]
Segmentation Visualizations - 45° Rotation
[Figure: ground truth vs. prediction (63.99%)]
Segmentation Visualizations - 90° Rotation
[Figure: ground truth vs. prediction (24.13%)]
Interpretation: Both models show high sensitivity to rotation, with severe accuracy degradation
at larger rotation angles. Classification accuracy drops from 97.90% to 27.70% (a 70.20% drop), while
segmentation drops from 90.40% to 24.13% (a 66.27% drop). Both tasks exhibit similar vulnerability, indicating
that neither the classification nor the segmentation network has learned rotation-invariant features. Even small
rotations cause noticeable degradation (15°: -3.36% for classification, -7.40% for segmentation), and
performance collapses beyond 45°. This suggests the models have memorized the canonical orientations seen during
training rather than learning geometric properties invariant to rotation. The similar degradation patterns
across both tasks point to a fundamental limitation of the basic PointNet architecture without a T-Net or
rotation augmentation, as sketched below. The visualizations show segmentation quality deteriorating as rotation
increases, with part boundaries becoming increasingly confused at 45° and nearly random at 90°.
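For reference, a minimal sketch of the T-Net input transform mentioned above, simplified from the original PointNet paper (the layer widths follow the paper; biasing toward the identity matrix keeps the initial transform close to a no-op):

```python
import torch
import torch.nn as nn

class TNetSketch(nn.Module):
    """Simplified PointNet input transform: predicts a 3x3 alignment
    matrix from the cloud and applies it before feature extraction."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 9),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, 3, N)
        feat = self.mlp(x).max(dim=2).values              # (B, 1024) global feature
        mat = self.fc(feat).view(-1, 3, 3)
        mat = mat + torch.eye(3, device=x.device)         # bias toward identity
        return torch.bmm(mat, x)                          # aligned cloud, (B, 3, N)
```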
Experiment 2: Point Density Robustness
Procedure: I tested the model's robustness to varying point density by randomly sampling
different numbers of points from each object (100, 500, 1000, 2500, 5000, 7500, 10000 points). This evaluates
whether the model can maintain performance with sparse point clouds and how much geometric information is truly
needed for accurate classification.
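A minimal sketch of the sampling step, assuming an (N, 3) array and optional per-point labels for the segmentation task:

```python
import numpy as np

def subsample(points: np.ndarray, n: int, labels: np.ndarray = None):
    """Randomly keep n of the cloud's points (without replacement),
    keeping per-point segmentation labels aligned if provided."""
    idx = np.random.choice(points.shape[0], size=n, replace=False)
    return (points[idx], labels[idx]) if labels is not None else points[idx]

# for n in [100, 500, 1000, 2500, 5000, 7500, 10000]:
#     sparse = subsample(test_points, n)  # then evaluate as in Q1/Q2
```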
Classification Task Results
Baseline Test Accuracy (10000 points): 97.90%
Test Accuracy at 100 points: 92.76%
Comparison with Q1: only a 5.14% accuracy drop with 100x fewer points
Detailed Results:
| # Points | Accuracy | Change |
|----------|----------|--------|
| 100      | 92.76%   | -5.14% |
| 500      | 96.85%   | -1.05% |
| 1000     | 97.80%   | -0.10% |
| 2500     | 98.01%   | +0.11% |
| 5000     | 98.01%   | +0.11% |
| 7500     | 97.80%   | -0.10% |
| 10000    | 97.90%   | 0.00%  |
Classification Visualizations
[Figure: Q1 baseline (10000 points, 97.90%) vs. 100 points (92.76%)]
Segmentation Task Results
Baseline Test Accuracy (10000 points): 90.40%
Test Accuracy at 100 points: 83.92%
Comparison with Q2: a 6.48% accuracy drop with 100x fewer points
Detailed Results:
| # Points | Accuracy | Change |
|----------|----------|--------|
| 100      | 83.92%   | -6.48% |
| 500      | 88.81%   | -1.59% |
| 1000     | 89.83%   | -0.57% |
| 2500     | 90.31%   | -0.09% |
| 5000     | 90.37%   | -0.03% |
| 7500     | 90.43%   | +0.03% |
| 10000    | 90.40%   | 0.00%  |
Segmentation Visualizations - 10000 Points (Baseline)
[Figure: ground truth vs. prediction (90.40%)]
Segmentation Visualizations - 5000 Points
[Figure: ground truth vs. prediction (90.37%)]
Segmentation Visualizations - 1000 Points
[Figure: ground truth vs. prediction (89.83%)]
Segmentation Visualizations - 100 Points
[Figure: ground truth vs. prediction (83.92%)]
Interpretation: Both models demonstrate excellent robustness to reduced point density, in stark
contrast to their rotation sensitivity. Classification maintains 92.76% accuracy with just 100 points (a 5.14%
drop), while segmentation achieves 83.92% (a 6.48% drop). The slightly larger degradation in segmentation is
expected, since per-point labeling requires more local geometric detail. Performance plateaus around 500 points
for classification (96.85%) and 1000 points for segmentation (89.83%), showing that PointNet extracts the
salient geometric features without requiring dense sampling. This robustness stems from the symmetric
max-pooling aggregation, which is invariant to point set size and focuses on the most discriminative features;
a toy demonstration follows below. The visualizations show that even with dramatically reduced point density
(100 points vs. 10000), the model maintains coherent segmentation with only minor quality degradation at part
boundaries. This confirms that PointNet learns compact shape representations that generalize across sampling
densities, making it practical for real-world applications with varying sensor quality and point cloud
resolution.
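The size-invariance argument can be seen directly in code. A toy demonstration with an untrained shared MLP (all shapes and widths assumed): the global feature is a per-dimension max over points, so dropping points only changes a dimension if a dropped point was its unique maximizer.

```python
import torch
import torch.nn as nn

# Shared per-point MLP (untrained, widths assumed) followed by max-pooling.
phi = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1024))

points = torch.randn(10000, 3)                        # dense cloud
dense = phi(points).max(dim=0).values                 # global feature from 10000 points
sparse = phi(points[torch.randperm(10000)[:100]]).max(dim=0).values  # from 100 points

# The two global features tend to agree in most dimensions, since the
# per-dimension max is usually attained (or nearly attained) by many points.
print(torch.cosine_similarity(dense, sparse, dim=0))
```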