Test accuracy: 95.8%
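For reference, this number is the fraction of test objects whose predicted class matches the ground-truth label. Below is a minimal sketch of how such an accuracy could be computed; the names `model` and `test_loader` are placeholders, not the actual code from this project:

```python
import torch

def classification_accuracy(model, test_loader, device="cuda"):
    # Fraction of test objects assigned the correct class (chair / vase / lamp).
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for points, labels in test_loader:       # points: (B, N, 3), labels: (B,)
            logits = model(points.to(device))    # (B, num_classes) class scores
            preds = logits.argmax(dim=-1).cpu()  # predicted class per object
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total                       # e.g. 0.958 -> 95.8%
```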
Below are visualizations of a few test point clouds that were classified correctly by the trained model:
Chair

Vase

Lamp

Below are visualizations of one misclassified example for each class:
True label: chair // Prediction: lamp. This chair is folded up, which may have confused the model.

True label: vase // Prediction: lamp. This vase does have features that resemble a lamp.

True label: lamp // Prediction: vase. The body of this lamp is shaped like a vase.

Test accuracy: 88.9%
Below are visualizations of segmentation results for some objects (ground truth on left, prediction on right).
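The per-object "prediction accuracy" values quoted below are point-wise: the fraction of points assigned the correct part label. A minimal sketch of that metric, assuming predicted and ground-truth part labels are given as same-shape integer tensors (hypothetical helper, not the actual code):

```python
import torch

def per_object_seg_accuracy(pred_labels: torch.Tensor, gt_labels: torch.Tensor) -> float:
    # Fraction of points whose predicted part label matches the ground truth.
    assert pred_labels.shape == gt_labels.shape
    return (pred_labels == gt_labels).float().mean().item()

# Example: an object with 10000 points, 9500 of them labeled correctly,
# would be reported as "prediction accuracy: 95%".
```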
These are some good predictions: these examples match what we typically picture as a "chair", with well-defined parts such as the seat, back, and arms, so the model generally does well:
prediction accuracy: 95%

prediction accuracy: 94%

prediction accuracy: 97%

These are some bad predictions:
prediction accuracy: 80%. From the point cloud alone, the bottom portion of the back of the chair is difficult to distinguish from the back of the seat cushion.

prediction accuracy: 80%. This chair has a shape that is distinct from the typical "chair" seen in the dataset, so the model appears to struggle to segment the seat from the legs (and indeed it is debatable whether this chair has legs at all).

prediction accuracy: 89%. The legs of this chair have a more complex shape than those of other chairs in the dataset, so the model appears to struggle to segment the seat from the legs.

For this first experiment, I decreased the number of points per object by setting --num_points to 1000 (compared to 10000 before).
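This only changes how the data is loaded; a minimal sketch of the kind of random subsampling a --num_points flag would typically control (the function and its name are illustrative, not the actual code):

```python
import torch

def subsample_point_cloud(points: torch.Tensor, num_points: int = 1000) -> torch.Tensor:
    # points: (N, 3) tensor, e.g. N = 10000 in the original setup.
    # Keep a random subset of num_points points, sampled without replacement.
    idx = torch.randperm(points.shape[0])[:num_points]
    return points[idx]
```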
Classification
Test accuracy: 95.8%. This overall test accuracy is the same as before, and the failure examples shown below are similar to the ones from before (see Part 1), where certain objects have confusing shapes that can reasonably mislead the model into misclassifying them. So the trained classification model seems relatively robust to having an order of magnitude fewer points per object.
Successfully classified examples (in order: chair, vase, lamp):

Failure examples:
This chair was incorrectly classified as a lamp:

This vase was incorrectly classified as a lamp:

This lamp was incorrectly classified as a vase:

Segmentation
Test accuracy: 88.9%. This overall test accuracy is the same as before (see Part 2), and the "bad" segmentation examples shown below fail largely because the chairs have confusing geometry that makes it hard to distinguish between parts. So the trained segmentation model also seems relatively robust to having fewer points per object.
Good examples (ground truth on left, prediction on right):
prediction accuracy: 99%

prediction accuracy: 97.5%

Bad examples (ground truth on left, prediction on right):
prediction accuracy: 83.8%

prediction accuracy: 68.8%

prediction accuracy: 72.4%

For this second experiment, I added a small rotation (sampled uniformly at random between -15 and 15 degrees) about one axis to each point cloud object. For both the classification and segmentation models, test accuracy dropped compared to before, and qualitatively the segmentation results are much worse, so neither model, and especially not the segmentation one, appears very robust to rotations of the point clouds.
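A minimal sketch of this kind of perturbation, assuming the rotation is about the z-axis (the writeup only says "one axis", so the axis choice here is an assumption) and with an illustrative function name rather than the actual code:

```python
import math
import torch

def rotate_about_z(points: torch.Tensor, max_deg: float = 15.0) -> torch.Tensor:
    # points: (N, 3). Rotate the whole object by one angle drawn uniformly
    # from [-max_deg, max_deg] degrees about the z-axis (axis is an assumption).
    theta = math.radians((torch.rand(1).item() * 2.0 - 1.0) * max_deg)
    c, s = math.cos(theta), math.sin(theta)
    R = torch.tensor([[c, -s, 0.0],
                      [s,  c, 0.0],
                      [0.0, 0.0, 1.0]], dtype=points.dtype)
    return points @ R.T
```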
Classification
Test accuracy: 95.1% (compared to 95.8% from Part 1).
Successfully classified examples (in order: chair, vase, lamp):

Failure examples:
This chair was incorrectly classified as a lamp:

This vase was incorrectly classified as a lamp:

This lamp was incorrectly classified as a vase:

Segmentation
Test accuracy: 81.3% (compared to 88.9% in Part 2).
"Good" examples - these are reasonable but we can see the model is confused by the rotation (ground truth on left, prediction on right):
prediction accuracy: 90%

prediction accuracy: 94%

Bad examples (ground truth on left, prediction on right):
prediction accuracy: 53%

prediction accuracy: 58%

prediction accuracy: 61%
