Learning for 3D Vision: Assignment 5


1. Classification Model

1.1. Test Accuracy

The test accuracy obtained was 96.54%.

1.2. Visualization

Correct Classification Examples
[Figure: a chair, a vase, and a lamp, each classified correctly.]

Incorrect Classification Examples
[Figure: a lamp predicted as a chair (the only incorrect prediction into the chair class), two lamps predicted as vases, and two vases predicted as lamps.]

Interpretation
The accuracy for classifying chairs is quite high, as is also evident in the confusion matrix below. The only object misclassified as a chair is a lamp, and it is possible to see why the model could be confused by that example: the lamp does vaguely resemble a high chair. Such shapes are, however, relatively uncommon.

The accuracy for classifying vases is also quite high. As the examples above show, the lamps misclassified as vases do somewhat resemble vases; in other words, a vase could plausibly exist in the shape of either of these lamps. The absence of a plant, however, should have helped the model recognize that these objects were more likely to be lamps.

The reverse confusion, of vases being classified as lamps, also occurs. For these objects, it is easy to see why the model is confused: the first object somewhat resembles an inverted open lamp, while the plant in the second point cloud can easily be mistaken for a lamp on a stand.
Confusion Matrix

             Predicted: Chair   Predicted: Vase   Predicted: Lamp
GT: Chair                 617                 0                 0
GT: Vase                    0                90                12
GT: Lamp                    1                20               213
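
For reference, the matrix above can be tallied from per-example predictions as in the minimal sketch below; the label-to-index mapping (0 = Chair, 1 = Vase, 2 = Lamp) is an assumption for illustration.

    import numpy as np

    def confusion_matrix(gt_labels, pred_labels, num_classes=3):
        # Rows are ground-truth classes, columns are predictions,
        # matching the table above. Assumed mapping: 0=Chair, 1=Vase, 2=Lamp.
        cm = np.zeros((num_classes, num_classes), dtype=int)
        for g, p in zip(gt_labels, pred_labels):
            cm[g, p] += 1
        return cm

    # Example: confusion_matrix([0, 1, 2, 2], [0, 1, 1, 2])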

2. Segmentation Model

2.1. Test Accuracy

The test accuracy obtained was 87.39%.
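
Segmentation accuracy is presumably measured per point; below is a minimal sketch of the assumed metric, i.e. the fraction of points whose predicted part label matches the ground truth, averaged over the test set.

    import numpy as np

    def per_point_accuracy(pred_labels, gt_labels):
        # pred_labels, gt_labels: (B, N) integer part labels per point.
        # Assumed metric: fraction of correctly labelled points.
        return float((np.asarray(pred_labels) == np.asarray(gt_labels)).mean())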

2.2. Visualizations

Good Prediction Examples
[Figures: ground-truth and predicted segmentations for each example.]

Accuracy   Interpretation
94.5%      Most of the chair is segmented accurately, with minor inaccuracies at the intersection of the blue and red regions.
97.05%     The model is able to identify distinct regions of the chair even with significant differences in shape, e.g. the seat, the legs, and the armrests.
85.15%     The model identifies distinct types of armrests as well, but is inaccurate in determining the boundaries of the red region.
72.69%     Similar to the above, most of the chair is segmented accurately, with inaccuracies mainly at the boundaries of the red region at the back.
Bad Prediction Examples
[Figures: ground-truth and predicted segmentations for each example.]

Accuracy   Interpretation
47.02%     The model entirely misses the headrest region of the point cloud. A likely reason is that the model has seen a lot of data without headrests, so it classifies the entire upright portion as a backrest. Since the headrest is usually much smaller than the backrest, the model is probably not heavily penalized for missing it.
58.79%     As the examples above show, the model has learned to segment the "base" or "legs" of the chair in blue. In this example, however, it does not consider the entire bottom half of the chair to be the base, only the very bottom, resulting in incorrect segmentation.

3. Robustness Analysis

3.1. Experiment 1: Effect of Rotation

3.1.1. Procedure

The trained classification and segmentation models were tested on rotated versions of the test data in order to determine the effect on accuracy. The entire test set was evaluated at varying amounts of rotation, i.e. 10°, 20°, 30°, 45°, 90°, and 180°, generated as in the sketch below.
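
A minimal sketch of how such rotated copies can be generated; the rotation axis and the NumPy-based formulation are assumptions for illustration, not details taken from the experiment itself.

    import numpy as np

    def rotate_point_cloud(points, angle_deg, axis="z"):
        # points: (N, 3) array; rotates every point about one coordinate
        # axis (the choice of axis is an assumption).
        t = np.deg2rad(angle_deg)
        c, s = np.cos(t), np.sin(t)
        if axis == "z":
            R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        elif axis == "y":
            R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
        else:  # axis == "x"
            R = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
        return points @ R.T

    # Example: a 30° rotated copy of one test cloud
    # rotated = rotate_point_cloud(cloud, 30)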

3.1.2. Classification Results

Overall Accuracy
Rotation   None     10°      20°      30°      45°      90°      180°
Accuracy   96.54%   94.54%   93.07%   89.50%   71.45%   33.05%   55.40%
Failure Cases Visualization
Each row lists rotation, ground truth, and the predictions before and after rotation (the rotated point clouds are shown as figures):
- 10°, GT: Vase, Vase → Lamp. The model identifies the object as a lamp, possibly because the plant now appears vertical and more closely resembles the top of a lamp. This indicates that the orientation of the point cloud affects the classification output.
- 10°, GT: Lamp, Lamp → Vase. Here the opposite of the previous scenario occurs: a slight rotation causes the model to believe that the lamp is a vase, possibly because the tilted top of the lamp can be interpreted as a plant in a vase.
- 20°, GT: Chair, Chair → Vase. At about 20° of rotation, the accuracy of the chair class also starts to drop, and objects such as this one are identified as vases, possibly because the bottom no longer resembles a chair at all.
- 30°, GT: Lamp, Lamp → Chair. At about 30° of rotation, lamps are also misclassified as chairs. Looking back at the confusion matrix in Q1 shows that this was very rarely the case without rotation.
- 45°, GT: Lamp, Lamp → Chair. At about 45° of rotation, we see the most significant drop in accuracy, most likely because objects stop looking like themselves. For example, this object no longer looks much like a lamp. This implies that the orientation of the input makes a difference to the model.
- 180°, GT: Lamp, Lamp → Vase. Since the model depends on the orientation of the input, rotations of 90° and 180° are extremely difficult for it to classify correctly. In this case, the inverted lamp ends up looking like a vase.

3.1.3. Segmentation Results

Accuracy Change with Rotation
[Figures: predicted segmentations at each rotation, per example.]

Example 1
Accuracy: 94.5% (0°), 92% (10°), 85% (20°), 78% (30°), 66% (45°), 47% (90°)
Interpretation: Looking at the segmentation results as the object is rotated, the model shows a bias towards detecting horizontal separation boundaries, so accuracy decreases as rotation increases.

Example 2
Accuracy: 97.05% (0°), 95.35% (10°), 78% (20°), 76.98% (30°), 78.69% (45°), 58.16% (90°)
Interpretation: This example shows less bias towards detecting horizontal boundaries, so performance at higher rotations is better than in the previous example.

Example 3
Accuracy: 85.15% (0°), 84.3% (10°), 75.77% (20°), 64.87% (30°), 47.47% (45°), 45.85% (90°)
Interpretation: Accuracy around the legs of the chair remains high even at large rotations; however, accuracy in other regions falls quickly.

3.1.4. Conclusion

Both the classification and segmentation models show a significant dependence on the orientation of the input point cloud, but are relatively unaffected by minor rotations of up to 15-20° as shown above. However, when the input point cloud is rotated further, a significant loss in accuracy is seen.

3.2. Experiment 2: Effect of Number of Points

3.2.1. Procedure

The trained classification and segmentation models were tested on undersampled versions of the test data (i.e. with a reduced number of input points) in order to determine the effect on accuracy. The entire test set was evaluated at varying numbers of sampled points, i.e. 100, 500, 750, 1000, and 5000, out of the full 10000 points per cloud; the sampling is sketched below.
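
A minimal sketch of the undersampling, assuming uniform random sampling without replacement (the actual sampling scheme used in the experiment is an assumption):

    import numpy as np

    def sample_points(points, num_samples, seed=0):
        # points: (N, 3) array; picks num_samples indices uniformly at
        # random without replacement. For segmentation, the per-point
        # labels would be indexed with the same idx.
        rng = np.random.default_rng(seed)
        idx = rng.choice(points.shape[0], size=num_samples, replace=False)
        return points[idx]

    # Example: reduce a 10000-point test cloud to 500 points
    # sparse = sample_points(cloud, 500)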

3.2.2. Classification Results

Overall Accuracy
Points     100      500      750      1000     5000     10000
Accuracy   88.56%   94.4%    95.2%    96.01%   96.32%   96.54%
Failure Cases Visualization
Each row lists the number of points sampled, ground truth, and the predictions before and after sampling (the sampled point clouds are shown as figures):
- 100 points, GT: Chair, Chair → Lamp. With only 100 points sampled, there is not enough global information for the classifier to determine the class accurately. Note: the overall accuracy drop is still only ~8%.
- 500 points, GT: Vase, Vase → Lamp. The global information provided by this subset of points is still not enough for accurate classification of the object.
- 750 points, GT: Lamp, Lamp → Vase. Due to the relatively high similarity between the vase and lamp classes, 750 sampled points are not enough to distinguish between them with confidence.
- 1000 points, GT: Lamp, Lamp → Vase. The same issue as in the above case remains even at 1000 sampled points.
- 5000 points, GT: Vase, Vase → Lamp. The few cases in which differences are still observed show that when more detail is present, in terms of point cloud density, the model can perform better.

3.2.3. Segmentation Results

Accuracy Change with Points Sampled
[Figures: predicted segmentations at each sampling level, per example.]

Example 1
Accuracy: 82% (100), 95.4% (500), 94.5% (750), 94.9% (1000), 94.54% (5000), 94.5% (10000)
Interpretation: The results in this example show good performance down to 500 points (5% sampling), with a drop-off in performance below that.

Example 2
Accuracy: 95% (100), 97% (500), 96.2% (750), 97.5% (1000), 97.08% (5000), 97.05% (10000)
Interpretation: This example shows good performance even at 100 points. I believe this is because every part of the object is already thin, so further thinning does not affect performance much.

Example 3
Accuracy: 53% (100), 70.6% (500), 75.4% (750), 74.9% (1000), 71.86% (5000), 72.69% (10000)
Interpretation: In wider objects such as this one, the effect of sparsity on the model is clearer, and there is a significant drop in performance as the total number of points is reduced.

3.2.4. Conclusion

Both the classification and segmentation models show good resilience to a reduction in the number of points (down to about 5% of the total points). This could be because, even though the total number of points is smaller, the remaining points still represent the shape of the object reasonably well. In cases where the sampling causes a significant loss of structural information, we do see a significant drop in performance.