Vision Algorithms Evaluation

Preston Ohta

1/28/14

Image Gathering

I acquired images of both targets with my HTC Evo 4G LTE smartphone camera using its default settings. Using the pre-existing tape markings on the floor located 2, 4, 8, and 16 feet away from the targets (distances I verified with a meter stick), I positioned my camera at the appropriate distances. The targets were centered in all images, and all images were taken at the same time and elevation to ensure consistent lighting and angle, respectively.

Data Collection Procedure

To create an adequate filter, I converted the input image to HSV format and displayed the H, S, and V channels separately as grayscale images. From there, I could use MATLAB's data tools to determine the specific HSV values of the tennis balls, and thus what value ranges I should set for my thresholds.
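The report's filter was written in MATLAB; as a rough illustration of the thresholding step, here is a minimal Python sketch. The range values are placeholders, not the report's calibrated thresholds:

```python
import colorsys

import numpy as np

def hsv_threshold(rgb_image, h_range, s_range, v_range):
    """Return a binary mask of pixels whose HSV values fall in the given
    (low, high) ranges. rgb_image is an H x W x 3 float array in [0, 1]."""
    h, w, _ = rgb_image.shape
    mask = np.zeros((h, w), dtype=bool)
    for i in range(h):
        for j in range(w):
            hh, ss, vv = colorsys.rgb_to_hsv(*rgb_image[i, j])
            mask[i, j] = (h_range[0] <= hh <= h_range[1] and
                          s_range[0] <= ss <= s_range[1] and
                          v_range[0] <= vv <= v_range[1])
    return mask

# Example: a single yellow-green (tennis-ball-like) pixel passes the filter.
img = np.zeros((2, 2, 3))
img[0, 0] = [0.8, 0.9, 0.2]
mask = hsv_threshold(img, (0.10, 0.25), (0.5, 1.0), (0.5, 1.0))
```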

I then parsed through the image, using the thresholds to generate a binary image (white if a pixel fell within the thresholds, black if it did not). However, some portions of the tennis balls were still filtered out, such as the edges and the curved seam lines within the balls. To compensate, I used a small MATLAB disk filter (which we were allowed to use) to smooth rough edges, which removed most of the gaps and holes.
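A common stand-in for this cleanup step is morphological closing (dilation followed by erosion). This Python sketch uses a 3x3 square neighborhood rather than MATLAB's disk-shaped structuring element, so it is only an approximation of the filter described above:

```python
import numpy as np

def close_gaps(mask, iterations=1):
    """Morphological closing (dilation then erosion) of a binary mask,
    using a 3x3 square neighborhood as a rough stand-in for a small disk."""
    def shifted_stack(m):
        # All nine 3x3-neighborhood shifts of m, zero-padded at the border.
        p = np.pad(m, 1)
        return np.stack([p[i:i + m.shape[0], j:j + m.shape[1]]
                         for i in range(3) for j in range(3)])
    out = mask.copy()
    for _ in range(iterations):
        out = shifted_stack(out).any(axis=0)  # dilation: grow regions
    for _ in range(iterations):
        out = shifted_stack(out).all(axis=0)  # erosion: shrink back
    return out

# Example: closing fills a one-pixel hole inside a solid 3x3 blob.
m = np.zeros((5, 5), dtype=bool)
m[1:4, 1:4] = True
m[2, 2] = False
out = close_gaps(m)
```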

I employed the double-raster method, which was covered in lecture, to group pixels into objects. By comparing each pixel's label to those of its left and top neighbors, I could determine which object it belonged to, or whether it marked the beginning of a new object altogether.
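The double-raster idea can be sketched as two scans plus an equivalence table (union-find here). This Python version assumes 4-connectivity, which may differ from the actual MATLAB implementation:

```python
import numpy as np

def double_raster_label(mask):
    """Two-pass (double-raster) connected-component labeling, 4-connectivity.

    First pass: scan left-to-right, top-to-bottom; each foreground pixel
    takes the smallest label of its left/top neighbors, or a new label,
    recording label equivalences. Second pass: resolve equivalences.
    """
    labels = np.zeros(mask.shape, dtype=int)
    parent = {}  # union-find over provisional labels

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    next_label = 1
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if not mask[i, j]:
                continue
            top = labels[i - 1, j] if i > 0 else 0
            left = labels[i, j - 1] if j > 0 else 0
            neighbors = [l for l in (top, left) if l]
            if not neighbors:
                labels[i, j] = next_label
                parent[next_label] = next_label
                next_label += 1
            else:
                labels[i, j] = min(neighbors)
                if len(neighbors) == 2 and top != left:
                    parent[find(max(neighbors))] = find(min(neighbors))
    # Second raster pass: replace provisional labels with representatives.
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if labels[i, j]:
                labels[i, j] = find(labels[i, j])
    return labels

# Example: a U-shaped region whose two arms merge at the bottom row.
m = np.array([[1, 0, 1],
              [1, 0, 1],
              [1, 1, 1]], dtype=bool)
labels = double_raster_label(m)
```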

Once all objects were detected, I searched for the four largest, which would be the tennis balls, since the threshold filter was calibrated specifically for them. To locate their centroids, I simply took the average of the x and y coordinates of each object's pixels.
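Selecting the largest objects and averaging pixel coordinates might look like the following Python sketch (the function name and label format are illustrative, not the report's code):

```python
import numpy as np

def k_largest_centroids(labels, k=4):
    """Return centroids (row_mean, col_mean) of the k largest labeled
    objects in a labeled image, as described for the four tennis balls."""
    ids, counts = np.unique(labels[labels > 0], return_counts=True)
    largest = ids[np.argsort(counts)[::-1][:k]]
    centroids = []
    for obj in largest:
        rows, cols = np.nonzero(labels == obj)
        centroids.append((rows.mean(), cols.mean()))
    return centroids

# Example: a 1-pixel object and a 4-pixel object; the largest wins.
labels = np.zeros((4, 4), dtype=int)
labels[0, 0] = 1
labels[2:4, 2:4] = 2
(r, c), = k_largest_centroids(labels, k=1)
```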

To measure distance, I used the width between balls as a reference. I estimated this width by averaging the x and y coordinates of the four centroids to find the target's center, subtracting the x and y coordinates of the first ball's centroid from that center, then multiplying by two. I then used the law of similar triangles to estimate the distance between the target and the camera, using the pixels-per-inch and focal-length specifications of my camera phone as the dimensions of my reference triangle. Thus, utilizing the ratio:

(target_width_in_pixels/pixels_per_inch)/actual_target_width = focal_length/actual_target_depth

I could solve for the actual distance between the camera and the target.
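Under the pinhole similar-triangles model, this rearranges to depth = focal_length x actual_width x pixels_per_inch / width_in_pixels. A minimal sketch of the final depth computation, with illustrative numbers rather than the phone's actual specs:

```python
def depth_from_width(width_px, pixels_per_inch, actual_width_in, focal_length_in):
    """Depth from the pinhole similar-triangles relation:
    (width_px / pixels_per_inch) / actual_width = focal_length / depth.
    All lengths in inches; the numbers below are made up for illustration."""
    image_width_in = width_px / pixels_per_inch  # width on the sensor
    return focal_length_in * actual_width_in / image_width_in

# Example: a 6 in target spanning 100 px at 1000 px/in with a 0.2 in focal
# length sits 12 in from the camera.
d = depth_from_width(100, 1000, 6.0, 0.2)
```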

Analysis

Triangulation worked remarkably well, with a maximum error of only 8.7% and a minimum under 0.1%. In reviewing the results, I noticed the following trend in error:

The data suggests that as the distance between the camera and the target increases, so does the error, and by a large, non-linear factor. This makes sense, as camera resolution decreases over distance (borders become less sharp) and lost pixel data (tennis-ball pixels accidentally filtered out) becomes more significant. Additionally, the perceived width between two tennis balls falls off roughly in inverse proportion to distance, so at long range small changes in the perceived width produce large differences in the distance approximation. Eventually, error will increase greatly as the effects of discretization (due to pixelation) become more significant. These effects are all evidenced by the images taken, shown below.
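This sensitivity can be illustrated numerically: because the estimated depth varies as one over the measured pixel width, a fixed one-pixel measurement error costs more as the target recedes. A small Python demo with made-up camera numbers:

```python
def depth_estimate(width_px, focal_px=3000.0, actual_width_in=6.0):
    """Pinhole depth estimate; focal_px is the focal length in pixel units
    (focal length times pixels-per-inch). All numbers are illustrative."""
    return focal_px * actual_width_in / width_px

# Halving the measured width doubles the estimated depth, and the cost of
# a one-pixel width error grows rapidly with distance:
for true_width in (900, 450, 225, 112):
    d = depth_estimate(true_width)
    err = abs(depth_estimate(true_width - 1) - d)
    print(f"width {true_width:4d} px  depth {d:7.1f} in  1 px error -> {err:.2f} in")
```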

Additional Potential Sources of Error

Although this vision algorithm worked very well, it is possible to introduce errors into the system. These include:

   -Other tennis balls of the same color in the image, positioned closer to the camera than those on the target

   -Other objects of the same color as the tennis balls that appear larger than the tennis balls in the image

   -Tilting the target

   -Different light settings

   -Different camera type/settings

   -Partially or completely obstructing one or more tennis balls on the target

If none of these are present, however, the algorithm is very reliable.

Code

Triangulate.m centroid.m

Results

Small Target

Distance: 2 Feet


[Threshold image | Segmented image | Filtered image | Results]



Distance: 4 Feet


[Threshold image | Segmented image | Filtered image | Results]



Distance: 8 Feet


[Threshold image | Segmented image | Filtered image | Results]



Distance: 16 Feet


[Threshold image | Segmented image | Filtered image | Results]



Big Target

Distance: 2 Feet


[Threshold image | Segmented image | Filtered image | Results]



Distance: 4 Feet


[Threshold image | Segmented image | Filtered image | Results]



Distance: 8 Feet


[Threshold image | Segmented image | Filtered image | Results]



Distance: 16 Feet


[Threshold image | Segmented image | Filtered image | Results]