Method for programming a vision system


Many different approaches to object recognition and tracking exist; a web search for “image tracking” returns a plethora of results. The method that I have found and developed works extremely well for tracking RRRobot. Several main functions are used throughout the code. The first is a color space converter. We work in HSV (hue, saturation, value) space in this program because, in my experience, the hue of an object stays fairly consistent as the lighting conditions change. We use the IPPI library to convert a buffer from RGB to HSV. After the buffer is converted, we threshold the image on its pixel values. The red paper squares that we use have a hue value between zero and ten, so we set every value above 10 to 255, which leaves only the candidate red pixels with low values.
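The conversion and threshold step can be sketched in a few lines of Python. This is only an illustration, not the IPPI code used in the program: it uses the standard colorsys module, it assumes hue is stored as an 8-bit value from 0 to 255, and the names hue_8bit and threshold_hue are made up for the example.

```python
import colorsys

HUE_MAX_RED = 10   # from the text: the red squares have hue between 0 and 10
NOT_RED = 255      # from the text: every hue above 10 is set to 255


def hue_8bit(r, g, b):
    """Return the hue of an 8-bit RGB pixel, scaled to 0..255 (assumed scale)."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(h * 255)


def threshold_hue(rgb_image):
    """Convert rows of (r, g, b) tuples to a hue plane, replacing every
    hue above HUE_MAX_RED with NOT_RED."""
    out = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            h = hue_8bit(r, g, b)
            out_row.append(h if h <= HUE_MAX_RED else NOT_RED)
        out.append(out_row)
    return out


if __name__ == "__main__":
    image = [[(255, 0, 0), (0, 255, 0)],
             [(0, 0, 255), (255, 40, 0)]]
    for row in threshold_hue(image):
        print(row)
```

One caveat with this sketch: colorsys reports a hue of 0 for gray, black, and white pixels, so a hue-only threshold treats them as red. A real implementation would probably also check saturation or value; how the original program handles this is not covered here.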

After thresholding, we need to analyze the image. For this, we use a binary partition scheme to find the sectors in which one of the red squares is likely to occur, and then we do a block analysis on a small region. Since we know the size of the squares ahead of time, we know roughly how many red pixels to expect in the frame. Using the IPPI function countInRange, we look first at one half of the image. Depending on the number of red pixels in that region, we either look at the other half of the image or look at the first half in more detail. We do this up to three times, which reduces the search space by a factor of 8. We then look at 10 by 10 blocks of the image and count the number of red pixels within each block. If the count is greater than some threshold (70 is the value that I found to work well through experimentation), we mark that sector as a potential candidate. We do this over multiple sectors and construct lists of potential cells where the blob might be. If enough of these potential sectors are in the same region, then we know that we have located a blob.
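Here is a minimal sketch of the halving search and the block count, written in plain Python on a 0/1 mask rather than with IPPI. The block size of 10 and the threshold of 70 come from the description above. The function names, the choice to split the longer side first, and the rule for stopping when neither half holds enough red pixels are my assumptions for the example.

```python
BLOCK = 10            # block size from the text
BLOCK_THRESHOLD = 70  # red-pixel count per block from the text


def count_red(mask, x0, y0, x1, y1):
    """Count nonzero pixels in the half-open rectangle [x0, x1) x [y0, y1)."""
    return sum(sum(row[x0:x1]) for row in mask[y0:y1])


def narrow_region(mask, expected_pixels, max_splits=3):
    """Halve the search region up to max_splits times, keeping the half
    that contains at least expected_pixels red pixels."""
    x0, y0, x1, y1 = 0, 0, len(mask[0]), len(mask)
    for _ in range(max_splits):
        if x1 - x0 >= y1 - y0:           # split the longer side (assumption)
            mid = (x0 + x1) // 2
            first, second = (x0, y0, mid, y1), (mid, y0, x1, y1)
        else:
            mid = (y0 + y1) // 2
            first, second = (x0, y0, x1, mid), (x0, mid, x1, y1)
        if count_red(mask, *first) >= expected_pixels:
            x0, y0, x1, y1 = first
        elif count_red(mask, *second) >= expected_pixels:
            x0, y0, x1, y1 = second
        else:
            break                        # blob straddles the split; stop here
    return x0, y0, x1, y1


def candidate_blocks(mask, region):
    """Return the top-left corners of BLOCK x BLOCK cells in region whose
    red-pixel count exceeds BLOCK_THRESHOLD."""
    x0, y0, x1, y1 = region
    hits = []
    for by in range(y0, y1 - BLOCK + 1, BLOCK):
        for bx in range(x0, x1 - BLOCK + 1, BLOCK):
            if count_red(mask, bx, by, bx + BLOCK, by + BLOCK) > BLOCK_THRESHOLD:
                hits.append((bx, by))
    return hits


if __name__ == "__main__":
    # 80 x 80 mask with one 20 x 20 red square at x 50..69, y 10..29
    mask = [[1 if 50 <= x < 70 and 10 <= y < 30 else 0 for x in range(80)]
            for y in range(80)]
    region = narrow_region(mask, expected_pixels=300)
    print(region, candidate_blocks(mask, region))
```

A blob would then be declared when enough neighboring cells show up in the list of hits; the exact count needed is a tuning choice that this sketch leaves out.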

Once a blob has been found, we need to calculate the center of the region. We do this by examining an enlarged block (approximately twice as large as the blob itself) and taking a weighted average of the red pixels to find the center location. Once all four blobs have been found, the image processing becomes much easier, because we assume that they do not move especially far from one frame to the next. Therefore, we only search a small portion of the screen on future scans, which saves a great deal of processing time. On an old machine (~400 MHz, PII), we are able to process the image at 25 frames per second.
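The center calculation and the small search window for the next frame can be sketched as follows. This is an illustration under assumptions: the mask is a 0/1 image, every red pixel gets a weight of 1 (so the weighted average reduces to the mean position), and the names blob_center and search_window are invented for the example.

```python
def blob_center(mask, x0, y0, x1, y1):
    """Mean (x, y) position of nonzero pixels inside the window, which is
    clipped to the image. Returns None if the window holds no red pixels."""
    height, width = len(mask), len(mask[0])
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    total = sum_x = sum_y = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            if mask[y][x]:
                total += 1
                sum_x += x
                sum_y += y
    if total == 0:
        return None
    return sum_x / total, sum_y / total


def search_window(center, blob_size, scale=2):
    """Window about scale times the blob size on a side, centered on the
    last known center; used to limit the search in the next frame."""
    half = (blob_size * scale) // 2
    cx, cy = int(center[0]), int(center[1])
    return cx - half, cy - half, cx + half, cy + half


if __name__ == "__main__":
    # Frame 1: a 10 x 10 red square at x 10..19, y 20..29
    frame1 = [[1 if 10 <= x < 20 and 20 <= y < 30 else 0 for x in range(40)]
              for y in range(40)]
    center = blob_center(frame1, 5, 15, 25, 35)
    window = search_window(center, blob_size=10)
    # Frame 2: the square has moved 3 pixels to the right
    frame2 = [[1 if 13 <= x < 23 and 20 <= y < 30 else 0 for x in range(40)]
              for y in range(40)]
    print(center, window, blob_center(frame2, *window))
```

If blob_center returns None for the window, the blob has moved farther than assumed, and falling back to a full-frame search is the obvious recovery.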

If you are interested in writing your own image tracking software, I would recommend looking on the web for examples of software that has already been created. Hardware is so good today that, as long as your algorithms are not needlessly wasteful, getting high performance from software-based image processing should be relatively straightforward. Take advantage of APIs that are optimized for working with memory, such as IPPI.

 

Here is a picture of the software at work. The four red blobs that are used to mark the edges are shown, as well as the bounding box that connects the four blobs: