Many different approaches to object recognition and tracking exist; a web
search for “image tracking” turns up a plethora of results. The method
that I have developed works extremely well for the application of
tracking RRRobot. There are several main functions that are used throughout the
code. The first function is a color space converter. We work in HSV (hue,
saturation, value) space in this program because hue values seem to stay
consistent regardless of the lighting conditions. To do the
conversion, we use the IPPI library to convert a buffer from RGB to HSV. After
the buffer is converted, we threshold the image on hue. The
red paper squares that we use have a hue value between zero and ten, so we
set every pixel whose hue is above 10 to 255.
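The conversion and threshold step can be sketched in a few lines of Python. This is only an illustration, not the IPPI code used in the program: it uses the standard-library colorsys module in place of IPPI, assumes hue is scaled to 0..255, and the names hue_8bit and threshold_hue are hypothetical.

```python
import colorsys

HUE_MAX = 10  # upper hue bound for the red squares, taken from the text


def hue_8bit(r, g, b):
    """Return the hue of an 8-bit RGB pixel, scaled to 0..255 (assumed scale)."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 255))


def threshold_hue(rgb_rows, hue_max=HUE_MAX):
    """Keep hue values up to hue_max; set every other pixel to 255."""
    out = []
    for row in rgb_rows:
        out_row = []
        for r, g, b in row:
            h = hue_8bit(r, g, b)
            out_row.append(h if h <= hue_max else 255)
        out.append(out_row)
    return out
```

Note that colorsys reports a hue of 0 for grey pixels, so a real pipeline would probably also want to check saturation; that check is left out here because the text only describes a hue threshold.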
After thresholding, we need to analyze
the image. For this, we use a binary partition scheme to find the likely
sectors in which one of the red squares might occur, and then we do a block
analysis on a small region. Since we know the size of the dots ahead of time,
we know about how many pixels we expect to be red in the frame. Using the IPPI
function countInRange, we look first at one half of the image. Depending upon
the number of red pixels that occur in that region, we either look at the other
half of the image or look at the first half in more detail. We do this up to
three times (so a factor of 8 reduction in the search space). We then look at
10 by 10 blocks of the image and count the number of pixels that are red within
that region. If the number is greater than some threshold (70 is the value that
I found to work well through experimentation), we mark that sector as a
potential candidate. We do this over multiple sectors and construct lists of potential
cells where the blob might be. If enough of these potential sectors are in the
same region, then we know that we have located a blob.
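As a rough sketch of the search described above, the following Python works on a binary mask (1 for a red pixel, 0 otherwise). The function names, the expected_pixels parameter, and the rule of keeping the first half when it holds at least half the expected red pixels are assumptions made for illustration; count_red merely stands in for the IPPI count function.

```python
def count_red(mask, x0, y0, x1, y1):
    """Count red pixels in the half-open rectangle [x0, x1) x [y0, y1)."""
    return sum(sum(row[x0:x1]) for row in mask[y0:y1])


def narrow_region(mask, expected_pixels, splits=3):
    """Halve the search region up to `splits` times (3 splits = factor of 8)."""
    x0, y0, x1, y1 = 0, 0, len(mask[0]), len(mask)
    for _ in range(splits):
        if x1 - x0 >= y1 - y0:
            # Split the wider side: look at the left half first.
            mid = (x0 + x1) // 2
            if count_red(mask, x0, y0, mid, y1) >= expected_pixels / 2:
                x1 = mid  # enough red here, look at this half in more detail
            else:
                x0 = mid  # otherwise look at the other half
        else:
            # Split the taller side: look at the top half first.
            mid = (y0 + y1) // 2
            if count_red(mask, x0, y0, x1, mid) >= expected_pixels / 2:
                y1 = mid
            else:
                y0 = mid
    return x0, y0, x1, y1


def candidate_blocks(mask, region, block=10, min_red=70):
    """Return top-left corners of block x block cells with more than min_red red pixels."""
    x0, y0, x1, y1 = region
    hits = []
    for by in range(y0, y1 - block + 1, block):
        for bx in range(x0, x1 - block + 1, block):
            if count_red(mask, bx, by, bx + block, by + block) > min_red:
                hits.append((bx, by))
    return hits
```

A blob that straddles a split line can be cut in two by this simple rule, so a practical version would likely pad the region it keeps or check both halves.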
Once a blob has been found, we need to calculate
the center of the region. We do this by examining an enlarged block
(approximately twice as large as the blob itself) and taking a weighted average
of the red pixels to find the center location. Once all four blobs have been
found, the image processing becomes much easier: we assume that they don’t
move especially far from one frame to the next, so we only search a small
portion of the screen on future scans, saving much processing time. On an old
machine (~400 MHz, PII), we are able to process the image at 25 frames per second.
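The center calculation and the reduced search area might look something like the sketch below. It again assumes a mask in which each pixel holds a weight (1 for red, 0 otherwise); blob_center, search_window, half_size, and margin are hypothetical names chosen for this example.

```python
def blob_center(mask, cx, cy, half_size):
    """Weighted average of pixel coordinates in a window centered on (cx, cy)."""
    height, width = len(mask), len(mask[0])
    x0, x1 = max(0, cx - half_size), min(width, cx + half_size + 1)
    y0, y1 = max(0, cy - half_size), min(height, cy + half_size + 1)
    total = sum_x = sum_y = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            weight = mask[y][x]
            total += weight
            sum_x += weight * x
            sum_y += weight * y
    if total == 0:
        return None  # no red pixels in the window
    return sum_x / total, sum_y / total


def search_window(center, margin, width, height):
    """Rectangle [x0, x1) x [y0, y1) around the last center, clamped to the frame."""
    cx, cy = int(center[0]), int(center[1])
    return (max(0, cx - margin), max(0, cy - margin),
            min(width, cx + margin + 1), min(height, cy + margin + 1))
```

On the next frame, only the rectangle returned by search_window would need to be scanned for that blob, which is where the time saving comes from.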
If you are interested in writing your own image
tracking software, I would recommend looking on the web for examples of
software that has already been created. Hardware is so good today that, as long
as you program your algorithms sensibly, getting high performance
from software-based image processing should be relatively straightforward. Take
advantage of APIs that are optimized for working with memory, such as IPPI.
Here is a picture of the software
at work. The four red blobs that are used to mark the edges are shown, as well
as the bounding box that connects the four blobs: