Head Tracking Using Color Histograms in OpenCV

Description

My final project for CSC 572 is a head tracking implementation using Intel's OpenCV computer vision library. It generates a histogram from an input image from a webcam and uses the histogram to identify the region of subsequent images most likely to be the user's face.

When the program starts, no model histogram has been captured for tracking. Pressing the 'g' key "grabs" the current frame and builds the "model" histogram from it, which is used for modeling the color of skin. The program shows histograms of both the model and the test image, which is the input video frame in which we want to identify the user's face. Pixels in the "Back Projection" window are lit brightly when their color matches the model histogram. Notice that initially most of the back projection window is white. This is because, until a model image has been grabbed, the model histogram is generated from the test image itself. Some of the pixels are still dark because they contain colors that are less common in the image, so they don't meet the probability threshold for the histogram.
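To make the idea of back projection concrete, here is a minimal pure-Python sketch. It is not the OpenCV code used in this project: the function names, the 16-bin layout, the single hue-like channel and the sample pixel values are all assumptions chosen for illustration, and no probability threshold is applied.

```python
def build_histogram(pixels, bins=16, max_value=256):
    """Count pixel values per bin, then scale so the tallest bin is 255."""
    hist = [0] * bins
    for value in pixels:
        hist[value * bins // max_value] += 1
    peak = max(hist) or 1  # avoid dividing by zero for an empty sample
    return [count * 255 // peak for count in hist]


def back_project(frame, hist, max_value=256):
    """Replace each pixel with the histogram score of the bin it falls in."""
    bins = len(hist)
    return [[hist[value * bins // max_value] for value in row] for row in frame]


# Hypothetical single-channel values (0-255), standing in for hue.
model_pixels = [20, 22, 25, 21, 24, 23]  # sample taken from "skin"
frame = [
    [20, 200, 210],
    [23, 24, 190],
]

# With a grabbed model: only skin-like pixels light up.
projection = back_project(frame, build_histogram(model_pixels))

# Without a model: the histogram comes from the frame itself, so every
# pixel scores above zero and the most common colors are brightest.
frame_pixels = [value for row in frame for value in row]
self_projection = back_project(frame, build_histogram(frame_pixels))

print(projection)
print(self_projection)
```

The second projection mirrors the start-up behavior described above: common colors come out bright and rarer colors come out dimmer.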

The next image shows how the back projection changes when I input a model image for the histogram. In this example I held the webcam by my cheek to get a shot filled with the skin color on my face. This gave good results in this lighting. In practice the quality of the results depends a lot on the lighting conditions. In other cases it worked just as well to hold my hand up to the camera and get the skin color off the back of my hand. Ideally one would use a histogram that represents skin more generically and can work for people of different complexions. More on that later.

Notice that the back projection is grainy and there is a lot of noise. This is due to noise in both the model histogram and the input frame.

In this stage I clean up the noise by using the "erode" function, which shrinks isolated bright peaks. In my program you can turn on erode by pressing "e". This does two iterations of erosion using the appropriately named OpenCV function "erode". Depending on the lighting conditions or the histogram you use, you might also want to use the "dilate" function (by pressing the "d" key). This causes bright areas to fill in.
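The effect of erosion and dilation can be sketched in plain Python on a tiny image. This is an illustration of the technique only, not OpenCV's implementation: the 3x3 neighbourhood, the clipping at the image border and the sample image are assumptions.

```python
def morph(image, pick):
    """Apply pick (min or max) over each pixel's 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border (an assumption).
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(pick(neighbours))
        out.append(row)
    return out


def erode(image, iterations=1):
    """Shrink bright regions: each pixel becomes its neighbourhood minimum."""
    for _ in range(iterations):
        image = morph(image, min)
    return image


def dilate(image, iterations=1):
    """Grow bright regions: each pixel becomes its neighbourhood maximum."""
    for _ in range(iterations):
        image = morph(image, max)
    return image


# A 3x3 bright blob plus one isolated noisy pixel at row 2, column 5.
noisy = [
    [0,   0,   0,   0, 0,   0, 0],
    [0, 255, 255, 255, 0,   0, 0],
    [0, 255, 255, 255, 0, 255, 0],
    [0, 255, 255, 255, 0,   0, 0],
    [0,   0,   0,   0, 0,   0, 0],
]

eroded = erode(noisy)     # the isolated pixel vanishes, the blob shrinks
restored = dilate(eroded)  # the blob grows back, the noise stays gone

for row in restored:
    print(row)
```

Eroding and then dilating removes the isolated pixel while keeping the blob, which is why the order of the two operations matters. Note that two erosion iterations would erase a blob this small entirely, so the number of iterations needs to suit the size of the face in the frame.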

Finally, I added blur to the input image to smooth away some of the noise. In this example it is hard to judge whether the blur improves or degrades the quality. In some cases, though, using blur can greatly improve the results. You can access this function by pressing "b".
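As a rough picture of what blurring does to a noisy pixel, here is a minimal pure-Python sketch of a 3x3 box blur. The choice of a box filter, the integer averaging and the border clipping are assumptions for illustration; the blur used in the program may differ.

```python
def box_blur(image):
    """Replace each pixel with the integer mean of its 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border (an assumption).
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(values) // len(values))
        out.append(row)
    return out


# A single noisy spike in an otherwise dark 5x5 image.
spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 90

blurred = box_blur(spike)
for row in blurred:
    print(row)
```

The spike of 90 is spread over its neighbours as a patch of 10s, so a lone outlier has far less influence on the histogram lookup, at the cost of softening real edges too.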

Summary and Results

This project essentially confirms Birchfield's findings that color histogram data can work surprisingly well for head tracking, even by itself. However, a lot of fine tuning of the parameters is clearly necessary to make it work robustly under varying conditions. One interesting thing I found was how it performed for people of differing skin tones. It turns out that the histogram generated from my skin didn't work very well for other people, even those of similar skin tone, and vice versa. However, there was one individual I tested whose histogram worked well for everyone.

Ultimately I plan to use a head tracking system as part of a project for experimenting with view-dependent camera projections in 3D (OpenGL). As Blonski suggests, color histogram data alone will most likely not be accurate enough for this purpose, especially because of problems estimating depth with this approach. I may have to turn to other methods such as template-based tracking.

There are some improvements I'd like to make to this program. In particular I would like to add more image processing to both the test image and the histogram, such as thresholding, which will take some more experimentation and tweaking. I would also like to add the ability to accumulate the histogram from multiple source images to make one that works for many different people.
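One possible way to accumulate a histogram from several source images is to normalise each image's histogram and average them, so that no single image dominates. The sketch below is only an idea for this future work, not something the program does today; the function names, the 16-bin layout and the sample pixel values are hypothetical.

```python
def count_histogram(pixels, bins=16, max_value=256):
    """Count how many pixel values fall into each bin."""
    hist = [0] * bins
    for value in pixels:
        hist[value * bins // max_value] += 1
    return hist


def accumulate(histograms):
    """Average the normalised histograms so each image contributes equally."""
    usable = [hist for hist in histograms if sum(hist) > 0]
    if not usable:
        raise ValueError("need at least one non-empty histogram")
    combined = [0.0] * len(usable[0])
    for hist in usable:
        total = sum(hist)
        for index, count in enumerate(hist):
            combined[index] += count / total
    return [value / len(usable) for value in combined]


# Hypothetical hue-like skin samples from two different people.
person_a = count_histogram([20, 22, 25, 21])
person_b = count_histogram([40, 42, 20, 41])

combined = accumulate([person_a, person_b])
print(combined)
```

The combined histogram gives weight to the bins of both people, so a back projection built from it would respond to either skin sample. Whether that also lets in more background colors is something that would need experimentation.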

Summary of Controls

All of the controls are toggles.

"g" - grab screen shot for histogram
"h" - head tracking
"b" - blur
"e" - erode (before dilate)
"d" - dilate
"r" - eRode (after dilate)

Resources and references

Bradski, Gary and Adrian Kaehler. "Learning OpenCV: Computer Vision with the OpenCV Library". O'Reilly, 2008.

"OpenCV 2.1 C++ Reference". http://opencv.willowgarage.com/documentation/cpp/index.html

Birchfield, Stan. "Elliptical Head Tracking Using Intensity Gradients and Color Histograms". Stanford, 1998.

Blonski, Brian. "The Use of Contextual Clues in Reducing False Positives in an Efficient Vision-Based Head Gesture Recognition System". Cal Poly, 2010.