Learning Parameters in Video and Image Processing Systems
Virtually all video and image processing systems require a number of parameters
to be defined: for example, a threshold value to convert a gray-scale image into a
binary image, the size of a filter kernel, or the minimum and maximum allowed
values of a feature. Defining suitable values for these parameters is a crucial task
that every designer/programmer faces. This appendix provides a guideline for
aiding the designer in this task.
In general, on-line video-based systems and off-line image-based systems differ
considerably. When you have to process a single image off-line, you can try different
parameters in your algorithms until you achieve the desired output. When you are
processing on-line video data, however, you do not know exactly what the images
to be processed look like, and your parameters can therefore not be tuned to a
particular image. So what can be done?
The answer is that we train our system and thereby learn suitable values for the
parameters. By training we mean that we capture images off-line in situations
similar to those in which the system is required to operate. These captured training
images are then analyzed and the parameters derived. Let us look at an example.
Say your task is to segment a human hand in a video sequence. You decide to
solve the problem through the use of HSI colors. That is, you assume that the reddish
skin-color is unique in the image and by thresholding the Hue and Saturation values
you can find skin-pixels and hence the hand. The algorithm will be similar to 4.14.
The question is now: how do you define the four threshold values T_Sat_min,
T_Sat_max, T_Hue_min, and T_Hue_max?
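The thresholding step itself can be sketched in a few lines of Python. The pixel values and threshold values below are illustrative assumptions chosen for the sketch, not values from the text; a real system would apply this test to every pixel of an HSI-converted image.

```python
# Sketch: classify (Hue, Saturation) pixels as skin if they fall within
# all four learned thresholds. Values here are hypothetical examples.

def segment_skin(hs_pixels, t_hue_min, t_hue_max, t_sat_min, t_sat_max):
    """Return a binary mask: True where a (hue, sat) pixel lies within
    the learned Hue and Saturation thresholds, i.e. is labeled skin."""
    return [
        t_hue_min <= h <= t_hue_max and t_sat_min <= s <= t_sat_max
        for (h, s) in hs_pixels
    ]

# Hypothetical (hue, saturation) pairs for five pixels.
pixels = [(10, 120), (25, 200), (90, 60), (15, 150), (200, 30)]
mask = segment_skin(pixels, t_hue_min=5, t_hue_max=30,
                    t_sat_min=100, t_sat_max=255)
print(mask)  # → [True, True, False, True, False]
```

Note that a pixel must pass both the Hue test and the Saturation test to be labeled skin; thresholding each channel independently and combining the results with a logical AND is what makes the four values act as a rectangular decision region in Hue-Saturation space.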
These four threshold values represent the range of values a skin-pixel can take.
So, if we capture 100 representative training images of a hand and look at their Hue
and Saturation values, then we can obtain guidance for the choice of threshold values.
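One simple way to derive candidate thresholds from such training data can be sketched as follows; the training samples below are hypothetical (hue, saturation) pairs, not measured values.

```python
# Sketch: learn candidate thresholds as the minimum and maximum Hue and
# Saturation observed over all training skin-pixels. Sample values are
# hypothetical.

def learn_minmax_thresholds(training_pixels):
    """training_pixels: list of (hue, sat) tuples taken from skin regions
    of the training images. Returns (hue_min, hue_max, sat_min, sat_max)."""
    hues = [h for h, _ in training_pixels]
    sats = [s for _, s in training_pixels]
    return min(hues), max(hues), min(sats), max(sats)

training = [(8, 110), (12, 140), (22, 180), (18, 130)]
print(learn_minmax_thresholds(training))  # → (8, 22, 110, 180)
```

As the text goes on to discuss, these raw minimum/maximum values segment every training pixel correctly but tend to be too permissive on new images, which motivates inspecting the histograms of the training values rather than using the extremes directly.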
One could simply find the minimum and maximum values of Hue and Saturation
and use these as the threshold values. This will give a perfect segmentation of all
training pixels. However, this is likely to also result in some non-skin pixels being
segmented. In Fig. C.1(a) a fictive histogram of all the 100 Hue values from the
training images is shown. It is evident that the minimum and maximum values are