An Appearance-Based Prior for Hand Tracking - Advanced Concepts for Intelligent Vision Systems

image areas. During testing, stage $i$ is passed successfully if the weighted sum $\beta_i$ reaches or exceeds a stage threshold $t_i$:

$$\beta_i = \sum_{j=1}^{M_i} \alpha_{ij}\, h_{ij}(\cdot) \ge t_i \qquad (1)$$

All components including weak classifiers, weights and thresholds are learned during

the training stage.
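As a minimal sketch of the stage test in Eq. (1), the following Python computes the weighted sum of one stage and compares it with the stage threshold. The function names and the example weights are hypothetical, and the weak-classifier outputs are assumed to be already evaluated on the image area; this is an illustration, not the detector's actual implementation.

```python
from typing import Sequence


def stage_sum(alphas: Sequence[float], weak_outputs: Sequence[float]) -> float:
    """Weighted sum beta_i of one stage: sum over j of alpha_ij * h_ij."""
    if len(alphas) != len(weak_outputs):
        raise ValueError("one weight per weak classifier is required")
    return sum(a * h for a, h in zip(alphas, weak_outputs))


def stage_passes(alphas: Sequence[float],
                 weak_outputs: Sequence[float],
                 threshold: float) -> bool:
    """Eq. (1): the stage is passed if beta_i >= t_i."""
    return stage_sum(alphas, weak_outputs) >= threshold


# Hypothetical stage with three weak classifiers (weights may be negative).
example_alphas = [0.5, -0.2, 0.8]
example_outputs = [1.0, 1.0, 0.0]  # assumed binary weak-classifier responses
example_beta = stage_sum(example_alphas, example_outputs)
```

With these assumed values the sum is about 0.3, so the stage passes for a threshold of 0.25 and fails for a threshold of 0.5.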

A detection occurs when an image area passes all $N$ stages. For our method, we also consider incomplete detections, that is, when the image area only passes $s$ stages and gets rejected by stage $s + 1$. We calculate a score $o_i(x, y)$ for an image area of scale $i$, centered at pixel $(x, y)$, as follows. A completely successful detection has passed all $N$ stages, and hence is assigned the score $o_i(x, y) = N/N$. A partially successful detection has passed $s$ stages, $s \in \{0, \ldots, N - 1\}$, and is assigned the score $(s + k)/N$.

Without $k$, the score is proportional to the number of passed stages. To smooth this step function, $k$ is set to the degree of success within a stage, in the range from zero to exclusive one, $k \in [0; 1)$.

Considering only one stage, $k$ is ideally set proportional to the sum of weights below the threshold $t_i$:

$$k = \frac{\beta_i - \beta_{\min}}{t_i - \beta_{\min}}, \quad \text{where } \beta_{\min} = \min_{A} \sum_{j \in A} \alpha_j\, h_j(\cdot) \qquad (2)$$

for any subset $A$ of weights. Note that the weights $\alpha_j$ can be positive or negative and that the minimum achievable sum $\beta_{\min}$ need not be zero. We avoid computing all combinations of weights to find $\beta_{\min}$ and, instead, set it to a fixed value and ensure $k \ge 0$. This has worked well in practice without negative impact on the generated probability image.
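The scoring of complete and incomplete detections can be sketched as follows. This is an illustration under stated assumptions: the function name `detection_score` is hypothetical, the stage sums are assumed to be supplied as a list with one entry per stage (entries after the first rejection are ignored), and a single fixed `beta_min` is assumed for all stages, in the spirit of the fixed value mentioned above.

```python
from typing import Sequence


def detection_score(stage_sums: Sequence[float],
                    thresholds: Sequence[float],
                    beta_min: float) -> float:
    """Score in [0, 1]: N/N for a full detection, (s + k)/N otherwise."""
    if len(stage_sums) != len(thresholds):
        raise ValueError("one stage sum per stage threshold is required")
    n_stages = len(thresholds)
    for s, (beta, t) in enumerate(zip(stage_sums, thresholds)):
        if beta >= t:
            continue  # stage passed (Eq. 1), move on to the next stage
        if t <= beta_min:
            raise ValueError("beta_min must lie below the stage threshold")
        # Degree of success within the rejecting stage (Eq. 2). Because
        # beta < t here, k < 1; clamping at zero ensures k >= 0 when the
        # fixed beta_min is larger than the actual stage sum.
        k = max(0.0, (beta - beta_min) / (t - beta_min))
        return (s + k) / n_stages
    return 1.0  # all N stages passed


# Hypothetical 4-stage cascade: two stages pass, the third rejects halfway.
example_score = detection_score([1.5, 1.2, 0.0, 9.9], [1.0] * 4, beta_min=-1.0)
```

In this assumed example, $s = 2$ and $k = 0.5$, so the score is $2.5/4 = 0.625$.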

3.2 Formal Justification of Prior

For this score to reflect the probability of a hand, care has to be taken during training to provide the AdaBoost algorithm with a representative set of negative training images per stage. If this set is too uniform, then the resulting stage will not proportionally dismiss a more diverse set of negative test areas. In other words, if the first few stages do not typically discard test areas at the same rate as later stages, then the score obtained from the first few stages will be artificially inflated. We trained a Viola-Jones-based detector on hands in arbitrary postures and varied the negative training set to avoid such artifacts. This allows us to obtain an appearance-based, posture-independent score for how well an area's appearance can be attributed to a hand.
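One way to check for the inflation described above is to measure how often each stage rejects the negative test areas that reach it. The sketch below is only a diagnostic illustration: the function name is hypothetical, and reading "the same rate" as the conditional rejection rate per stage is an assumption, not something the text specifies.

```python
from collections import Counter
from typing import List, Sequence


def stage_rejection_rates(stages_passed: Sequence[int], n_stages: int) -> List[float]:
    """For each stage, the fraction of areas reaching it that it rejects.

    stages_passed[a] is the number of stages s that negative area a passed
    before being rejected; a value of n_stages means it passed every stage.
    """
    counts = Counter(stages_passed)
    reached = len(stages_passed)  # every area reaches the first stage
    rates = []
    for s in range(n_stages):
        rejected = counts.get(s, 0)  # passed s stages, rejected by the next one
        rates.append(rejected / reached if reached else 0.0)
        reached -= rejected
    return rates


# Hypothetical counts for eight negative areas and a 3-stage cascade.
example_rates = stage_rejection_rates([0, 0, 0, 0, 1, 1, 2, 3], n_stages=3)
```

Roughly equal rates across stages, as in this made-up example where each stage rejects half of what it sees, would suggest that early stages are not inflating the score; strongly uneven rates would point to an unrepresentative negative set for some stage.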

To aid in placing tracked LK-features, it is desirable to know a probability instead of a score, to know this per pixel instead of per scanned area, and to account for areas scanned at the same center but at multiple scales. The next subsection details how the scores obtained from incomplete detections are integrated over scale and space to yield the prior probability.
