task, defined by the set of hidden states, the non-stationary (time-dependent)
transition probabilities, the probabilities of observing the visible state given a
certain hidden state, and the initial probabilities of each state. A set of reference
trajectories is assigned to each hidden state based on the associated working action.
Our approach relies on a small number of reference trajectories which are defined by
manually labelled training sequences.
Besides an estimate of the class to which the observed trajectory belongs, the
trajectory classifier also yields the phase of the working action, i.e. the fraction by
which it has been completed. While a working action is performed, the estimated
phase governs the non-stationary transition probabilities, which are assumed to
increase with increasing phase.
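The dependence of the transition probabilities on the phase can be illustrated with a minimal sketch. The text only states that the probabilities are assumed to increase with the phase; the linear ramp, the bounds `p_min` and `p_max`, and the function names below are assumptions made for illustration, not the functional form used by the system.

```python
def leave_probability(phase, p_min=0.05, p_max=0.95):
    """Probability of leaving the current hidden state, given the phase.

    Hypothetical linear ramp: the source only says the probability
    increases with the phase; p_min and p_max are illustrative values.
    """
    phase = min(max(phase, 0.0), 1.0)  # phase is a fraction in [0, 1]
    return p_min + (p_max - p_min) * phase


def transition_row(phase, n_states, current, successor):
    """One row of a time-dependent transition matrix.

    Probability mass is split between staying in `current` and moving
    to `successor`; all other hidden states receive zero probability.
    """
    if current == successor:
        raise ValueError("successor must differ from the current state")
    row = [0.0] * n_states
    p = leave_probability(phase)
    row[successor] = p
    row[current] = 1.0 - p
    return row
```

Early in an action (phase near 0) the row keeps almost all probability on the current state, whereas close to completion (phase near 1) most of it shifts to the successor state.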
The evaluation of the action recognition stage relies on 20 trinocular real-world
test sequences acquired from different viewpoints. The sequences contain working
actions performed by eight different test persons in front of a complex cluttered
working environment. The distance of the test persons to the camera system
amounts to 2.2-3.3 m. For training, only two sequences in which the working actions
are performed by two different well-trained individuals (teachers) were used.
The teacher-based approach is motivated by the application scenario, in which workers
are generally trained by only a few experts. Ground truth labels were assigned
manually to all images of the training and test sequences.
It is demonstrated by Hahn et al. (2009, 2010b) that the system achieves an average
action recognition rate of more than 90 % on the test sequences independent
of whether the combination of MOCCD and shape flow algorithm or the mean-shift
tracking approach is used. The average word error rate, which is defined as the sum
of insertions, deletions, and substitutions, divided by the total number of test
patterns, amounts to less than 10 %. Beyond the recognition of working actions, the
system is able to recognise disturbances, which occur e.g. when the worker interrupts
the sequence of working actions by blowing his nose. The system then enters
the safety mode and returns to the regular mode as soon as the working actions
continue. On average, the system recognises the working actions with a temporal
offset of several tenths of a second when compared to the manually defined beginning
of an action (which is not necessarily provided at high accuracy), where the
standard deviations are always larger than or comparable to the mean values.
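The word error rate defined above can be sketched as follows. The sketch assumes that the insertions, deletions, and substitutions are counted along a minimum-cost (Levenshtein) alignment of the recognised action sequence with the ground truth sequence, and that the total number of test patterns is the length of the ground truth sequence; the function name and the action labels in the usage example are hypothetical.

```python
def word_error_rate(reference, recognised):
    """(insertions + deletions + substitutions) / number of reference patterns.

    The error count is the Levenshtein distance between the two label
    sequences, i.e. the smallest number of edits that aligns them.
    """
    n, m = len(reference), len(recognised)
    if n == 0:
        raise ValueError("reference sequence must not be empty")

    # d[i][j]: minimum number of edits turning reference[:i] into recognised[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # i deletions
    for j in range(m + 1):
        d[0][j] = j  # j insertions

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 0 if reference[i - 1] == recognised[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,                 # deletion
                d[i][j - 1] + 1,                 # insertion
                d[i - 1][j - 1] + substitution,  # match or substitution
            )
    return d[n][m] / n


# Hypothetical action labels: one action of four is missed (one deletion).
truth = ["screw", "clean", "plug", "screw"]
output = ["screw", "plug", "screw"]
rate = word_error_rate(truth, output)
```

In this usage example one of four ground truth actions is missing from the recognised sequence, so `rate` evaluates to 0.25.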