complex motion patterns, including space-time corners. The response func-
tion is given by
$$R = (I * g * h_{even})^2 + (I * g * h_{odd})^2 \qquad (1)$$
where $*$ denotes the convolution operation, $g(x, y; \sigma)$ is the 2D
Gaussian kernel, applied only along the spatial dimensions, and $h_{even}$
and $h_{odd}$ are a quadrature pair of 1D Gabor filters applied only
temporally. They are defined as
$$h_{even}(t; \tau, \omega) = -\cos(2\pi t \omega)\, e^{-t^2/\tau^2}$$
and
$$h_{odd}(t; \tau, \omega) = -\sin(2\pi t \omega)\, e^{-t^2/\tau^2},$$
with $\omega = 4/\tau$ as suggested by the author.
For more implementation details, please refer to [6] as the feature detection
part is beyond the scope of this chapter.
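As a rough illustration of the response function in Eq. (1), the following
NumPy sketch smooths each frame with a separable Gaussian and then filters
along time with the quadrature pair of Gabor filters. It is not the
implementation of [6]: the function names, the kernel truncation radii, the
default values of sigma and tau, and the sampling of t at integer frame
indices are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian; the 2D spatial kernel is applied separably."""
    radius = int(np.ceil(3 * sigma))  # truncation radius: an assumption
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()


def gabor_pair(tau, omega=None):
    """Quadrature pair of 1D temporal Gabor filters (h_even, h_odd)."""
    if omega is None:
        omega = 4.0 / tau  # relation between omega and tau used in the text
    radius = int(np.ceil(2 * tau))  # truncation radius: an assumption
    t = np.arange(-radius, radius + 1)  # integer frame offsets: an assumption
    envelope = np.exp(-t**2 / tau**2)
    h_even = -np.cos(2 * np.pi * t * omega) * envelope
    h_odd = -np.sin(2 * np.pi * t * omega) * envelope
    return h_even, h_odd


def convolve_axis(volume, kernel, axis):
    """1D convolution along one axis; the axis must be at least kernel-sized."""
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="same"), axis, volume
    )


def response(video, sigma=2.0, tau=2.5):
    """video: float array of shape (T, H, W). Returns R with the same shape."""
    g = gaussian_kernel_1d(sigma)
    # Spatial smoothing only (axes 1 and 2), never along time.
    smoothed = convolve_axis(convolve_axis(video, g, axis=1), g, axis=2)
    h_even, h_odd = gabor_pair(tau)
    # Temporal filtering only (axis 0).
    even = convolve_axis(smoothed, h_even, axis=0)
    odd = convolve_axis(smoothed, h_odd, axis=0)
    return even**2 + odd**2
```

Interest points would then be taken at local maxima of R, around which the
cuboids are extracted; that step is left out here.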
2.3 Feature Description
Once the cuboid is extracted, it is described using the LBP-TOP descriptor,
which is an extension of the LBP operator into the temporal domain. LBP was
originally proposed for texture analysis and classification [14]. More
recently, it has been applied to face recognition [1] and facial expression
recognition [2, 17]. While the original LBP was designed only for static
images, LBP-TOP has been used for dynamic textures and facial expression
recognition [22]. A video sequence can be seen not only as the usual stack
of XY planes along the temporal axis, but also as a stack of YT planes along
the X axis and as a stack of XT planes along the Y axis; we show that a
cuboid can be successfully described with LBP-TOP for action recognition
purposes.
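The idea of reading a cuboid as three stacks of orthogonal planes can be
sketched as follows. This is a simplified illustration rather than the exact
descriptor used in the chapter: it assumes the basic 8-neighbour LBP with
radius 1 and a "greater or equal" threshold, accumulates one 256-bin
histogram per plane orientation, normalizes each, and concatenates them. The
function names and the (T, Y, X) array layout are choices made for this
example.

```python
import numpy as np


def lbp_codes(plane):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2D array."""
    h, w = plane.shape
    center = plane[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), visited in a fixed circular order.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = plane[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Threshold each neighbour against the center and set one bit.
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes


def lbp_top(cuboid):
    """cuboid: array of shape (T, Y, X), each dimension at least 3.

    Returns the concatenation of three normalized 256-bin histograms,
    one per plane orientation (XY, XT, YT).
    """
    T, Y, X = cuboid.shape
    plane_sets = (
        [cuboid[t, :, :] for t in range(T)],  # XY planes stacked along T
        [cuboid[:, y, :] for y in range(Y)],  # XT planes stacked along Y
        [cuboid[:, :, x] for x in range(X)],  # YT planes stacked along X
    )
    hists = []
    for planes in plane_sets:
        hist = np.zeros(256)
        for p in planes:
            hist += np.bincount(lbp_codes(p).ravel(), minlength=256)
        hists.append(hist / hist.sum())
    return np.concatenate(hists)
```

Under these assumptions a cuboid is summarized by a 768-dimensional vector,
regardless of its size.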
2.4 Classification
Each video sequence is described by a histogram of space-time word
occurrences, which represents its signature. The dimension of the signature
is equal to the size of the codebook, and the histogram of each video is
given as input to the classifier (see Fig. 1). We chose nonlinear Support
Vector Machines (SVM) with an RBF kernel, and the library libSVM [ ? ] was
adopted. The best parameters C and γ were chosen by 5-fold cross-validation
with a grid search on the training data, and a one-against-one approach was
used for multi-class classification.
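A minimal sketch of how such a signature could be computed is given below.
It assumes that each cuboid descriptor is assigned to its nearest codebook
entry under the Euclidean distance and that the histogram is normalized to
sum to one; neither choice is stated in this section, and the function name
is hypothetical.

```python
import numpy as np


def video_signature(descriptors, codebook):
    """Histogram of space-time word occurrences for one video.

    descriptors: array (n, d), one row per cuboid descriptor.
    codebook:    array (k, d), one row per space-time word.
    Returns a length-k histogram that sums to 1.
    """
    # Squared Euclidean distance from every descriptor to every word.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Nearest-word assignment (an assumption of this sketch).
    words = dist2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

The length of the result equals the codebook size, so every video yields a
vector of the same dimension, whatever the number of cuboids detected in it.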
3 Feature Description: (CS)LBP-TOP and Its
Extensions
The original Local Binary Pattern (LBP) operator was introduced by Ojala
et al. [13] and has proved to be a powerful texture descriptor. In the original
version, the operator labels the pixels of an image by thresholding a 3×3
neighborhood region of each pixel with the center value and considering the
results as a binary number. The resulting 256-bin histogram of the computed
 