Digital Signal Processing Reference
In-Depth Information
document. These segments captured the spatial relationships among visual words.
Good segments were sifted from bad ones for each discovered object class.
Verbeek et al. [52] proposed two aspect-based spatial field models by combining pLSA/LDA with Markov Random Fields (MRF). One is based on averaging over forests of minimal spanning trees linking neighboring image regions. A tree-structured prior is imposed on the object class labels $Z_j = \{z_{ji}\}$ of the image patches in image $j$,
$$P(Z_j) \propto \exp\Big( \sum_i \psi(z_{ji}, z_{j\chi(i)}) + \log \theta_j \Big), \tag{3.11}$$
where $\chi(i)$ is the unique parent of patch $i$ in the tree, and $\psi(z_{ji}, z_{j\chi(i)})$ is a pair-wise potential,
$$\psi(z_{ji}, z_{j\chi(i)}) = \rho \, [z_{ji} = z_{j\chi(i)}]. \tag{3.12}$$
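As a rough illustration of (3.11) and (3.12), the sketch below evaluates the unnormalized log prior of a labeling on a small, hand-made tree. The function name `tree_log_prior`, the toy tree, and the values of `rho` and `log_theta` are all made up for this example. It also assumes that the $\log \theta_j$ term is applied per patch as the log proportion of that patch's class in the image; the equation as extracted does not show this subscript, so this is one possible reading rather than the formulation of [52].

```python
import math


def tree_log_prior(labels, parent, log_theta, rho):
    """Unnormalized log P(Z_j) for a tree-structured prior, as in (3.11).

    labels[i]    : class label z_ji of patch i
    parent[i]    : index chi(i) of the parent of patch i, or None for the root
    log_theta[k] : log class proportion of class k in this image
                   (ASSUMPTION: used as a per-patch term)
    rho          : value of the pairwise potential when a patch and its
                   parent share the same label, as in (3.12)
    """
    total = 0.0
    for i, z in enumerate(labels):
        p = parent[i]
        # Pairwise potential psi = rho * [z_ji == z_j,chi(i)]; the root has no parent.
        if p is not None and z == labels[p]:
            total += rho
        # Assumed per-patch class-proportion term.
        total += log_theta[z]
    return total


# Hypothetical image with 4 patches: patch 0 is the root,
# patches 1 and 2 are children of 0, patch 3 is a child of 1.
parent = [None, 0, 0, 1]
log_theta = [math.log(0.7), math.log(0.3)]

smooth = tree_log_prior([0, 0, 0, 0], parent, log_theta, rho=1.0)
mixed = tree_log_prior([0, 1, 0, 1], parent, log_theta, rho=1.0)
print(smooth > mixed)  # True: the labeling that agrees along tree edges scores higher
```

With a positive `rho`, labelings in which patches agree with their tree parents receive a higher prior score, which is the smoothing effect the tree prior is meant to provide.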
The other model applies an efficient chain-based Expectation Propagation method for regular 8-neighbor Markov Random Fields. The prior over $Z_j$ is given by
$$P(Z_j) \propto \exp\Big( \sum_{i \sim i'} \psi(z_{ji}, z_{ji'}) + \log \theta_j \Big), \tag{3.13}$$
where $i \sim i'$ enumerates spatially neighboring patches $i$, $i'$ in image $j$. The MRF captures the local spatial dependence of image patches. These two models were trained using either patch-level labels or image-level labels. Tested on 240 images of nine object categories from the MSRC data set, they achieved an object segmentation accuracy of 80.2% when trained using patch-level labels and 78.1% when trained using image-level labels. The accuracies of pLSA were 78.5% and 74.0%, respectively, under these two settings. A similar idea was also explored in [58], and a Dirichlet process mixture was introduced to automatically learn the number of object classes from data. This framework was extended to a Conditional Random Field (CRF) [4] to integrate both local and global features in the images [53, 59].
Sudderth et al. [60] proposed a Transformed Dirichlet Process (TDP) model to jointly solve the problems of scene classification and object segmentation. This approach coupled topic models with spatial transformations and consistently accounted for geometric constraints. The spatial relationships of different parts of objects were explicitly modeled under a hierarchical Bayesian model. Cao et al. [61] proposed a Spatially Coherent Latent Topic Model (Spatial-LTM) to simultaneously classify scene categories and segment objects. It oversegmented images into regions of coherent latent topic model and coherent latent topic model was considered as visual words. It enforced the spatial coherency of the model by requiring that only a single latent topic was assigned to the image patches within each region.
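The one-topic-per-region constraint described for Spatial-LTM can be illustrated with a small sketch. It is not the inference procedure of [61]: it simply picks, for each region, the topic with the highest summed patch log-likelihood and copies that topic to every patch in the region. The function name `assign_region_topics`, the region ids, and the log-likelihood values are hypothetical.

```python
from collections import defaultdict


def assign_region_topics(region_of_patch, patch_log_lik):
    """Assign a single topic to each region and propagate it to its patches.

    region_of_patch[i]  : region id of patch i (from some over-segmentation)
    patch_log_lik[i][k] : hypothetical log-likelihood of patch i under topic k
    Returns a list holding the topic assigned to each patch.
    """
    num_topics = len(patch_log_lik[0])

    # Accumulate per-topic log-likelihoods over all patches of each region.
    region_score = defaultdict(lambda: [0.0] * num_topics)
    for region, log_lik in zip(region_of_patch, patch_log_lik):
        for k in range(num_topics):
            region_score[region][k] += log_lik[k]

    # One topic per region: the topic with the highest summed score.
    region_topic = {
        region: max(range(num_topics), key=scores.__getitem__)
        for region, scores in region_score.items()
    }

    # Every patch inherits the topic of the region it belongs to.
    return [region_topic[region] for region in region_of_patch]


# Five patches in two regions, two topics (made-up numbers).
region_of_patch = [0, 0, 0, 1, 1]
patch_log_lik = [
    [-1.0, -2.0],
    [-1.5, -1.0],  # on its own, this patch would prefer topic 1
    [-0.5, -3.0],
    [-2.0, -0.5],
    [-1.0, -1.2],  # on its own, this patch would prefer topic 0
]

topics = assign_region_topics(region_of_patch, patch_log_lik)
print(topics)  # [0, 0, 0, 1, 1]
```

In this toy input, two patches would choose a different topic if labeled independently, but the region-level constraint overrides them, so the resulting labeling is constant within each region.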