Fig. 4.3 Illustration of the user-indicated "O" query and the computation of its principal components. (μ_x, μ_y) is the center of the "O" query, (x_o, y_o) is a pixel on the "O" boundary, and (x_q, y_q) is a query pixel.
Figure 4.3 shows the computation of principal components from the “O” query.
Once the principal components are identified, an image contextual model for mobile
visual search is used to identify the object of interest indicated by the user.
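As an illustrative sketch (not the book's implementation), the principal components of an "O" query can be computed from its boundary pixels with NumPy: center the points on their mean (μ_x, μ_y) and take the eigenvectors of their covariance. The boundary points below are synthetic stand-ins for the user's gesture trace.

```python
import numpy as np

# Hypothetical "O" boundary pixels (x_o, y_o); in practice these come
# from the user's gesture trace on the touch screen.
boundary = np.array([[np.cos(t) * 40 + 5 * np.sin(3 * t),
                      np.sin(t) * 20]
                     for t in np.linspace(0, 2 * np.pi, 100)])

# The center (mu_x, mu_y) of the "O" query is the mean of the boundary pixels.
center = boundary.mean(axis=0)

# Principal components: eigenvectors of the covariance of the centered points.
cov = np.cov((boundary - center).T)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# The major axis of the "O" is the eigenvector with the largest eigenvalue.
major_axis = eigvecs[:, np.argmax(eigvals)]
```

For the elongated synthetic "O" above, the major axis aligns with the wider (x) direction, which is how the query's dominant orientation and extent are recovered.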
The following two sections will introduce the algorithms used in the context-
aware visual search. Section 4.2.4 presents the context within the query image itself
using the BoW model. Section 4.2.5 discusses the context of searched images, by
considering their tagged GPS information and relationship to the user's current
location.
4.2.4 Context-Aware Visual Search Using the BoW Model
The visual intent recognition method is based on a retrieval scheme using the
BoW model with the vocabulary tree proposed by Nister et al. [ 126 ]. This method
provides a fast and scalable search mechanism and is suitable for large-scale and
expansible databases because of its hierarchical tree-structured indexing. Such a
method is adopted in the mobile domain because the "O" gesture naturally provides
a focused object selection for better recognition. Unlike [ 126 ], which uses the
entire image as the visual query, we have a user-indicated ROI from the "O"
gesture (called the "O-query"). We design a novel context-aware visual search method
in which a CVT is built to take the pixels surrounding the O-query into
consideration. The CVT algorithm focuses on first building a visual-words codebook
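To make the vocabulary-tree idea concrete, here is a minimal sketch of hierarchical k-means indexing in the spirit of Nister et al. [ 126 ]: descriptors are clustered recursively, and a query descriptor is quantized to a visual word by descending the tree. The function names and the tiny k-means routine are illustrative assumptions, not the authors' code.

```python
import numpy as np

def kmeans(X, k, iters=10, rng=None):
    """Tiny illustrative k-means (not production quality)."""
    rng = rng or np.random.default_rng(0)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center, then update the centers.
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def build_tree(X, branch=3, depth=2, rng=None):
    """Hierarchical k-means: each node splits its points into `branch` children."""
    if depth == 0 or len(X) < branch:
        return None  # leaf node
    centers, labels = kmeans(X, branch, rng=rng)
    children = [build_tree(X[labels == j], branch, depth - 1, rng)
                for j in range(branch)]
    return {"centers": centers, "children": children}

def quantize(tree, d, path=()):
    """Descend the tree toward the nearest center; the leaf path is the word id."""
    if tree is None:
        return path
    j = int(np.argmin(((tree["centers"] - d) ** 2).sum(-1)))
    return quantize(tree["children"][j], d, path + (j,))

# Usage: quantize one toy descriptor against a small tree.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 2))
tree = build_tree(descriptors, branch=3, depth=2, rng=rng)
word = quantize(tree, descriptors[0])
```

Because lookup only compares against `branch` centers per level, quantization cost grows logarithmically with vocabulary size, which is what makes the tree-structured index fast and scalable for large, expansible databases.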