In an online image search, given a query image, we can interpret the descriptor
vectors of the image in the same way as in the indexing procedure, and accumulate
scores for the images in the database with a so-called term frequency-inverse
document frequency (tf-idf) scheme [126]. This tf-idf method is an effective entropy
weighting for indexing a scalable database. Figure 4.4 shows the computation of
image similarity based on the tf-idf scheme. In the vocabulary tree, each leaf node
corresponds to a visualword i, associated with an inverted file (the list of images
containing this visualword i). Note that we only need to consider images d in the
database that share visualwords with the query image q. This significantly reduces
the number of images to be compared with q. The similarity between an
image d and the query q is given by
\[
s(\mathbf{q}, \mathbf{d}) = \|\mathbf{q} - \mathbf{d}\|_2^2
  = \sum_{i \mid d_i = 0} |q_i|^2
  + \sum_{i \mid q_i = 0} |d_i|^2
  + \sum_{i \mid q_i \neq 0,\, d_i \neq 0} |q_i - d_i|^2
  \qquad (4.2)
\]
where the vectors q and d denote the tf-idf feature vectors of the query q and the
image d in the database, which consist of the individual elements q_i and d_i,
respectively (i denotes the i-th visualword in the vocabulary tree). q_i and d_i are
the tf-idf values for the i-th visualword in the query and the image, respectively.
They are given by
\[ q_i = tf_{iq} \cdot idf_i, \qquad (4.3) \]
\[ d_i = tf_{id} \cdot idf_i. \qquad (4.4) \]
In the above equations, the inverse document frequency idf_i is formulated as
ln(N / N_i), where N is the total number of images in the database, and N_i is the
number of images with the visualword i (i.e., the images whose descriptors are
classified into the leaf node i).
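The weighting and scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the function names and the toy database are hypothetical, the per-image visualword counts are taken as given, and Eq. (4.2) is read as the squared L2 distance between sparse tf-idf vectors, so a smaller score indicates a closer match.

```python
import math

def idf_weights(db_counts):
    """idf_i = ln(N / N_i), with N_i the number of images containing visualword i."""
    n_images = len(db_counts)
    doc_freq = {}
    for counts in db_counts.values():
        for word in counts:
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n_images / n_i) for word, n_i in doc_freq.items()}

def tfidf_vector(counts, idf):
    """Sparse tf-idf vector (Eqs. 4.3 and 4.4); visualwords unseen in the database are skipped."""
    return {w: tf * idf[w] for w, tf in counts.items() if w in idf}

def similarity(q, d):
    """Eq. 4.2 evaluated as three sums over sparse vectors."""
    only_q = sum(v * v for w, v in q.items() if w not in d)  # terms with d_i = 0
    only_d = sum(v * v for w, v in d.items() if w not in q)  # terms with q_i = 0
    shared = sum((q[w] - d[w]) ** 2 for w in q if w in d)    # terms present in both
    return only_q + only_d + shared

# Hypothetical toy database: image id -> {visualword id: count}
db_counts = {
    "img_a": {0: 2, 1: 1},
    "img_b": {1: 3, 2: 1},
    "img_c": {2: 2, 3: 4},
}
idf = idf_weights(db_counts)
db_vectors = {name: tfidf_vector(c, idf) for name, c in db_counts.items()}
query = tfidf_vector({0: 1, 1: 1}, idf)
scores = {name: similarity(query, vec) for name, vec in db_vectors.items()}
best = min(scores, key=scores.get)  # smallest distance = best match
```

In this toy setup the query shares both of its visualwords with img_a, so img_a receives the smallest score.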
The term frequency representations tf_{iq} and tf_{id} are computed as the accumulated
counts of the visualword i in the query q and the database image d, respectively.
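As a rough sketch of this counting step, the following Python snippet accumulates visualword counts and uses an inverted file to select only the database images that share a visualword with the query. It assumes each descriptor has already been quantized to a leaf-node id; the function names and the sample ids are hypothetical.

```python
from collections import Counter, defaultdict

def term_frequencies(leaf_ids):
    """Accumulate how often each visualword (leaf-node id) occurs in one image."""
    return Counter(leaf_ids)

def build_inverted_file(db_leaf_ids):
    """Map each visualword to the set of database images that contain it."""
    inverted = defaultdict(set)
    for image_id, leaf_ids in db_leaf_ids.items():
        for word in set(leaf_ids):
            inverted[word].add(image_id)
    return inverted

def candidate_images(query_leaf_ids, inverted):
    """Database images sharing at least one visualword with the query."""
    candidates = set()
    for word in set(query_leaf_ids):
        candidates |= inverted.get(word, set())
    return candidates

# Hypothetical quantization output: one leaf-node id per descriptor.
db_leaf_ids = {
    "img_a": [0, 0, 1],
    "img_b": [1, 1, 1, 2],
    "img_c": [2, 2, 3, 3, 3, 3],
}
inverted = build_inverted_file(db_leaf_ids)
query_leaf_ids = [0, 1, 0]
tf_q = term_frequencies(query_leaf_ids)
candidates = candidate_images(query_leaf_ids, inverted)
```

Here img_c shares no visualword with the query, so it is never visited during scoring.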
One simple means of computing the term frequency is to use the O-query as
the initial query without considering the pixels surrounding the “O”. This process
is equivalent to using “binary” weights for the term frequency tf_{iq}: the weight is 1
inside the “O” and 0 outside it. A more descriptive and accurate computation is to
incorporate the context information (i.e., the pixels surrounding the O-query)
in the vocabulary tree. We design a new representation of the term frequency tf_{iq}
for the O-query. A “soft” weighting scheme is adopted to modulate the term frequency
by incorporating the image context outside the O-query, which was neglected in
the simple binary scheme. When quantizing descriptors in the CVT, the tf_{iq} of the
O-query for a particular query visualword i_q is formulated as: