Mobile Landmark Recognition - Multimedia Database Retrieval: Technology and Applications


$$D_d(I_x, I_q) \approx \sum_{i=1}^{M} \left( h_i^x, h_i^q \right) \tag{5.12}$$

$$h_i^x = \left| \{ s_j^x \mid Q(s_j^x) = v_i \} \right| \tag{5.13}$$

$$h_i^q = \left| \{ s_j^q \mid Q(s_j^q) = v_i \} \right| \tag{5.14}$$

where $Q(s_j) = v_i$ denotes the quantization of descriptor $s_j$ into the codeword $v_i$, and $h_i^x$ denotes the number of local descriptors of image $I_x$ that fall into $v_i$. Therefore, the ranking loss $L$ can be approximated by:

$$L \approx \sum_{x=1} \sum_{i=1}^{M} \left( h_i^x, h_i^q \right) \tag{5.15}$$
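The counting in Eqs. (5.13) and (5.14) can be illustrated with a minimal sketch. It assumes a flat codebook searched by brute-force nearest neighbour under squared Euclidean distance; the function names (`quantize`, `bow_histogram`) and the toy data are hypothetical and are not taken from the text, which may use a different quantizer.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def quantize(descriptor: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index i of the codeword v_i nearest to the descriptor.

    Assumption: Q(s_j) is a plain nearest-codeword search over a flat codebook.
    """
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], descriptor))


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[int]:
    """h_i = number of descriptors s_j with Q(s_j) = v_i."""
    hist = [0] * len(codebook)
    for s in descriptors:
        hist[quantize(s, codebook)] += 1
    return hist


# Toy data: three 2-D codewords and four 2-D descriptors (illustrative only).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
descriptors = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [3.5, 0.2]]

hist = bow_histogram(descriptors, codebook)
print(hist)  # [1, 2, 1]
```

Every descriptor contributes exactly one count here, which is the "equally important" assumption built into the unweighted BoW component.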

The BoW component in Eq. (5.13) [or Eq. (5.14)] estimates the distribution of codewords in an image by assuming each descriptor to be equally important. However, the foreground object in an image is usually more important than the cluttered background, so descriptors from the two regions should be assigned different weights. Therefore, saliency weighting is used in the calculation of the BoW components, as follows:

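$$h_i = \left| \{ s_j \mid Q(s_j) = v_i \} \right| \tag{5.16}$$

$$h_i = \frac{\sum_{j=1}^{N_d} w_j \, I\!\left( i = \arg\min_{v \in \mathcal{T}} \lVert v - s_j \rVert \right)}{\sum_{j=1}^{N_d} w_j} \tag{5.17}$$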

where $w = \{ w_1, \ldots, w_j, \ldots, w_{N_d} \}$ is the set of saliency weights of a landmark image, $I(\cdot)$ is the indicator function, and $N_d$ is the number of local descriptors in the image.
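The saliency-weighted component of Eq. (5.17) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the nearest codeword is found by brute force over a flat codebook with squared Euclidean distance, and the names `nearest_codeword` and `weighted_bow` as well as the weights are hypothetical.

```python
from typing import List, Sequence


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codeword closest to the descriptor (assumed flat codebook)."""
    def sq_dist(v: Sequence[float]) -> float:
        return sum((vi - si) ** 2 for vi, si in zip(v, descriptor))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def weighted_bow(descriptors: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 codebook: Sequence[Sequence[float]]) -> List[float]:
    """Sum the saliency weight w_j into the bin of each descriptor's nearest
    codeword, then divide every bin by the total weight."""
    if len(descriptors) != len(weights):
        raise ValueError("one saliency weight is required per descriptor")
    total = sum(weights)
    if total <= 0:
        raise ValueError("saliency weights must sum to a positive value")
    hist = [0.0] * len(codebook)
    for s, w in zip(descriptors, weights):
        hist[nearest_codeword(s, codebook)] += w
    return [h / total for h in hist]


# Toy data (illustrative only): the two descriptors near codeword 1 are
# treated as foreground and receive larger saliency weights.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
descriptors = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [3.5, 0.2]]
weights = [0.5, 2.0, 1.0, 0.5]

h = weighted_bow(descriptors, weights, codebook)
uniform = weighted_bow(descriptors, [1.0] * len(descriptors), codebook)
print(h)
print(uniform)
```

With equal weights the result reduces to the normalized unweighted histogram, so the weighting only changes the outcome when descriptors differ in saliency.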

The component $h_i$ corresponds to the Term Frequency (TF) of $v_i$ [162]. Thus, the Inverted Term Frequency (ITF) can also be obtained to make a complete weighting scheme [149]. As a result, the similarity between the query image $I_q$ and the image $I_x$ in the database can be obtained by:

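$$D_d(I_x, I_q) = \left\lVert \frac{h^q \cdot f}{\lVert h^q \cdot f \rVert} - \frac{h^x \cdot f}{\lVert h^x \cdot f \rVert} \right\rVert \tag{5.18}$$

where $h^q = [h_1^q, \ldots, h_M^q]^t$ and $h^x = [h_1^x, \ldots, h_M^x]^t$ are the weighted BoW vectors of the query image $I_q$ and the image $I_x$, respectively. Here, $f$ is the ITF vector [149], calculated as:

$$f = \left[ \log\frac{N}{N_{v_1}}, \ldots, \log\frac{N}{N_{v_i}}, \ldots, \log\frac{N}{N_{v_M}} \right]^t \tag{5.19}$$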
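A small sketch can make Eqs. (5.18) and (5.19) concrete. Several points are assumptions rather than statements from the text: $N$ is read as the number of database images and $N_{v_i}$ as the number of images containing codeword $v_i$, the product of a BoW vector with $f$ is taken element-wise, and the L2 norm is used for both the normalization and the final distance. All function names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def itf_vector(n_images: int, images_per_codeword: Sequence[int]) -> List[float]:
    """f_i = log(N / N_{v_i}); the reading of N and N_{v_i} is an assumption."""
    if any(n_v <= 0 for n_v in images_per_codeword):
        raise ValueError("every codeword must occur in at least one image")
    return [math.log(n_images / n_v) for n_v in images_per_codeword]


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean length of a vector (assumed norm)."""
    return math.sqrt(sum(a * a for a in v))


def normalized_weighted(h: Sequence[float], f: Sequence[float]) -> List[float]:
    """Element-wise product of h and f, scaled to unit L2 length."""
    weighted = [hi * fi for hi, fi in zip(h, f)]
    norm = l2_norm(weighted)
    if norm == 0.0:
        raise ValueError("weighted BoW vector is all zeros")
    return [a / norm for a in weighted]


def bow_distance(h_q: Sequence[float], h_x: Sequence[float],
                 f: Sequence[float]) -> float:
    """Distance between the normalized, ITF-weighted query and database vectors."""
    q = normalized_weighted(h_q, f)
    x = normalized_weighted(h_x, f)
    return l2_norm([a - b for a, b in zip(q, x)])


# Toy setting (illustrative only): 4 database images, 3 codewords.
# Codeword 0 occurs in every image, so its ITF weight is log(1) = 0.
f = itf_vector(4, [4, 2, 1])

h_q = [0.125, 0.75, 0.125]      # weighted BoW vector of the query
h_same = [0.125, 0.75, 0.125]   # a database image with identical content
h_other = [0.5, 0.0, 0.5]       # a database image with different content

d_same = bow_distance(h_q, h_same, f)
d_other = bow_distance(h_q, h_other, f)
print(d_same, d_other)
```

A codeword that occurs in every image receives zero weight and therefore has no influence on the ranking, while rarer codewords dominate the comparison.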
