where ∩ denotes set intersection. Note that the relevant image subset may
not be known in some cases, such as Web image retrieval.
The fundamental question for a probabilistic image retrieval system is: what
is the probability that an image is relevant with respect to the query
example?
The event r denotes that the image I is relevant, that is, that P_Q(I) is true.
The question is then answered by estimating the probability of relevance
P(r | I, Q).
According to the Probability Ranking Principle, optimal retrieval is achieved
by ranking the images in order of their probability of relevance. One of the
difficulties of this model is that its underlying assumptions often do not
hold in practical applications. For instance, the probability of relevance is
not known exactly in practice; it has to be estimated from whatever data are
available, and those data are not always accurate.
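The ranking step described by the Probability Ranking Principle can be sketched as follows. This is a minimal illustration only: the function name `rank_by_relevance`, the image identifiers, and the probability values are hypothetical, and the estimates of P(r | I, Q) are simply assumed to be given.

```python
from typing import Dict, List


def rank_by_relevance(prob_relevance: Dict[str, float]) -> List[str]:
    """Return image ids sorted by decreasing estimated P(r | I, Q)."""
    return sorted(
        prob_relevance,
        key=lambda image_id: prob_relevance[image_id],
        reverse=True,
    )


# Hypothetical estimates of P(r | I, Q) for three images and one query.
estimates = {"img_a": 0.20, "img_b": 0.90, "img_c": 0.55}
ranking = rank_by_relevance(estimates)
print(ranking)
```

The quality of such a ranking depends entirely on how accurate the estimated probabilities are, which is the practical difficulty noted above.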
The other model realizes the predicate using similarity. (See Lew [8] for a
comprehensive review of nonmetric similarities.) Relevance is estimated
by similarity. Each image is represented as multiple feature vectors in
high-dimensional feature spaces, and the similarity of images in each feature
space is estimated by a distance in that space. One of the difficulties of this
model is that it is not obvious how to measure the visual similarity of images
from feature vectors. For instance, the common metrics used to measure most of
our physical spaces are not consistent with the perceived similarity of image
content.
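As a rough illustration of the similarity model, the sketch below computes one distance per feature space for a pair of images. The feature names (`color`, `texture`), the vectors, and the choice of the Euclidean distance are all assumptions made for the example; as the text points out, such common metrics need not agree with perceived similarity.

```python
import math
from typing import Dict, Sequence

# One feature vector per feature space, keyed by a (hypothetical) space name.
FeatureSet = Dict[str, Sequence[float]]


def feature_distances(a: FeatureSet, b: FeatureSet) -> Dict[str, float]:
    """Euclidean distance between two images in each feature space."""
    return {name: math.dist(a[name], b[name]) for name in a}


# Hypothetical feature vectors for a query example and a database image.
query = {"color": [0.0, 0.0, 1.0], "texture": [1.0, 2.0]}
image = {"color": [0.0, 3.0, 5.0], "texture": [1.0, 2.0]}

d = feature_distances(query, image)
print(d)
```

Each value in `d` plays the role of one of the per-space distances d_1, ..., d_L; how they are combined into a single measure is a separate design decision that this sketch does not address.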
The similarity is then defined either by the probability of relevance P(r | I, Q)
or by the distance of feature vectors D(d_1, d_2, ..., d_L).
In both cases, the problem is transformed into a ranking problem, by either
the probability of relevance or the similarity. The top-ranked images are the
output of the retrieval. Equation 11.17 can then be expressed as
    C_Q = \{ I_i \in C : S(I_i, Q) > t_s \},        (11.20)
where t_s is a threshold.
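Equation 11.20 amounts to a filter over the image collection: keep every image whose similarity to the query exceeds the threshold. A minimal sketch, assuming the similarity scores have already been computed; the function name `retrieve`, the image identifiers, and the score values are hypothetical.

```python
from typing import Dict, Set


def retrieve(similarity: Dict[str, float], threshold: float) -> Set[str]:
    """Return the ids of images whose similarity to the query exceeds threshold."""
    return {image_id for image_id, s in similarity.items() if s > threshold}


# Hypothetical similarity scores between each image and the query.
scores = {"img_a": 0.9, "img_b": 0.4, "img_c": 0.7}
retrieved = retrieve(scores, threshold=0.5)
print(sorted(retrieved))
```

The comparison is strict, as in the equation, so an image whose score equals the threshold is not retrieved.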
In the case where the relevant image set is unknown, another applicable
performance measure is the retrieval rate, defined as the number of relevant
images in the top n images:
    RR_n = number of relevant images in the top n images.        (11.21)
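The retrieval rate can be computed directly from a ranked result list and a set of relevance judgments. The sketch below takes the definition literally, as a count over the top n images; the function name `retrieval_rate`, the ranking, and the judged-relevant set are hypothetical.

```python
from typing import Iterable, Sequence


def retrieval_rate(ranking: Sequence[str], relevant: Iterable[str], n: int) -> int:
    """Count the relevant images among the top n images of a ranking."""
    relevant_set = set(relevant)
    return sum(1 for image_id in ranking[:n] if image_id in relevant_set)


# Hypothetical ranked output and relevance judgments for one query.
ranking = ["img_b", "img_c", "img_a", "img_d"]
judged_relevant = {"img_b", "img_a"}

rr_3 = retrieval_rate(ranking, judged_relevant, n=3)
print(rr_3)
```

Dividing the returned count by n gives the fraction of the top n images that is relevant, which is convenient when comparing results across different values of n.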
How to determine the set {I_i ∈ C : r_Q(I_i) is true} is a nontrivial problem.
The practical approach in similarity-based image retrieval is to define the
similarity as the visual similarity of the resultant images to the example image
or images. In turn, the visual similarity is measured in practice by a
dissimilarity metric of visual features, which is also conveniently called a distance
 