$$p(\mathbf{x}_q \mid c) = \frac{1}{A} \exp\big( -\| \mathbf{x}_q - \mathbf{x}_c \| \big)$$

where $A$ is the normalization constant, $\mathbf{x}_q$ denotes the feature vector of a query, and $\mathbf{x}_c$ denotes that of a candidate image. When the likelihood is calculated using the L1 norm, the corresponding negative distance function should be substituted into the exponential function, because the similarity is a decreasing function of the distance between features.
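As a concrete illustration, here is a minimal sketch of this likelihood under the L1 norm. The feature vectors and candidate set are hypothetical, and normalizing over the candidate set is an assumed choice of $A$, since the text only states that $A$ is a normalization constant.

```python
import numpy as np

def l1_likelihoods(x_q, candidates):
    """Likelihood of the query under each candidate via
    p(x_q | c) = (1/A) * exp(-||x_q - x_c||_1), with A assumed to
    make the likelihoods sum to one over the candidate set."""
    dists = np.abs(candidates - x_q).sum(axis=1)  # L1 distances
    scores = np.exp(-dists)                       # similarity decreases with distance
    return scores / scores.sum()                  # normalize by A = sum of scores

# Hypothetical 3-D features: one query and three candidate images.
x_q = np.array([0.2, 0.5, 0.1])
candidates = np.array([[0.1, 0.4, 0.2],
                       [0.9, 0.1, 0.7],
                       [0.2, 0.5, 0.1]])
print(l1_likelihoods(x_q, candidates))  # the identical candidate gets the highest mass
```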
2.5.2.2 Using the Support Vector Machine Active Learning (SVMAL) Method
SVM is a powerful tool for pattern recognition because it maximizes the minimum
distance between the decision hyperplane and the training samples so as to minimize
the generalization error. Given training samples $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, where $\mathbf{x}_i \in \mathbb{R}^{P}$ and $y_i \in \{-1, 1\}$ is the ground-truth label of $\mathbf{x}_i$, the optimal hyperplane can be represented as

$$f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b$$

where $K(\mathbf{x}_i, \mathbf{x})$ is the kernel function, $\alpha_i$ is the Lagrangian multiplier, and $b$ is the bias. Due to the sparse-sample problem of relevance feedback learning, the active learning method was introduced into the learning process, whereby the most informative images are shown to the user to request labels, resulting in support vector machine active learning (SVMAL)-CBIR [49]. Since the output of an SVM with respect to a sample is the oriented distance from the sample to the hyperplane, the value can be either positive or negative. Therefore, the exponential function is employed again to convert the value of the discriminant function. When radial basis functions are selected as the kernel, we obtain
$$p(\mathbf{x}_q \mid c) = \frac{1}{A} \exp\big( f_c(\mathbf{x}_q) \big) = \frac{1}{A} \exp\Big( \sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}_q) + b \Big) \qquad (2.85)$$

where $A$ is the normalization constant.
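To make the two steps concrete, the following sketch trains an RBF-kernel SVM, picks the most informative unlabeled images as those closest to the hyperplane (smallest $|f(\mathbf{x})|$), and converts the decision value into a likelihood via the exponential, in the spirit of Eq. (2.85). The use of scikit-learn and the synthetic features are assumptions; reference [49] does not prescribe a particular library, and normalizing over the candidate pool is one assumed choice of $A$.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for image features: +1 = relevant, -1 = irrelevant.
X_labeled = rng.normal(size=(20, 8))
y_labeled = np.where(X_labeled[:, 0] > 0, 1, -1)
X_unlabeled = rng.normal(size=(100, 8))

# RBF-kernel SVM: f(x) = sum_i alpha_i * y_i * K(x_i, x) + b.
svm = SVC(kernel="rbf", gamma="scale").fit(X_labeled, y_labeled)

# Active learning step: the most informative images are those closest
# to the hyperplane, i.e. with the smallest |f(x)|; show them to the user.
f_vals = svm.decision_function(X_unlabeled)
query_idx = np.argsort(np.abs(f_vals))[:5]
print("ask the user to label images:", query_idx)

# Eq. (2.85)-style conversion: the oriented distance f(x_q) becomes a
# likelihood exp(f(x_q)) / A, here normalized over the candidate pool.
scores = np.exp(f_vals)
likelihoods = scores / scores.sum()
```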
2.5.3 Context Model in Long-Term Learning
This part aims at calculating $P(c \mid I)$ in Eq. (2.82), which is the contextual information about $c$ inferred from $I$. Without $I$, the probability mass of $c$ is uniformly distributed over the class ensemble $C$. Due to the statistical dependence across different classes, however, the distribution of $c$ conditional on $I$ will deviate from the uniform distribution once $I$ is available. As a result, the classes that are more strongly correlated with $I$ have higher probabilities than the others do. Since the problem is essentially the estimation of a conditional probability mass function (PMF), a typical train of thought leads to the conventional approach that calculates the conditional probability through $P(c \mid I) = P(c, I) / P(I)$, for which we need a set of training samples belonging to the Cartesian product of $|I| + 1$ $C$'s.
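As a sketch of this conventional counting approach, $P(c \mid I)$ can be estimated from training samples by dividing the count of joint occurrences $(c, I)$ by the count of $I$. The class labels and context sets below are hypothetical placeholders, not data from the text.

```python
from collections import Counter

# Hypothetical training samples: each is (class_label, context), where
# the context I is a frozen set of classes already observed.
samples = [
    ("beach", frozenset({"sea", "sky"})),
    ("beach", frozenset({"sea", "sky"})),
    ("forest", frozenset({"sea", "sky"})),
    ("forest", frozenset({"tree"})),
]

joint = Counter(samples)                  # counts of (c, I) pairs
context = Counter(I for _, I in samples)  # counts of I alone

def p_c_given_I(c, I):
    """P(c | I) = P(c, I) / P(I), estimated by relative frequencies."""
    if context[I] == 0:
        return 0.0  # I never observed: the estimate is undefined
    return joint[(c, I)] / context[I]

print(p_c_given_I("beach", frozenset({"sea", "sky"})))  # 2/3
```

The sketch also makes the data requirement visible: a reliable count is needed for every $(c, I)$ pair in the Cartesian product, which grows quickly with $|I|$.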