$$p(\mathbf{x}_q \mid c) = \frac{1}{A} \exp\big( -\| \mathbf{x}_q - \mathbf{x}_c \| \big)$$

where $A$ is the normalization constant, $\mathbf{x}_q$ denotes the feature vector of a query, and $\mathbf{x}_c$ denotes that of a candidate image. When the likelihood is calculated using the L1 norm, the corresponding negative distance function should be substituted into the exponential function, because the similarity is a decreasing function of the distance between features.
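As a concrete illustration, here is a minimal sketch of this likelihood under the L1 norm. The feature vectors and candidate set are hypothetical, and normalizing over the candidate set is an assumed choice of $A$, since the text only states that $A$ is a normalization constant.

```python
import numpy as np

def l1_likelihoods(x_q, candidates):
    """Likelihood of the query under each candidate via
    p(x_q | c) = (1/A) * exp(-||x_q - x_c||_1), with A assumed to
    make the likelihoods sum to one over the candidate set."""
    dists = np.abs(candidates - x_q).sum(axis=1)  # L1 distances
    scores = np.exp(-dists)                       # similarity decreases with distance
    return scores / scores.sum()                  # normalize by A = sum of scores

# Hypothetical 3-D features: one query and three candidate images.
x_q = np.array([0.2, 0.5, 0.1])
candidates = np.array([[0.1, 0.4, 0.2],
                       [0.9, 0.1, 0.7],
                       [0.2, 0.5, 0.1]])
print(l1_likelihoods(x_q, candidates))  # the identical candidate gets the highest mass
```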
2.5.2.2 Using the Support Vector Machine Active Learning (SVMAL) Method
SVM is a powerful tool for pattern recognition because it maximizes the minimum
distance between the decision hyperplane and the training samples so as to minimize
the generalization error. Given training samples $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, where $\mathbf{x}_i \in \mathbb{R}^{P}$ and $y_i \in \{-1, 1\}$ is the ground-truth label of $\mathbf{x}_i$, the optimal hyperplane can be represented as

$$f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b$$

where $K(\mathbf{x}_i, \mathbf{x})$ is the kernel function, $\alpha_i$ is the Lagrangian multiplier, and $b$ is the bias. Due to the sparse-sample problem of relevance feedback learning, the active learning method was introduced into the learning process, whereby the most informative images are shown to the user to request labels, resulting in support vector machine active learning (SVMAL)-CBIR [49]. Since the output of an SVM with respect to a sample is the oriented distance from the sample to the hyperplane, the value can be either positive or negative. Therefore, the exponential function is employed again to convert the value of the discriminant function. When radial basis functions are selected as the kernel, we obtain
$$p(\mathbf{x}_q \mid c) = \frac{1}{A} \exp\big( f_c(\mathbf{x}_q) \big) = \frac{1}{A} \exp\Big( \sum_{i=1}^{N} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}_q) + b \Big) \qquad (2.85)$$

where $A$ is the normalization constant.
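To make the two steps concrete, the following sketch trains an RBF-kernel SVM, picks the most informative unlabeled images as those closest to the hyperplane (smallest $|f(\mathbf{x})|$), and converts the decision value into a likelihood via the exponential, in the spirit of Eq. (2.85). The use of scikit-learn and the synthetic features are assumptions; reference [49] does not prescribe a particular library, and normalizing over the candidate pool is one assumed choice of $A$.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for image features: +1 = relevant, -1 = irrelevant.
X_labeled = rng.normal(size=(20, 8))
y_labeled = np.where(X_labeled[:, 0] > 0, 1, -1)
X_unlabeled = rng.normal(size=(100, 8))

# RBF-kernel SVM: f(x) = sum_i alpha_i * y_i * K(x_i, x) + b.
svm = SVC(kernel="rbf", gamma="scale").fit(X_labeled, y_labeled)

# Active learning step: the most informative images are those closest
# to the hyperplane, i.e. with the smallest |f(x)|; show them to the user.
f_vals = svm.decision_function(X_unlabeled)
query_idx = np.argsort(np.abs(f_vals))[:5]
print("ask the user to label images:", query_idx)

# Eq. (2.85)-style conversion: the oriented distance f(x_q) becomes a
# likelihood exp(f(x_q)) / A, here normalized over the candidate pool.
scores = np.exp(f_vals)
likelihoods = scores / scores.sum()
```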
2.5.3 Context Model in Long-Term Learning
This part aims at calculating $P(c \mid I)$ in Eq. (2.82), which is the contextual information about $c$ inferred from $I$. Without $I$, the probability mass of $c$ is uniformly distributed over the class ensemble $C$. Due to the statistical dependence across different classes, however, the distribution of $c$ conditional on $I$ will deviate from the uniform distribution once $I$ is available. As a result, the classes that are more strongly correlated with $I$ have higher probabilities than the others do. Since the problem is essentially the estimation of a conditional probability mass function (PMF), a typical train of thought leads to the conventional approach that calculates the conditional probability through $P(c \mid I) = P(c, I) / P(I)$, for which we need a set of training samples belonging to the Cartesian product of $|I| + 1$ $C$'s.
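As a sketch of this conventional counting approach, $P(c \mid I)$ can be estimated from training samples by dividing the count of joint occurrences $(c, I)$ by the count of $I$. The class labels and context sets below are hypothetical placeholders, not data from the text.

```python
from collections import Counter

# Hypothetical training samples: each is (class_label, context), where
# the context I is a frozen set of classes already observed.
samples = [
    ("beach", frozenset({"sea", "sky"})),
    ("beach", frozenset({"sea", "sky"})),
    ("forest", frozenset({"sea", "sky"})),
    ("forest", frozenset({"tree"})),
]

joint = Counter(samples)                  # counts of (c, I) pairs
context = Counter(I for _, I in samples)  # counts of I alone

def p_c_given_I(c, I):
    """P(c | I) = P(c, I) / P(I), estimated by relative frequencies."""
    if context[I] == 0:
        return 0.0  # I never observed: the estimate is undefined
    return joint[(c, I)] / context[I]

print(p_c_given_I("beach", frozenset({"sea", "sky"})))  # 2/3
```

The sketch also makes the data requirement visible: a reliable count is needed for every $(c, I)$ pair in the Cartesian product, which grows quickly with $|I|$.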