or zero density regions, which is undesirable. The SOTM, by contrast, has located all
clusters efficiently at this point.
In the SOFM 4 × 4 case, there are enough nodes for the map to span
the entire data space; however, the distortion and zero density nodes still remain.
The SOTM, in allocating nodes to outlying regions of low density, does exhibit
some limitations in the 16 node case, although with minimal impact on the integrity
of the main clusters. The SOTM becomes more sensitive to outliers once all the
natural clusters have been located (see node 10). Generally, these nodes will track
back to flesh out and subdivide larger, denser clusters; however, as competition
increases over already limited space, this becomes more difficult. One might imagine
a situation in which the data has more noisy clusters.
3.2.3 Pseudo Labeling
Figure 3.3 summarizes the application of SOTM for pseudo labeling in an adaptive
retrieval system. The retrieval process occurs in the following steps. First, the
system obtains the retrieved samples, $x_1, x_2, \ldots, x_N$, that are most similar to the query
$x_q$ based on feature space $F_1$. Second, these samples are associated with the
corresponding feature vectors, $v_1, v_2, \ldots, v_N$, $v_i \in F_2$. These are input to the SOTM
for unsupervised learning. Third, after convergence, the output of SOTM is used for
labeling each $v_1, v_2, \ldots, v_N$, resulting in the label set $\{y_i\}_{i=1}^{N}$. Finally, the labels are
associated with the retrieved samples, $x_1, x_2, \ldots, x_N$, and used for the adaptation of
the relevance feedback module (i.e., the RBF-based relevance feedback).
Let $w_j$, $j = 1, 2, \ldots, L$, denote the weight vectors of the SOTM algorithm after
convergence, where $L$ is the total number of nodes. Also, let $v_q \in F_2$ be the
feature vector associated with a given query image in the current retrieval session.
Thus, the distance from the query to each node can be obtained by:

$$d(v_q, w_j) = \|v_q - w_j\|, \quad j = 1, 2, \ldots, L \tag{3.8}$$
It follows that the K-nearest neighbors of the query are obtained by:

$$S_k(v_q) = \{\, w_j \mid d(v_q, w_j) \le d(v_q, w_k) \,\} \tag{3.9}$$
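As a minimal sketch of Eqs. 3.8 and 3.9, the following Python snippet computes the distance from a query vector to every node and then selects the nodes no farther away than the k-th nearest one. The function names `node_distances` and `k_nearest_nodes` and the toy weight vectors are hypothetical, introduced only for illustration; the weight vectors are assumed to come from an already converged SOTM, which is not implemented here.

```python
import math

def node_distances(v, weights):
    """Euclidean distance from vector v to every node weight vector (Eq. 3.8)."""
    return [math.dist(v, w) for w in weights]

def k_nearest_nodes(v_q, weights, k):
    """Indices j of nodes with d(v_q, w_j) <= d(v_q, w_k) (Eq. 3.9).

    w_k is the k-th nearest node to v_q. Because the comparison is "<=",
    nodes tied with w_k are also included, so more than k indices may be
    returned when distances coincide.
    """
    if not 1 <= k <= len(weights):
        raise ValueError("k must be between 1 and the number of nodes")
    d = node_distances(v_q, weights)
    d_k = sorted(d)[k - 1]  # distance to the k-th nearest node
    return [j for j, d_j in enumerate(d) if d_j <= d_k]

# Toy example (made-up weights, not taken from the text): four nodes in 2-D.
weights = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.5, 0.5)]
nearest = k_nearest_nodes((0.0, 0.0), weights, k=2)
```

Sorting all L distances is adequate for the small node counts discussed here; for very large maps a partial selection would avoid the full sort.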
where $S_k(v_q)$ is the set of nearest neighbors, and $w_k$ is the $k$-th nearest neighbor
of $v_q$. All nodes in this set are relevant to the query vector. The assignment of
a label to the retrieved sample $v_i$, $i \in \{1, \ldots, N\}$, begins by calculating
the Euclidean distance between the sample and all nodes $w_j$, $j = 1, 2, \ldots, L$.
$$d(v_i, w_j) = \|v_i - w_j\|, \quad j = 1, 2, \ldots, L \tag{3.10}$$
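The distance computation in Eq. 3.10 can be sketched in Python as below. The snippet builds the sample-to-node distance table and finds the closest node for each sample. The final function, `pseudo_labels`, goes beyond what this excerpt states: it assumes, purely for illustration, that a sample is labeled 1 when its closest node belongs to the set of query-relevant nodes and 0 otherwise. All function names and the toy data are hypothetical.

```python
import math

def sample_node_distances(samples, weights):
    """Table of ||v_i - w_j||: one row per sample v_i, one column per node w_j (Eq. 3.10)."""
    return [[math.dist(v, w) for w in weights] for v in samples]

def winning_nodes(samples, weights):
    """Index of the closest node for each sample (first index wins on ties)."""
    table = sample_node_distances(samples, weights)
    return [min(range(len(row)), key=row.__getitem__) for row in table]

def pseudo_labels(samples, weights, relevant_nodes):
    """ASSUMPTION (not stated in the excerpt): label 1 if the sample's
    closest node is in the query-relevant node set, otherwise 0."""
    relevant = set(relevant_nodes)
    return [1 if j in relevant else 0 for j in winning_nodes(samples, weights)]

# Toy example (made-up values): four nodes, three retrieved samples,
# and nodes 0 and 3 taken as the query-relevant set.
weights = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.5, 0.5)]
samples = [(0.1, 0.1), (4.0, 5.0), (0.9, 0.0)]
winners = winning_nodes(samples, weights)
labels = pseudo_labels(samples, weights, relevant_nodes=[0, 3])
```

In this toy run only the first sample falls closest to a relevant node, so under the assumed rule it alone would receive a positive pseudo label.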