problems, we present a ranking optimization scheme which intuitively considers the
user tagging behaviors and addresses the issues of missing tags and noisy tags.
We note that only the qualitative difference is important and fitting to the numerical
values of 1 and 0 is unnecessary. Therefore, instead of solving a point-wise
classification task, we formulate it as a ranking problem which uses tag pairs within
each user-image combination $(u, i)$ as the training data and optimizes for correct
ranking. For example, $y(u, i, t^+) > y(u, i, t^-)$ indicates that user $u$ considers
that tag $t^+$ describes image $i$ better than tag $t^-$.
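The pairwise view can be illustrated with a minimal sketch. The tag names, the scores, and the helper names `ranking_pairs` and `pairwise_accuracy` are hypothetical and chosen only for illustration; the sketch shows that only the order of the scores within a post matters, not their numerical values.

```python
from itertools import product


def ranking_pairs(positive_tags, negative_tags):
    """Enumerate (t_plus, t_minus) training pairs for one post (u, i)."""
    return list(product(positive_tags, negative_tags))


def pairwise_accuracy(scores, pairs):
    """Fraction of pairs in which the positive tag scores strictly higher."""
    if not pairs:
        return 0.0
    correct = sum(1 for t_plus, t_minus in pairs
                  if scores[t_plus] > scores[t_minus])
    return correct / len(pairs)


# Hypothetical predicted scores y(u, i, t) for a single post (assumed values).
scores = {"fish": 0.9, "sea": 0.6, "car": 0.7, "office": 0.1}

# Assumed positive and negative tags for this post.
pairs = ranking_pairs(["fish", "sea"], ["car", "office"])
accuracy = pairwise_accuracy(scores, pairs)

# Only the pair ("sea", "car") is ranked incorrectly, so 3 of 4 pairs are correct.
print(accuracy)
```

Rescaling all scores by a positive constant leaves the result unchanged, which is why fitting the exact values 1 and 0 is not required.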
We provide some notations for easy explanation. Each user-image combination $(u, i)$
is defined as a post. The set of observed posts is denoted as $P_O$:

$$P_O = \{\, (u, i) \mid \exists\, t \in T,\ y_{u,i,t} = 1 \,\} \tag{2.7}$$
The neutral triplets, i.e., the triplets whose post has not been observed, constitute
a set $M$:

$$M = \{\, (u, i, t) \mid (u, i) \notin P_O \,\} \tag{2.8}$$
It is arbitrary to treat the neutral triplets as either positive or negative and we remove
all the triplets in $M$ from the learning process (filled by bold question marks in
Fig. 2.2b).
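As a small sketch of these two sets, assuming that the neutral set collects every triplet whose post $(u, i)$ is unobserved: the user, image, and tag identifiers and the function names below are hypothetical.

```python
from itertools import product


def observed_posts(positive_triplets):
    """P_O: every (u, i) with at least one tag t such that y(u, i, t) = 1."""
    return {(u, i) for (u, i, _t) in positive_triplets}


def neutral_triplets(users, images, tags, posts):
    """M: all (u, i, t) whose post (u, i) is not in the observed posts."""
    return {(u, i, t)
            for u, i, t in product(users, images, tags)
            if (u, i) not in posts}


# Toy data (assumed for illustration only).
users = ["user1", "user2"]
images = ["image1", "image2"]
tags = ["tag1", "tag2", "tag3"]
positive = {("user1", "image1", "tag3"), ("user2", "image2", "tag1")}

P_O = observed_posts(positive)
M = neutral_triplets(users, images, tags, P_O)

# Two of the four posts are unobserved, each contributing three neutral triplets.
print(len(P_O), len(M))
```

Triplets in `M` would simply be skipped when training pairs are generated.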
For the training pair determination, we consider two characteristics of user tagging
behavior. On one hand, some concepts may be missing from the user-generated tags.
We assume that tags that frequently co-occur are likely to appear in the same
image (we call such tags context-relevant). On the other hand, users will not bother
to use all the relevant tags to describe an image. Tags that are semantic-relevant to
the observed tags are therefore also potentially good descriptions of the image. Both
assumptions are reasonable. In the running example, user1 annotated image1 with tag3
(we assume tag3 describes Nemo, e.g., tag3 = “fish”). The tags “water,” “sea,” and
“coral,” which are context-relevant, and “animal,” “seafish,” and “clownfish,” which
are semantic-relevant to the tag “fish,” are all good descriptions of image1.
To implement this idea, we build a tag-affinity graph $W^T$ based on tag semantic and
context intrarelations.⁵ The tags with the $k$ highest affinity values are considered
semantic-relevant or context-relevant.
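The selection of the $k$ most affine tags can be sketched as follows. The affinity values, the dictionary layout, and the function name `top_k_relevant` are assumptions made for illustration; how the affinity graph itself is built is not covered here.

```python
def top_k_relevant(affinity, tag, k):
    """Return the k tags with the highest affinity to `tag`, excluding itself.

    `affinity` is a hypothetical nested dict: affinity[a][b] is the edge
    weight between tags a and b. Ties are broken alphabetically.
    """
    neighbours = {other: w for other, w in affinity[tag].items() if other != tag}
    ranked = sorted(neighbours, key=lambda other: (-neighbours[other], other))
    return set(ranked[:k])


# Assumed affinity values for the tag "fish" from the running example.
affinity = {
    "fish": {"clownfish": 0.9, "water": 0.8, "sea": 0.7, "car": 0.05},
}

relevant = top_k_relevant(affinity, "fish", k=2)
print(sorted(relevant))
```

A larger `k` marks more tags as relevant and therefore leaves fewer candidates for the negative tag set.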
Given the possible noise in the user-generated tags, it is risky to add the
semantic- or context-relevant tags to the positive set. Therefore, we choose a
conservative strategy: we keep only the unobserved tags that are both
semantic-irrelevant and context-irrelevant to every observed tag to form the negative
tag set. Note that the ranking optimization is performed over each post, and within
each post $(u, i)$ a positive tag set $T^+_{u,i}$ and a negative tag set $T^-_{u,i}$
are needed to construct the training pairs. Given a post, the observed tags constitute
the positive tag set (the corresponding triplets are filled by plus signs in Fig. 2.2b):

$$T^+_{u,i} = \{\, t \mid (u, i) \in P_O \wedge y_{u,i,t} = 1 \,\} \tag{2.9}$$
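The conservative construction of the two tag sets for one post can be sketched as below. The tag vocabulary, the `relevant_to` mapping (standing in for the tags judged semantic- or context-relevant to each tag), and the function names are hypothetical.

```python
def positive_tags(post, positive_triplets):
    """Observed tags of a post (u, i): all t with y(u, i, t) = 1."""
    return {t for (u, i, t) in positive_triplets if (u, i) == post}


def negative_tags(all_tags, positives, relevant_to):
    """Unobserved tags that are not relevant to any observed tag."""
    excluded = set(positives)
    for t in positives:
        # Drop every tag judged semantic- or context-relevant to t.
        excluded |= relevant_to.get(t, set())
    return set(all_tags) - excluded


# Toy vocabulary and observations (assumed for illustration only).
all_tags = {"fish", "water", "sea", "clownfish", "car", "office"}
positive_triplets = {("user1", "image1", "fish")}
relevant_to = {"fish": {"water", "sea", "clownfish"}}

T_pos = positive_tags(("user1", "image1"), positive_triplets)
T_neg = negative_tags(all_tags, T_pos, relevant_to)

# Training pairs (t_plus, t_minus) for this post.
pairs = sorted((p, n) for p in T_pos for n in T_neg)
print(pairs)
```

In this sketch, the relevant but unobserved tags ("water", "sea", "clownfish") fall into neither set, so they are not pushed down during ranking.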
⁵ Details of the $W^T$ construction are introduced in the next subsection.