Fig. 2.1 The proposed solution framework. (a) Data collection. (b) RMTF. (c) Tag refinement.
©[2012] IEEE. Reprinted, with permission, from Ref. [34]
which over 3 million users voluntarily help translate Facebook webpages into 60
different languages.
With the popularity of Web 2.0, photo-sharing websites with large-scale image
collections, such as Flickr, Pinterest, Instagram, and Picasa, have proliferated
online. These websites let users act as owners, taggers, or commenters of their
contributed images, producing a huge number of social images with user-contributed
tags. In a web dataset of this scale, noisy and missing tags are inevitable, and they
limit the performance of social tag-based retrieval systems [5, 22]. Tag refinement,
which denoises and enriches the tags of images, is therefore needed to tackle this
problem. Existing work on tag refinement [4, 16, 19-21, 40, 42, 47] exploits the
semantic correlation between tags and the visual similarity of images to address
noisy and missing tags, but neglects user interaction, one of the important entities
in social tagging data.
The goal of this chapter is to introduce the user factor into social image tag analysis
tasks and to improve the underlying associations between images and tags obtained
from the observed raw tagging data. To this end, we address the tag refinement problem
from a factor analysis perspective and aim to build user-aware image and tag factor
representations. With the user factor incorporated, the image and tag factors are
free to focus on their own semantics, and we obtain more semantics-specific image
and tag representations. A novel method named Ranking-based Multicorrelation
Tensor Factorization (RMTF) is proposed to tackle the tag refinement task. The
framework is illustrated in Fig. 2.1.¹ It contains three components: data collection,
RMTF, and tag refinement. For data collection, three types of data (users, images,
and tags) are collected, together with their ternary interrelations and intrarelations.
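As a rough illustration of the ternary interrelations collected here, the sketch below stores each tagging action as a (user, image, tag) index triple and expands the triples into a binary third-order tensor. The sizes follow the running example (three users, four images, five tags), but the specific triples, the dense nested-list layout, and the helper names build_tensor and tags_for are assumptions made for illustration only. The intrarelations are not modeled in this sketch.

```python
# Toy sizes taken from the running example; the triples themselves are invented.
NUM_USERS, NUM_IMAGES, NUM_TAGS = 3, 4, 5

# Each observed tagging action is a (user, image, tag) index triple (hypothetical data).
observed = {
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 2),
    (1, 2, 2),
    (2, 3, 4),
}


def build_tensor(triples, shape):
    """Return a dense binary tensor Y with Y[u][i][t] = 1 for each observed triple."""
    n_users, n_images, n_tags = shape
    tensor = [[[0] * n_tags for _ in range(n_images)] for _ in range(n_users)]
    for u, i, t in triples:
        tensor[u][i][t] = 1
    return tensor


def tags_for(tensor, user, image):
    """List the tag indices that `user` assigned to `image`."""
    return [t for t, flag in enumerate(tensor[user][image]) if flag]


Y = build_tensor(observed, (NUM_USERS, NUM_IMAGES, NUM_TAGS))
print(tags_for(Y, 0, 0))  # tags that user 0 gave to image 0
```

Entries left at 0 are simply unobserved in this layout; they should not be read as confirmed negative examples.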
In the RMTF module, we use tensor factorization to jointly model the multiple
factors. To make full use of the observed tagging data and partial use of the
unobserved data, we present a novel ranking scheme for model estimation, which is
based on the pairwise difference between positive examples (i.e., observed tagging
data) and negative ones (i.e., part of the unobserved data). The negative examples
are collected by analyzing user tagging behavior. The issue of noisy tags and missing
¹ We show a running example consisting of three users, five tags, and four images in Fig. 2.1a.