Table 3.5 Detailed concept list

Concept list: animals, beach, beauty, bird, bodypart, topics, building, car, cartoon, cat, celebrity, child, city, cityscene, cloth, cloud, colors, couple, crowd, dancing, dark, design, dog, drink, electronic product, family, flight, flower, food, fruit, geek, goose, grass, grassland, house, icon, indoor, insect, lake, landscape, leaf, man, model, mountain, naturescene, office, painting, palace, party, people, performance, photography, portrait, poster, puzzle, road, room, sculpture, sea, sky, snow, soldier, sport, spot, squirrel, stadium, stone, store, street, sunset, talk, text, tiny plant, tower, toy, transport, tree, universe, watch, waterfall, woman
The concept list is manually constructed based on the observation of 88,988 downloaded post photos. The reason we construct the concept list manually rather than learning it automatically is two-fold: (1) By examining the post photos, we observe some common concepts, e.g., people, animals, birds, snow, lake, and mountain, so we can aggregate the photos that share a common concept for training; (2) Automatically learned concepts capture less semantics and are difficult to interpret. We select 81 categories from the post photos to construct the concept list, which is shown in Table 3.5. Each concept category contains around 100 photos for training. We train 81
concept classifiers in a supervised manner. Dense HOG features [9] are extracted for each image, and Locality-constrained Linear Coding (LLC) [31] is utilized to obtain the image representation. For each concept, we train an SVM classifier using LIBLINEAR [11]. The classification confidence is mapped to a probability score by the sigmoid function. Each photo is therefore represented as an 81-dimensional vector of concept probability scores. Since a user may post more than one photo, we apply max-pooling to aggregate the multiple photo feature vectors, finally obtaining an 81-dimensional feature vector for each user, which is referred to as the post photo feature.
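The sigmoid mapping and max-pooling steps above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function names and the use of raw SVM decision values as input are assumptions.

```python
import numpy as np

def decision_to_probability(score):
    """Map a raw SVM decision value to (0, 1) via the sigmoid function."""
    return 1.0 / (1.0 + np.exp(-np.asarray(score, dtype=float)))

def post_photo_feature(photo_scores):
    """Aggregate per-photo concept scores into one user-level vector.

    photo_scores: array of shape (n_photos, n_concepts) holding the raw
    decision values of the concept classifiers (n_concepts = 81 in the
    chapter). Returns an n_concepts-dimensional vector obtained by
    element-wise max-pooling of the per-photo probability scores.
    """
    probs = decision_to_probability(photo_scores)  # (n_photos, n_concepts)
    return probs.max(axis=0)                       # max over the user's photos
```

Max-pooling keeps, for each concept, the strongest evidence seen in any of the user's photos, so a concept clearly present in one photo is not diluted by its absence elsewhere.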
3.4.2 Stacked SVM-Based User Attribute Inference
For user attribute inference, we learn predictive models based on the extracted user features. Specifically, for each user attribute, we build six SVM classifiers using LIBLINEAR [11], one for each of the six types of user features.
To derive the attribute value from the six classifiers, a fusion scheme is needed to combine the six confidence scores. We employ a stacked model [32] to perform the fusion. The key idea of the stacked model is to learn a meta-level (level-1) classifier based on the outputs of the base-level (level-0) classifiers. Although simple, this scheme effectively solves our problem of integrating multiple feature-based classifiers.
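The stacking scheme can be sketched as below. This is an illustrative sketch only: it uses scikit-learn's LinearSVC as a stand-in for LIBLINEAR, the function names are invented, and a production version would train the level-1 classifier on cross-validated level-0 outputs to avoid overfitting.

```python
import numpy as np
from sklearn.svm import LinearSVC

def fit_stacked_model(feature_sets, labels):
    """Fit one level-0 SVM per feature type, then a level-1 SVM on
    their confidence scores.

    feature_sets: list of arrays, each of shape (n_users, d_i), one per
    feature type (six types in the chapter).
    labels: (n_users,) attribute labels.
    """
    base = [LinearSVC().fit(X, labels) for X in feature_sets]
    # Level-1 input: one confidence (decision) score per base classifier.
    meta_X = np.column_stack([clf.decision_function(X)
                              for clf, X in zip(base, feature_sets)])
    meta = LinearSVC().fit(meta_X, labels)
    return base, meta

def predict_stacked(base, meta, feature_sets):
    """Predict attribute values by fusing the base classifiers' scores."""
    meta_X = np.column_stack([clf.decision_function(X)
                              for clf, X in zip(base, feature_sets)])
    return meta.predict(meta_X)
```

The meta classifier learns how much to trust each feature type, which is exactly the fusion role the stacked model plays here.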
3.4.3 Exploring Attribute Relation for User Attribute Inference
It is intuitively recognized that the attributes of the same individual correlate with each other. Therefore, we develop a structural discriminative model for incorporating