Personalized Information Access Using Semantic Knowledge - Smart Information Systems: Computational Intelligence for Real-Life Applications


New user and large application

The second scenario focuses on a new user who joins a well-established recommendation service, such as LastFM or Facebook. We want to see how the enrichment approach works for new users in a large recommendation application that already has many users.

7.5.1 Datasets

Evaluation is performed using two datasets, one from Facebook and one from LastFM, collected between January and September 2010. We extracted data from around 60,000 users and kept the profiles that contain data about interests in music. For evaluation, we used all user profiles containing at least two music interests. Users in the Facebook dataset expressed their interests by 'liking' an artist. Users in the LastFM dataset showed their interests by listening to music, which LastFM tracks implicitly, and by actively 'favoring' artists. The resulting Facebook dataset consists of 3,011 users and 14,516 liked music items. The LastFM set consists of 7,743 users and 11,333 favored music items. We crawled only user profile information; no other data from Facebook (e.g., Facebook Open Graph information) and no data from LastFM about similar artists is part of the user profile data. The user profiles contain only the user name and the artist name or music album name, and, in the LastFM set, also the MusicBrainz ID.
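The profile filter described above (keep only users with at least two music interests) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `filter_profiles`, the `(user, artist)` pair format, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def filter_profiles(interactions, min_interests=2):
    """Group (user, artist) pairs by user and keep users with enough distinct interests.

    `interactions` is a hypothetical input format: an iterable of
    (user_name, artist_name) pairs, e.g. Facebook likes or LastFM favorites.
    """
    profiles = defaultdict(set)
    for user, artist in interactions:
        profiles[user].add(artist)  # a set ignores repeated likes of the same artist
    return {
        user: artists
        for user, artists in profiles.items()
        if len(artists) >= min_interests
    }


# Toy data for illustration only; not taken from the evaluated datasets.
likes = [
    ("u1", "Radiohead"),
    ("u1", "Portishead"),
    ("u2", "Bjork"),
    ("u3", "Muse"),
    ("u3", "Muse"),
]
kept = filter_profiles(likes)
```

Here only `u1` survives the filter: `u2` has a single interest, and `u3` liked the same artist twice, which counts as one distinct interest under this sketch's assumption.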

The semantic information needed for our approach is retrieved from Freebase. In our scenario, we use data from the music domain consisting of four music entity types, namely Artists, Albums, Tracks, and Genres, and the relations between them. The relationship between artist and genre describes the genre in which an artist works; the relationship between album and artist describes which artist can be found on an album release; and the relationship between album and genre defines a genre assignment for each album. The created dataset is visualized schematically in Fig. 7.14. Table 7.1 shows the number of edges and entities contained in the dataset.
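To make the structure of the semantic dataset concrete, the following sketch models typed entities and the three relationships named above, and counts entities and edges per type. The class `MusicGraph`, its method names, and the sample entities are assumptions for illustration; the text lists Tracks as an entity type but does not describe their relations here, so the sketch leaves them out.

```python
from collections import Counter

# Relationship types named in the text, as (source type, target type).
ALLOWED_RELATIONS = {
    ("Artist", "Genre"),  # genre in which an artist works
    ("Album", "Artist"),  # artist found on an album release
    ("Album", "Genre"),   # genre assigned to an album
}


class MusicGraph:
    """Hypothetical in-memory model of the typed music graph."""

    def __init__(self):
        self.entity_type = {}  # entity id -> entity type name
        self.edges = set()     # (source id, target id) pairs

    def add_entity(self, entity_id, type_name):
        self.entity_type[entity_id] = type_name

    def add_edge(self, source, target):
        pair = (self.entity_type[source], self.entity_type[target])
        if pair not in ALLOWED_RELATIONS:
            raise ValueError(f"relation {pair} is not part of the schema")
        self.edges.add((source, target))

    def counts(self):
        """Return per-type entity counts and per-relation edge counts."""
        entities = Counter(self.entity_type.values())
        edges = Counter(
            (self.entity_type[s], self.entity_type[t]) for s, t in self.edges
        )
        return entities, edges


# Toy entities for illustration only.
graph = MusicGraph()
graph.add_entity("radiohead", "Artist")
graph.add_entity("ok_computer", "Album")
graph.add_entity("alt_rock", "Genre")
graph.add_edge("radiohead", "alt_rock")
graph.add_edge("ok_computer", "radiohead")
graph.add_edge("ok_computer", "alt_rock")
entity_counts, edge_counts = graph.counts()
```

Rejecting edges outside the schema keeps per-type entity and edge counts of this kind consistent with the relationship definitions.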

In order to analyze how semantic encyclopedic data can improve CF, we interlinked the semantic dataset retrieved from Freebase with LastFM and Facebook, as explained in Sect. 7.5.2.

7.5.2 Interlinking User Profiles

The extracted Facebook and LastFM profiles are initially isolated, meaning that

there is no connection to the Freebase dataset. However, our approach requires
