New user and large application
The second scenario is focused on a new user that joins a well-established recom-
mendation service, such as LastFM or Facebook. We want to see how the enrichment
approach works for new users in a big recommendation application which already
has a lot of users.
7.5.1 Datasets
Evaluation is performed using two datasets, one from Facebook and one from LastFM, collected
between January and September 2010. We extracted data from around 60,000 users
and kept the profiles that contain data about interests in music. For evaluation we
used all user profiles containing at least two music interests. Users from the Face-
book dataset expressed their interests by 'liking' an artist. Users in the LastFM dataset
showed their interests by listening to music, which is implicitly tracked information
from LastFM, and by actively 'favoring' artists. The resulting Facebook dataset con-
sists of 3,011 users and 14,516 liked music items. The LastFM set consists of 7,743
users and 11,333 favored music items. We crawled only user profile information; no
other data from Facebook (e.g., Facebook Open Graph information; see footnote 7) and
no data from LastFM about similar artists is part of the user profile data. The user
profiles contain only the user name and the artist or music album name and, in the
LastFM set, also the MusicBrainz ID (footnote 8).
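The profile selection rule described above (keep only profiles with at least two music interests) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_profiles`, the (user, item) pair format, and the choice to count distinct items are all assumptions made for the example, and the data is a toy sample.

```python
from collections import defaultdict

MIN_MUSIC_INTERESTS = 2  # threshold stated in the text


def filter_profiles(interactions, min_interests=MIN_MUSIC_INTERESTS):
    """Group (user, item) pairs by user and keep users with enough items.

    Assumption: repeated likes of the same item count once, so the
    threshold is applied to the number of distinct items per user.
    """
    profiles = defaultdict(set)
    for user, item in interactions:
        profiles[user].add(item)
    return {
        user: items
        for user, items in profiles.items()
        if len(items) >= min_interests
    }


# Toy data (hypothetical users and likes).
likes = [
    ("alice", "Radiohead"),
    ("alice", "Portishead"),
    ("bob", "Muse"),
    ("carol", "Bjork"),
    ("carol", "Bjork"),  # duplicate like, counted once
]
kept = filter_profiles(likes)
```

In this toy sample only `alice` is kept: `bob` has a single interest, and the two entries for `carol` refer to the same artist.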
The semantic information that is needed for our approach is retrieved from Free-
base. In our scenario, we make use of data from the music domain consisting of four
music entity types, namely Artists, Albums, Tracks, and Genres, and the relations between
them. The relationship between artist and genre describes the genre in which an
artist works; the relationship between album and artists describes which artist can
be found on an album release, and finally the relationship between album and genre
defines a genre assignment for each album. The created dataset is schematically
visualized in Fig. 7.14. Table 7.1 shows the number of edges and entities contained in the
dataset.
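One way to picture the semantic dataset is as a typed graph in which each relation connects a fixed pair of entity types. The sketch below is illustrative only: the class `MusicGraph`, the relation names, the string entity identifiers, and the undirected storage are assumptions made for the example and are not taken from Freebase or from the original system. Only the three relations described in the text are modeled.

```python
from collections import defaultdict

# Relations described in the text, as (source type, target type).
# The relation names themselves are hypothetical.
ALLOWED_RELATIONS = {
    "artist_genre": ("Artist", "Genre"),
    "album_artist": ("Album", "Artist"),
    "album_genre": ("Album", "Genre"),
}


class MusicGraph:
    """Small typed graph; edges are stored undirected (an assumption)."""

    def __init__(self):
        self.entity_type = {}              # entity id -> entity type
        self.neighbors = defaultdict(set)  # entity id -> adjacent entity ids
        self.edge_count = 0

    def add_entity(self, entity_id, etype):
        self.entity_type[entity_id] = etype

    def add_edge(self, relation, source, target):
        src_type, dst_type = ALLOWED_RELATIONS[relation]
        if (self.entity_type.get(source) != src_type
                or self.entity_type.get(target) != dst_type):
            raise ValueError(f"{relation} expects {src_type} -> {dst_type}")
        if target not in self.neighbors[source]:
            self.neighbors[source].add(target)
            self.neighbors[target].add(source)
            self.edge_count += 1


# Toy example with one artist, one album, and one genre.
g = MusicGraph()
g.add_entity("artist:radiohead", "Artist")
g.add_entity("album:ok_computer", "Album")
g.add_entity("genre:alternative_rock", "Genre")
g.add_edge("artist_genre", "artist:radiohead", "genre:alternative_rock")
g.add_edge("album_artist", "album:ok_computer", "artist:radiohead")
g.add_edge("album_genre", "album:ok_computer", "genre:alternative_rock")
```

Counting `len(g.entity_type)` and `g.edge_count` on such a structure gives the kind of entity and edge totals that a summary table would report.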
In order to analyze how semantic encyclopedic data can improve CF, we inter-
linked the semantic dataset retrieved from Freebase with LastFM and Facebook as
explained in Sect. 7.5.2.
7.5.2 Interlinking User Profiles
The extracted Facebook and LastFM profiles are initially isolated, meaning that
there is no connection to the Freebase dataset. However, our approach requires
7 http://developers.facebook.com/docs/opengraph/
 