Preliminaries - Recommender Systems and the Social Web


are applied both on the textual descriptions of items (static data) and on the tagging data (dynamic

data) to build user profiles and learn user interests. The user profile consists of three parts: the static

content, the user's personal tags, and the social tags which build the collaborative part of the user profile.

Thus, in this work, tags are seen as an additional source of information used for learning the profile of a

particular user. The authors compare their tag-based approach with a pure content-based recommender

in a user study. The results show that the recommendations made by the tag-augmented recommender

are slightly more accurate than the recommendations of the pure content-based one.

In the study of [Firan et al., 2007] tags are also seen as content descriptors for different content-based

systems. Tags are used for building user profiles for the popular music community site Last.fm. In order

to address the cold start problem, the user profiles are inferred automatically, e.g., from the music tracks

available on the computer of each user, thus reducing the manual effort from the user's side to express his

or her preferences. The authors show that tag-based profiles can lead to better music recommendations

than conventional user profiles based on song and track usage.

In [Cantador et al., 2010] tags are considered as content features that describe both user and item

profiles. Cantador et al. propose weighting functions which assess the importance of a particular tag

for a given user or item, and similarity functions which compute the similarity between a user profile

and an item profile. These weighting and similarity functions are then combined in different content-based

recommendation models. User interests and item characteristics are modeled as vectors

$u_m = (u_{m,1}, \ldots, u_{m,L})$ and $i_n = (i_{n,1}, \ldots, i_{n,L})$ of length $L$, respectively, where $L$ is the number of tags in the

folksonomy, $u_{m,l}$ is the number of times user $u_m$ has annotated items with tag $t_l$, and $i_{n,l}$ is the number

of times item $i_n$ has been annotated with tag $t_l$. After modeling users and items as vectors accordingly,

the authors can adapt the well-known TF-IDF vector space model from information retrieval which

was described in Section 2.1.3. Besides this TF-IDF-based profile model, the authors also include a

pure TF-based profile model (without the IDF component) in the evaluation pool. Additionally, they

propose a profile model based on the Okapi BM25 weighting scheme, which is a probabilistic framework for

ranking documents according to a given query [Baeza-Yates and Ribeiro-Neto, 1999]. These profile models

are then exploited in a number of content-based recommendation approaches such as a TF-IDF cosine-

based recommendation approach which computes the similarity between a user and an item vector with

the cosine similarity measure, and a corresponding BM25 cosine-based recommendation approach. The

evaluation results on the Delicious and Last.fm data sets show that the recommendation models focusing

on user profiles outperform the models focusing on item profiles.
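
The TF-IDF cosine-based approach described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names, and the choice to compute the IDF component separately over the user collection and over the item collection are assumptions made for illustration only.

```python
import math
from collections import Counter

# Toy tag-count profiles (hypothetical data): tag -> number of annotations.
users = {
    "alice": {"rock": 3, "jazz": 1},
    "bob": {"pop": 2, "rock": 1},
}
items = {
    "track_a": {"rock": 5, "indie": 1},
    "track_b": {"jazz": 4},
    "track_c": {"pop": 3, "rock": 1},
}

def idf_weights(profiles):
    """Inverse document frequency of every tag over a list of profiles."""
    n = len(profiles)
    df = Counter(tag for p in profiles for tag, count in p.items() if count > 0)
    return {tag: math.log(n / df[tag]) for tag in df}

def tfidf(profile, idf):
    """Weight raw tag counts (TF) by IDF; unknown tags get weight 0."""
    return {tag: count * idf.get(tag, 0.0) for tag, count in profile.items()}

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tag, 0.0) for tag, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(user, users, items):
    """Rank all items by cosine similarity to the user's TF-IDF profile."""
    user_idf = idf_weights(list(users.values()))  # assumption: IDF over users
    item_idf = idf_weights(list(items.values()))  # assumption: IDF over items
    u = tfidf(users[user], user_idf)
    scored = [(name, cosine(u, tfidf(p, item_idf))) for name, p in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = recommend("alice", users, items)
```

In this toy data the tag "rock" is used by every user, so its IDF weight in the user profiles is zero and only "jazz" remains to distinguish the first user. Dropping the multiplication by the IDF weight in `tfidf` would give a pure TF-based variant of the same sketch.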

Tagging data can also be incorporated in search engines to personalize the search results. According

to [Pitkow et al., 2002], two basic approaches to Web search personalization can be differentiated. In the

first approach, a user's original query is modified and adapted to the needs of the user. For example, the

query “eclipse” might be extended to “eclipse software development environment” if we know that the

user has an interest in software development. In the second approach, the query is not modified, but the

returned list of search results is re-ranked according to the user profile.
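
The difference between the two approaches can be sketched in a few lines. The function names, the interest terms, and the scores below are hypothetical and serve only to contrast query modification with re-ranking.

```python
def expand_query(query, interest_terms):
    """Approach 1: modify the query by appending the user's interest terms."""
    present = set(query.split())
    extra = [term for term in interest_terms if term not in present]
    return " ".join([query] + extra)

def rerank(results, score):
    """Approach 2: leave the query untouched and reorder the result list
    by a per-user score (higher scores first)."""
    return sorted(results, key=score, reverse=True)

# Approach 1: the query itself changes before it reaches the search engine.
expanded = expand_query("eclipse", ["software", "development", "environment"])

# Approach 2: the engine's results are reordered afterwards.
# The scores are made-up values standing in for a match against a user profile.
results = ["page about solar eclipses", "page about the Eclipse IDE"]
scores = {"page about solar eclipses": 0.1, "page about the Eclipse IDE": 0.9}
personalized = rerank(results, lambda r: scores.get(r, 0.0))
```

The first approach changes what the search engine is asked, whereas the second only changes the order in which its answers are presented, so it can be applied without access to the engine's internals.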

An example of the latter approach is given by [Noll and Meinel, 2007]. The authors propose a

pure tag-based personalization method for re-ranking Web search results that is independent of the

underlying search engine. The basic idea is to use bookmarks and tagging data to re-rank the documents

in the search result list. Noll and Meinel also propose a concept called tagmarking which translates the

keywords in the search query to tags and assigns them to the bookmarked Web page that is associated

with the query. Bookmarks and tags are aggregated in a binary tag-document matrix $M_d$ where each

column (vector) represents a bookmark of a document with its components set to 1 if the corresponding

tag is associated with the document and 0 otherwise. The user profile $p_u$ is modeled as a vector $M_d \, \omega_d$,

where $\omega_d$ is a vector which contains the weights assigned to each tag. The tag-user matrix $M_u$ and the

document profile $p_d$ are built analogously. Note that by defining $\omega_d := \mathbf{1}^T$ and $\omega_u := \mathbf{1}^T$, the authors

assign equal importance to all tags and users. Finally, in the personalization step the documents are

re-ranked according to a similarity metric which combines both the user profile and the document profile.

Table 2.1 shows, in an example from [Noll and Meinel, 2007], how personalization affects Google's result list

for the search query “security”. The ranking of the Web site of the US Social Security Administration

(ssa.gov), for instance, has improved because, according to the authors, the user who submitted the

query also shows interest in insurance matters. In the evaluation phase the participants were asked in a

questionnaire which of the ranking lists of a query (the original list or the personalized list) they prefer.
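
The profile construction and the re-ranking step described above can be sketched as follows. This is a sketch under stated assumptions, not the method of [Noll and Meinel, 2007] itself: the tags, matrices, and document names are made up, all weights are set to 1, and a plain dot product stands in for the similarity metric, which is not specified here.

```python
# Rows of every matrix correspond to these tags (hypothetical vocabulary).
TAGS = ["security", "insurance", "software"]

def profile(matrix, weights=None):
    """Multiply a binary tag-by-column matrix with a weight vector.
    With all weights equal to 1 this is simply the row sum per tag."""
    n_cols = len(matrix[0]) if matrix else 0
    if weights is None:
        weights = [1] * n_cols  # equal importance for every column
    return [sum(cell * w for cell, w in zip(row, weights)) for row in matrix]

def dot(a, b):
    """Dot product, used here as an assumed similarity measure."""
    return sum(x * y for x, y in zip(a, b))

def rerank_by_profile(results, user_profile, doc_profiles):
    """Reorder a result list by similarity of each document to the user."""
    return sorted(results,
                  key=lambda d: dot(user_profile, doc_profiles[d]),
                  reverse=True)

# Tag-document matrix of one user: columns are the user's two bookmarks.
m_d = [
    [0, 1],  # security
    [1, 1],  # insurance
    [0, 0],  # software
]
p_u = profile(m_d)

# Tag-user matrices of two documents: columns are users who tagged them.
doc_profiles = {
    "software_security_page": profile([[1, 1], [0, 0], [1, 1]]),
    "social_insurance_page": profile([[1, 0], [1, 1], [0, 0]]),
}

original = ["software_security_page", "social_insurance_page"]
personalized = rerank_by_profile(original, p_u, doc_profiles)
```

Because the toy user has mostly used the tag "insurance", the insurance-related page moves ahead of the software-related page for a query on "security", which mirrors the kind of effect reported for ssa.gov.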
