Semantic Movie Recommendations - Smart Information Systems: Computational Intelligence for Real-Life Applications - page 130


[Figure 5.2 diagram; entity types shown: Movies, Users, Tags, Actors, Directors, Genres, Locations, Countries]

Fig. 5.2 Our semantic movie dataset consists of six bipartite relationship sets providing knowledge about movies. The Movie-User relationship set describes the user preferences.

the integration and the processing of these data. Unfortunately, the freely available IMDb data lack personalized rating and usage information.

We obtain personalized movie preferences from the MovieLens dataset [13]. MovieLens is a recommender system and virtual community website that allows users to create profiles and subsequently obtain movie recommendations. The MovieLens dataset provides rating data including timestamps. Since the MovieLens and IMDb datasets overlap largely in the movies they cover, the two datasets can be combined, aggregating encyclopedic and rating-based knowledge. The mapping is performed by identifying concordant properties (e.g., title, elapsed time, genres). A frequently used dataset combining data from MovieLens and IMDb is the HetRec dataset. The dataset was created for the International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011) 4 and can be retrieved from GroupLens. 5
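The mapping step can be illustrated with a minimal sketch. It is not the procedure used to build the HetRec dataset; the record layout, the function names, the runtime tolerance, and the reading of "elapsed time" as running time in minutes are all assumptions made for illustration.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_concordant(a, b, max_runtime_diff=5):
    """Return True if two movie records agree on title, runtime, and genres.

    The tolerance of 5 minutes and the "at least one shared genre" rule
    are illustrative assumptions, not values taken from the text.
    """
    if normalize_title(a["title"]) != normalize_title(b["title"]):
        return False
    if abs(a["runtime"] - b["runtime"]) > max_runtime_diff:
        return False
    return bool(set(a["genres"]) & set(b["genres"]))


def map_movies(movielens, imdb):
    """Map MovieLens ids to IMDb ids, keeping only unambiguous matches."""
    by_title = {}
    for rec in imdb:
        by_title.setdefault(normalize_title(rec["title"]), []).append(rec)

    mapping = {}
    for ml in movielens:
        same_title = by_title.get(normalize_title(ml["title"]), [])
        candidates = [r for r in same_title if is_concordant(ml, r)]
        if len(candidates) == 1:  # skip movies with zero or several matches
            mapping[ml["id"]] = candidates[0]["id"]
    return mapping


# Toy records with made-up ids and values.
movielens = [
    {"id": 1, "title": "Movie A", "runtime": 90, "genres": ["Drama", "Crime"]},
    {"id": 2, "title": "Movie B", "runtime": 120, "genres": ["Comedy"]},
]
imdb = [
    {"id": "x1", "title": "movie a!", "runtime": 92, "genres": ["Drama"]},
    {"id": "x2", "title": "Movie B", "runtime": 95, "genres": ["Comedy"]},
]
mapping = map_movies(movielens, imdb)
```

In this toy example only "Movie A" is mapped, because the two "Movie B" records differ by more than the assumed runtime tolerance.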

We use the aggregation of the IMDb and MovieLens datasets to create a semantic movie recommender system. The structure of the dataset is shown in Fig. 5.2. The central entity type is Movie. It is directly connected with the entity types Actors, Directors, Genres, Tags, Locations, and Countries. In general, the dataset can be seen as a multigraph, which allows several different edges between two nodes of the graph.
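One way to picture this structure is as one adjacency index per relationship set, so that the same pair of nodes can be linked by more than one edge type. The sketch below is only an illustration under that assumption; the class name `MovieGraph` and its methods are hypothetical and do not come from the source.

```python
from collections import defaultdict


class MovieGraph:
    """Collection of bipartite relationship sets centred on movies.

    Edges are stored per relation in both directions, which models
    undirected, unweighted edges.
    """

    def __init__(self):
        # relation name -> movie id -> set of connected entity ids
        self.movie_to_entity = defaultdict(lambda: defaultdict(set))
        # relation name -> entity id -> set of connected movie ids
        self.entity_to_movie = defaultdict(lambda: defaultdict(set))

    def add_edge(self, relation, movie, entity):
        self.movie_to_entity[relation][movie].add(entity)
        self.entity_to_movie[relation][entity].add(movie)

    def neighbours(self, relation, movie):
        """Entities linked to a movie within one relationship set."""
        return set(self.movie_to_entity[relation].get(movie, ()))

    def movies_of(self, relation, entity):
        """Movies linked to an entity within one relationship set."""
        return set(self.entity_to_movie[relation].get(entity, ()))

    def relations_between(self, movie, entity):
        """All relationship sets that connect the given pair of nodes."""
        return {
            rel
            for rel, adjacency in self.movie_to_entity.items()
            if entity in adjacency.get(movie, ())
        }


# Toy data with made-up ids: person p1 both acts in and directs movie m1.
graph = MovieGraph()
graph.add_edge("Actors", "m1", "p1")
graph.add_edge("Directors", "m1", "p1")
graph.add_edge("Genres", "m1", "Drama")
graph.add_edge("Actors", "m2", "p1")
```

Here `graph.relations_between("m1", "p1")` contains both `"Actors"` and `"Directors"`, which is the multigraph case of two different edges between the same two nodes.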

In addition to the content-based movie descriptions, the relationship Movies-Users provides user ratings for movies. The user ratings are used for optimizing and benchmarking the learned recommender strategies. For the evaluation of our approach, we split the user profiles (obtained from MovieLens) based on a global timestamp into a training set and a test set. We filter out user profiles having fewer than ten entries in the training set or the test set. We handle the dataset as a collection of bipartite relationship sets, each consisting of undirected, equally weighted edges. The sizes of the entity sets and edge sets used in the evaluation are shown in Table 5.1.
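The evaluation split described above can be sketched as follows. The tuple layout of a rating, the function name, and the choice to place ratings made exactly at the split time into the test set are assumptions; the text does not give the timestamp that was used.

```python
from collections import defaultdict


def split_by_timestamp(ratings, split_time, min_entries=10):
    """Split ratings at a global timestamp and drop sparse user profiles.

    ratings: iterable of (user, movie, rating, timestamp) tuples.
    Returns (train, test), each a dict mapping user -> list of
    (movie, rating, timestamp). A user is kept only if the profile has
    at least min_entries ratings in BOTH parts.
    """
    train, test = defaultdict(list), defaultdict(list)
    for user, movie, rating, timestamp in ratings:
        # Assumption: a rating made exactly at split_time goes to the test set.
        part = train if timestamp < split_time else test
        part[user].append((movie, rating, timestamp))

    kept = [
        user
        for user in train
        if len(train[user]) >= min_entries
        and len(test.get(user, ())) >= min_entries
    ]
    return (
        {user: train[user] for user in kept},
        {user: test[user] for user in kept},
    )


# Toy data: user "a" has 10 ratings on each side of the split,
# user "b" has 10 before the split but only 3 after it.
ratings = [("a", movie, 4.0, movie) for movie in range(20)]
ratings += [("b", movie, 3.0, movie) for movie in range(13)]
train, test = split_by_timestamp(ratings, split_time=10)
```

With this toy input only user "a" survives the filter, since user "b" has fewer than ten entries in the test part.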

4 http://ir.ii.uam.es/hetrec2011/ .

5 http://www.grouplens.org/node/462/ .
