7.4.1 Enriching User Profiles Based on Semantic Data
Our motivation is to cope with the cold-start problem. To this end, we use semantic
encyclopedic knowledge to extend small user profiles. Studies on Wikipedia, as
an example of an online encyclopedia, showed that the quality and accuracy of
Wikipedia articles are of a high standard, making it a reliable information source [15].
Therefore, we follow the idea that semantic encyclopedic data is a good and “neutral”
source for enriching user profiles with knowledge not influenced by subjective opinions
or tastes. Enriching a user profile with items strongly related to the items already
present in it adds “synonyms” for the existing entities. A synonym in
this context is an interest that is similar to tastes the user has already
expressed, e.g., an additional artist that is related to an artist in
the user profile. This is done to increase the overlap of the enriched user profile with
other profiles. Thus, it improves the similarity calculation but does not change the
taste of the user.
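The effect of enrichment on profile overlap can be illustrated with a minimal Python sketch. It assumes Jaccard similarity as the profile similarity measure, which the text does not specify, and the profiles, the `related` lookup, and the `enrich` helper are hypothetical stand-ins for the semantic dataset.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two item sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def enrich(profile: set, related: dict) -> set:
    """Return the profile plus all items listed as related to its entries."""
    enriched = set(profile)
    for item in profile:
        enriched |= related.get(item, set())
    return enriched


# Hypothetical sparse profiles that share no item (cold-start situation).
alice = {"Artist A"}
bob = {"Artist B"}

# Hypothetical related-item lookup standing in for the semantic dataset.
related = {
    "Artist A": {"Artist B", "Artist C"},
    "Artist B": {"Artist A"},
}

before = jaccard(alice, bob)
after = jaccard(enrich(alice, related), enrich(bob, related))
print(before, after)
```

In this toy case the two original profiles have no overlap, while the enriched profiles share two of three items, so a neighborhood-based recommender would now have a nonzero similarity to work with.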
7.4.1.1 Finding Related Items Based on Encyclopedic Data
Our approach to the complex problem of computing entities to enrich the
user profile uses link prediction methods on a semantic dataset to find important
items related to a given input set of items (e.g., a user profile). The link prediction
task describes the problem of inferring missing links in an observed graph that are
likely to exist [32, 38]. In our approach, we apply link prediction to the task of
finding edges between items in the semantic dataset and a set of given entities of a
user profile.
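As a rough illustration of link prediction between a profile and candidate items, the following Python sketch scores every entity reachable from the profile by enumerating short simple paths. The weighting (each path of length d contributes `decay ** d`) is an assumption made for this example and is not the authors' formula; the graph, the function name `related_entities`, and its parameters are likewise hypothetical.

```python
from collections import defaultdict


def related_entities(graph, profile, max_depth=4, decay=0.5):
    """Score entities by the short paths that connect them to the profile.

    graph: dict mapping a node to an iterable of neighbor nodes.
    Every simple path of length d <= max_depth that starts at a profile
    entity adds decay ** d to the score of its end node (assumed weighting).
    """
    profile = set(profile)
    scores = defaultdict(float)

    def walk(node, depth, visited):
        if depth == max_depth:
            return  # search depth limit reached
        for nxt in graph.get(node, ()):
            if nxt in visited:
                continue  # keep paths simple (no repeated nodes)
            if nxt not in profile:
                scores[nxt] += decay ** (depth + 1)
            walk(nxt, depth + 1, visited | {nxt})

    for start in profile:
        walk(start, 0, {start})
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical semantic network: artists linked via a genre and a label.
graph = {
    "Artist A": ["Genre X", "Label L"],
    "Genre X": ["Artist A", "Artist C"],
    "Label L": ["Artist A", "Artist C", "Artist D"],
    "Artist C": ["Genre X", "Label L"],
    "Artist D": ["Label L"],
}

ranked = related_entities(graph, ["Artist A"], max_depth=2)
print(ranked)
```

Here "Artist C" is reached by two parallel two-step paths (via the genre and via the label), whereas "Artist D" is reached by only one, so "Artist C" receives the higher score and would be the stronger candidate for a predicted link to the profile.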
To compute related entities for a given set of input items, we determine the entities
best connected to the input entities already present in a user profile. In our scenario,
“best connected” to a set of input entities describes the items that can be reached via
several parallel paths, each consisting of a small number of edges. The computation
of the related entities can be performed directly on the semantic dataset (“memory-based”)
or based on a simplified network model (“model-based”). The semantic
dataset is modeled as a network consisting of nodes representing the entities and
edges describing the relationships between the entities (see Table 7.1). For computing
entities closely related to a given user profile, we take all existing entries in the user
profile as a starting point and traverse the semantic network (“path-based breadth-first
search”). Since an extensive search may require too many resources (CPU, RAM),
we introduce a parameter to control the search depth of our approach. In this work,
we use a maximum search depth of four, meaning that, starting from the user profile,
all nodes are considered that can be reached in four steps or fewer. All entities that
can be reached from entities in the user profile are weighted by the number of parallel
paths and by the number of edges in each path. The formulas for calculating the
path weights are shown in Fig. 7.10. An entity is the more relevant the more parallel
 