Personalized Information Access Using Semantic Knowledge - Smart Information Systems: Computational Intelligence for Real-Life Applications

7.4.1 Enriching User Profiles Based on Semantic Data

Our motivation is to cope with the cold-start problem. We therefore use semantic

encyclopedic knowledge to extend small user profiles. Studies of Wikipedia, as

an example of an online encyclopedia, have shown that the quality and accuracy of

its articles are high, which makes it a reliable information source [15].

We thus follow the idea that semantic encyclopedic data is a good and “neutral”

source for enriching user profiles with knowledge that is not influenced by subjective

opinions or tastes. Enriching a user profile with items strongly related to the items

already present in it adds “synonyms” for the existing entities. A synonym in

this context is an interest that is similar to a taste the user has already

expressed, e.g., an additional artist that is related to an artist in

the user profile. The goal is to increase the overlap of the enriched user profile with

other profiles. This improves the similarity calculation without changing the

taste of the user.
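
The effect of enrichment on profile overlap can be illustrated with a minimal sketch. The text does not name the similarity measure, so Jaccard similarity is used here purely as an assumption, and the RELATED table, the artist names, and the enrich helper are hypothetical stand-ins for the encyclopedic dataset.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical relatedness table standing in for the encyclopedic dataset.
RELATED = {
    "Artist A": {"Artist C"},
    "Artist B": {"Artist C"},
}


def enrich(profile: set[str]) -> set[str]:
    """Return the profile plus every item listed as related to one of its entries."""
    enriched = set(profile)
    for item in profile:
        enriched |= RELATED.get(item, set())
    return enriched


# Two small profiles that share no item before enrichment.
alice = {"Artist A"}
bob = {"Artist B"}

before = jaccard(alice, bob)
after = jaccard(enrich(alice), enrich(bob))
```

In this toy case the two profiles have no item in common before enrichment, while both enriched profiles contain "Artist C", so the similarity becomes non-zero although neither user's original entries were changed.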

7.4.1.1 Finding Related Items Based on Encyclopedic Data

To compute the entities used to enrich a user profile, we apply link prediction

methods to a semantic dataset in order to find important items related to a

given input set of items (e.g., a user profile). The link prediction

task is the problem of inferring links that are missing from an observed graph but are

likely to exist [32, 38]. In our approach, we apply link prediction to the task of

finding edges between items in the semantic dataset and a given set of entities from a

user profile.

To compute related entities for a given set of input items, we determine the entities

best connected to the input entities already present in a user profile. In our scenario,

the entities best connected to a set of input entities are those that can be reached via

several parallel paths, each consisting of a small number of edges. The computation

of the related entities can be performed directly on the semantic dataset (“memory-based”)

or on a simplified network model (“model-based”). The semantic

dataset is modeled as a network consisting of nodes representing the entities and

edges describing the relationships between the entities (see Table 7.1). To compute

entities closely related to a given user profile, we take all existing entries in the user

profile as starting points and traverse the semantic network (“path-based breadth-first

search”). Since an exhaustive search may require too many resources (CPU, RAM),

we introduce a parameter that limits the search depth. In this work,

we use a maximum search depth of four, meaning that, starting from the user profile,

all nodes that can be reached in four steps or fewer are considered. All entities that

can be reached from entities in the user profile are weighted by the number of parallel

paths and by the number of edges in each path. The formulas for calculating the

path weights are shown in Fig. 7.10. An entity is the more relevant the more parallel
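
The depth-limited, path-based traversal described in this paragraph can be sketched as follows. The actual weighting formulas are those of Fig. 7.10, which is not reproduced here; this sketch assumes instead that each path contributes decay ** length and that an entity's score is the sum over all paths reaching it. The function name related_entities, the decay parameter, and the toy graph are hypothetical.

```python
from collections import defaultdict, deque


def related_entities(
    graph: dict[str, list[str]],
    profile: set[str],
    max_depth: int = 4,
    decay: float = 0.5,
) -> list[tuple[str, float]]:
    """Rank non-profile entities by the paths that reach them from the profile.

    Assumption (not the formulas of Fig. 7.10): a path with n edges adds
    decay ** n to the score of the entity it ends at.
    """
    scores: dict[str, float] = defaultdict(float)
    # Each queue entry is (current node, path taken so far).
    queue = deque((start, (start,)) for start in sorted(profile))
    while queue:
        node, path = queue.popleft()
        depth = len(path) - 1  # number of edges used so far
        if depth == max_depth:
            continue  # search depth limit reached
        for neighbor in graph.get(node, ()):
            # Skip cycles, and do not route paths through other profile entities.
            if neighbor in path or neighbor in profile:
                continue
            scores[neighbor] += decay ** (depth + 1)
            queue.append((neighbor, path + (neighbor,)))
    # Highest score first; ties are broken by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy undirected network given as adjacency lists.
graph = {
    "A": ["X", "Y"],
    "B": ["X"],
    "X": ["A", "B", "Z"],
    "Y": ["A", "Z"],
    "Z": ["X", "Y"],
}
ranking = related_entities(graph, profile={"A", "B"})
```

Here "X" is adjacent to both profile entities and is therefore reached by two short parallel paths, which places it first in the ranking under the assumed weighting. Because every path is enumerated, the cost grows quickly with depth, which is why a limit such as max_depth is needed.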
