Semantic Movie Recommendations - Smart Information Systems: Computational Intelligence for Real-Life Applications - page 133

from the respective scenario, the node degree should be taken into account while

assigning the relatedness scores to semantic edges.

To find an adequate scaling model, analyzing the node degree distribution

is usually a good starting point for defining a dataset-specific scaling strategy.

Typical scaling models for social network data are based on power law probability

distributions. The scaling is done using logarithmic or polynomial scaling functions.

Adequate parameters should be learned from training data. In our experiments, we

found that the applied scaling models have a strong impact on the recommendation

quality [21]. An inadequate or missing scaling model cannot be compensated for in

later steps.
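
The text names logarithmic and polynomial scaling functions but does not give their exact form. The following minimal sketch shows one way such degree-based damping could look. The function name scale_edge_weight, the divide-by-degree form, and the exponent alpha are assumptions made for illustration; in practice the parameters would be learned from training data.

```python
import math


def scale_edge_weight(weight, degree, model="log", alpha=0.5):
    """Damp an edge weight according to the degree of an adjacent node.

    Hypothetical helper: the concrete scaling functions are assumptions,
    not the formulas used in the original experiments.
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    if model == "log":
        # Logarithmic scaling: degree 1 leaves the weight unchanged.
        return weight / (1.0 + math.log(degree))
    if model == "poly":
        # Polynomial scaling: alpha controls how strongly hubs are damped.
        return weight / (degree ** alpha)
    raise ValueError(f"unknown scaling model: {model}")


if __name__ == "__main__":
    for degree in (1, 10, 1000):
        print(degree,
              scale_edge_weight(1.0, degree, "log"),
              scale_edge_weight(1.0, degree, "poly"))
```

Under either model, an edge attached to a high-degree node contributes less than the same edge attached to a low-degree node.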

5.3.4.4 Path-Based Relatedness Models

Semantic datasets represent knowledge as a large graph consisting of nodes and

edges. In order to compute recommendations, a measure is needed for calculating the

relatedness between all node pairs taking into account the edges between the nodes.

In the previous sections, we explained how to assign relatedness scores to directly

connected nodes. In this section, we focus on edge algebras that also let us assign

relatedness scores to node pairs connected by complex paths (characterized by

parallel edges and long edge sequences).

First, we define criteria according to which the score for a complex path should be

computed [21]:

• If two nodes are directly connected by exactly one edge, the relatedness of the nodes is defined based on the edge weight.

• Two nodes are the more semantically related the more parallel paths between the nodes exist.

• Two nodes are the less semantically related the longer the path between the nodes.

We analyze three different approaches for combining the edge weights. Our

approaches are based on the distance between nodes. We define the distance between

two nodes as the reciprocal of the relatedness score. Thus, large distance values result

Table 5.2 Formulas for calculating the path weight $w$ for (a) parallel edges and (b) a sequence of edges with edge weights $w_0, w_1, \ldots, w_n$

      Weighted path                           Resistance distance                   Shortest path

(a)   $w = \sum_{i=0}^{n} w_i$                $w = \sum_{i=0}^{n} w_i$              $w = 1 / \min_{i} (1 / w_i)$

(b)   $w = \gamma^{n} \prod_{i=0}^{n} w_i$    $w = 1 / \sum_{i=0}^{n} (1 / w_i)$    $w = 1 / \sum_{i=0}^{n} (1 / w_i)$

The discount factor $\gamma$ ensures that short paths get a higher weighting than long paths
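
The three combination models named in Table 5.2 can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' implementation: the resistance-distance and shortest-path rules follow from treating the distance 1/w like an electrical resistance or a path length, and the weighted-path rule for a sequence is assumed to be a product of the edge weights discounted by gamma once per additional edge. The function names and the default gamma = 0.9 are hypothetical.

```python
from math import prod

MODELS = ("weighted_path", "resistance", "shortest_path")


def parallel_weight(weights, model):
    """Combine the weights of parallel edges between the same two nodes."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if model == "shortest_path":
        # Smallest distance 1/w wins, i.e. the largest weight.
        return max(weights)
    # Weighted path and resistance distance: parallel edges add up.
    return sum(weights)


def sequence_weight(weights, model, gamma=0.9):
    """Combine the weights of edges traversed one after another."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if model == "weighted_path":
        # Assumed form: product of weights, discounted per extra edge,
        # so a single edge keeps its own weight.
        return gamma ** (len(weights) - 1) * prod(weights)
    # Resistance distance and shortest path: distances 1/w add up.
    return 1.0 / sum(1.0 / w for w in weights)


if __name__ == "__main__":
    for model in MODELS:
        print(model,
              parallel_weight([0.5, 0.25], model),
              sequence_weight([0.5, 0.5], model))
```

With these rules, a single edge always keeps its own weight, additional parallel edges never lower the score, and every additional edge in a sequence lowers it (for weights of at most 1 in the weighted-path case), which matches the three criteria listed above.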
