Fig. 7.12 Path Length 4: Explanation of path-based enrichments over the Artist-Genre edge set.
The user can see the different nodes that were used for enrichment with Björk
path. Currently, the weight of an edge, which can be regarded as its importance,
has to be set manually or derived with a normalization strategy. One strategy
is to weight edges by how significant they are for connecting a node to the rest of
the dataset. If an edge is the only one connecting a node, as determined by the
node's degree, it is considered more important than an edge attached to a node that
has several other edges. For parallel edges or paths, the ratings are summed. For a
sequence of edges, the weights are multiplied and then weighted by a discount factor
that depends on the path length. In our system, we implemented the path-based
approach using a breadth-first search algorithm with a limited search depth [ 34 ].
The depth limit ensures that the computed results are relevant to the input items
and not only loosely connected to them: no item is taken into account if its path
length to the most relevant item exceeds the defined search limit.
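The scoring rules described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the function names degree_weights and path_scores, the 1 / min(degree) normalization, the discount ** length form of the discount factor, and the restriction to simple (cycle-free) paths are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def degree_weights(edges):
    """Hypothetical normalization: weight = 1 / min(deg(u), deg(v)).

    An edge that is the only connection of one of its endpoints
    (degree 1) receives the maximum weight of 1.0.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {(u, v): 1.0 / min(degree[u], degree[v]) for u, v in edges}


def path_scores(weights, source, max_depth=3, discount=0.5):
    """Score nodes reachable from `source` within `max_depth` edges.

    Along a path the edge weights are multiplied and the product is
    scaled by discount ** path_length (assumed form of the discount).
    Scores of parallel paths to the same node are summed.
    """
    # Build an undirected adjacency list from the weighted edge set.
    graph = defaultdict(list)
    for (u, v), w in weights.items():
        graph[u].append((v, w))
        graph[v].append((u, w))

    scores = defaultdict(float)
    # Each queue entry is one partial path:
    # (current node, weight product, path length, nodes already on the path).
    queue = deque([(source, 1.0, 0, frozenset([source]))])
    while queue:
        node, product, length, visited = queue.popleft()
        if length == max_depth:
            continue  # depth limit reached, do not expand further
        for neighbor, w in graph[node]:
            if neighbor in visited:
                continue  # only simple paths are considered
            new_product = product * w
            new_length = length + 1
            scores[neighbor] += new_product * discount ** new_length
            queue.append(
                (neighbor, new_product, new_length, visited | {neighbor})
            )
    return dict(scores)
```

With the weights {("A", "B"): 1.0, ("B", "C"): 0.5, ("A", "D"): 0.5, ("D", "C"): 0.5}, source "A", max_depth=2 and discount=0.5, node "C" is reached over two parallel paths of length 2, so its score is the sum of both contributions (0.125 + 0.0625). With max_depth=1, "C" is not scored at all, which mirrors the effect of the search depth limit.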
Another advantage of the path-based approach is that no additional effort is needed
for building a model. Thus, updates to the dataset immediately affect the computed
results.
7.4.1.3 Model-Based Predictions
Real-world datasets are often sparse and noisy. To cope with these problems,
we reduce the complexity of the dataset by aggregating similar entities into clusters.
To ensure that users still understand the computed recommendations, we use Hierarchical
Agglomerative Clustering [ 42 ] that combines entities with similar features in one