Perhaps the most compelling claim made for LSI is that it allows an information
retrieval system to retrieve documents that share no words with the query
(Deerwester et al. 1990; Dumais 1995). Another potentially appealing feature is
that the underlying semantic space lends itself to geometric representation. For
example, one can project the semantic space into a Euclidean space for a 2D or 3D
visualization. On the other hand, large, complex semantic spaces may not always
fit comfortably into low-dimensional spaces in practice.
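The claim about retrieving documents that share no words with the query can be
illustrated with a minimal sketch. It assumes NumPy, a toy term-document matrix
invented for this illustration, and the common practice of folding the query
into a rank-k space obtained by singular value decomposition; the function name
lsi_similarities is hypothetical and not taken from the cited papers.

```python
import numpy as np

# Toy term-document matrix (rows: terms, columns: documents), made up for
# illustration. Document 1 contains "automobile" and "engine" but not "car".
terms = ["car", "automobile", "engine", "flower", "petal"]
term_doc = np.array([
    [1, 0, 0],   # car
    [0, 1, 0],   # automobile
    [1, 1, 0],   # engine
    [0, 0, 1],   # flower
    [0, 0, 1],   # petal
], dtype=float)

def lsi_similarities(term_doc, query_vec, k):
    """Cosine similarity between a query and each document in a rank-k space."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    docs_k = Vt[:k, :].T                  # one row of coordinates per document
    query_k = (U_k.T @ query_vec) / s_k   # fold the query into the same space
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(query_k)
    return (docs_k @ query_k) / norms

query = np.array([1, 0, 0, 0, 0], dtype=float)   # the single word "car"
sims = lsi_similarities(term_doc, query, k=2)

# Document 1 shares no word with the query, yet it scores close to 1 because
# "car" and "automobile" both co-occur with "engine". Document 2 scores
# close to 0.
print(np.round(sims, 3))
```

In the raw term space the query and document 1 have a dot product of zero; the
similarity appears only after the reduction to two dimensions.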
3.2.2 Pathfinder Network Scaling
Pathfinder network scaling is a method originally developed by cognitive
psychologists for structural modeling (Schvaneveldt et al. 1989). It relies on a
triangle inequality condition to select the most salient relations from proximity data.
Pathfinder networks (PFNETs) have the same set of vertices as the original graph.
The number of edges in a Pathfinder network, on the other hand, can be greatly
reduced.
The notion of semantic similarity has been a long-standing theme in methods for
characterising semantic structures, including Multidimensional Scaling (Kruskal 1977),
Pathfinder (Schvaneveldt et al. 1989), and Latent Semantic Indexing (Deerwester
et al. 1990). The triangle inequality is an important property of a Euclidean space:
the distance between two points is less than or equal to the length of a path
connecting the two points via a third point. The triangle inequality is one of the
key concepts in Pathfinder network scaling, which uses it to select important links
for the final network representation.
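Proximity data such as similarity ratings need not satisfy the triangle
inequality, so it can be useful to see which direct links are undercut by a
path through a single third point. The following is a minimal sketch under
that reading; the function name and the three-point distance matrix are
hypothetical and chosen only for illustration.

```python
def links_violating_triangle_inequality(dist):
    """Map each pair (i, j) to the third points k that offer a shorter
    two-link path, i.e. dist[i][k] + dist[k][j] < dist[i][j]."""
    n = len(dist)
    violations = {}
    for i in range(n):
        for j in range(i + 1, n):
            shortcuts = [
                k for k in range(n)
                if k not in (i, j) and dist[i][k] + dist[k][j] < dist[i][j]
            ]
            if shortcuts:
                violations[(i, j)] = shortcuts
    return violations

# Symmetric toy distances between points 0, 1 and 2 (made up for illustration).
dist = [
    [0, 2, 6],
    [2, 0, 3],
    [6, 3, 0],
]

# The direct link 0-2 (length 6) is longer than the path 0-1-2 (length 5).
print(links_violating_triangle_inequality(dist))   # -> {(0, 2): [1]}
```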
Similarly, in Pathfinder the triangle inequality is checked not only between a
direct link and an alternative path through one other point, but also between a
direct link and all the possible routes connecting a given pair of points.
Such a route can contain at most N-1 links, where N is the number of points. In
terms of the metaphor of a traveling salesman, he may choose to visit all the
other cities before the final destination if this extraordinary travel plan makes
more sense to him than traveling to the destination directly. Semantically, if we
can assign meaning to such an indirect route, then the direct path becomes largely
redundant, and there is no need to consider it in later analysis. This is the
central idea of Pathfinder network scaling.
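The comparison of each direct link against all alternative routes can be
sketched as follows. This is an illustration rather than the published
procedure: it assumes that the length of a route is the sum of its link
weights (the Pathfinder literature also allows other ways of combining
weights), and it uses an all-pairs shortest-path pass, which implicitly covers
routes of up to N-1 links. The function name pathfinder_links and the example
weights are hypothetical.

```python
def pathfinder_links(weights):
    """Keep link (i, j) only if no alternative route is shorter than it.

    weights is a symmetric matrix of link distances with zeros on the
    diagonal. Route length is assumed to be the sum of link weights.
    """
    n = len(weights)
    shortest = [row[:] for row in weights]
    # Floyd-Warshall: after this loop shortest[i][j] is the length of the
    # shortest route between i and j over any number of intermediate points.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = shortest[i][k] + shortest[k][j]
                if via_k < shortest[i][j]:
                    shortest[i][j] = via_k
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if weights[i][j] <= shortest[i][j]
    }

# Four points with toy distances (made up for illustration).
weights = [
    [0, 1, 10, 4],
    [1, 0, 1, 10],
    [10, 1, 0, 1],
    [4, 10, 1, 0],
]

# Link 0-3 (length 4) survives every check through a single third point
# (both two-link paths have length 11), but the three-link route 0-1-2-3
# has length 3, so the direct link is dropped.
print(sorted(pathfinder_links(weights)))   # -> [(0, 1), (1, 2), (2, 3)]
```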
Pathfinder network scaling relies on a criterion known as the triangle inequality
condition to select the most salient relations from proximity data. Results of
Pathfinder network scaling are called Pathfinder networks (PFNETs), consisting
of all the vertices from the original graph. The number of edges in a Pathfinder
network, however, is determined by the intrinsic structure of semantics. On the one
hand, a Pathfinder network with the fewest possible edges is identical to a minimum
spanning tree. On the other hand, additional edges in a Pathfinder network indicate
salient relationships that might have been missed by a minimum spanning tree
solution.
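The relationship to the minimum spanning tree can be checked on small examples.
The sketch below rests on an assumption not stated in the passage above: the
weight of a route is taken to be its largest link weight, which is the setting
under which the sparsest Pathfinder network is usually described. Both function
names and both weight matrices are hypothetical.

```python
def minimum_spanning_tree(weights):
    """Kruskal's algorithm on a complete graph given as a weight matrix."""
    n = len(weights)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = set()
    candidates = sorted(
        (weights[i][j], i, j) for i in range(n) for j in range(i + 1, n)
    )
    for _, i, j in candidates:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.add((i, j))
    return tree

def pathfinder_minimax(weights):
    """Keep link (i, j) unless some route has a smaller largest link.

    Assumption: route weight = maximum link weight along the route.
    """
    n = len(weights)
    best = [row[:] for row in weights]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(best[i][k], best[k][j])
                if via_k < best[i][j]:
                    best[i][j] = via_k
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if weights[i][j] <= best[i][j]
    }

# Distinct weights: the Pathfinder network coincides with the spanning tree.
distinct = [
    [0, 2, 6, 10],
    [2, 0, 3, 5],
    [6, 3, 0, 4],
    [10, 5, 4, 0],
]
print(pathfinder_minimax(distinct) == minimum_spanning_tree(distinct))  # -> True

# Tied weights: the tree must drop one of three equally strong links,
# while the Pathfinder network keeps all of them.
tied = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
print(len(minimum_spanning_tree(tied)), len(pathfinder_minimax(tied)))  # -> 2 3
```

In the tied case the extra edge is exactly the kind of relationship that a
single spanning tree leaves out.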