Let $t_A$ and $t_B$ be two terms. The support defined in the document collection is as follows.
Definition 3. Support denotes the significance of the association of term $t_A$ and term $t_B$ in a collection, that is,
$$\mathrm{Support}(t_A, t_B) = \mathrm{significance}(t_A, t_B, T_r)$$
It is obvious that the support evaluated by tfidf satisfies the a priori condition.
3 Geometric Theory of Latent Semantic Space
The goal of this section is to model the internal semantics of a collection of documents using a set of geometric and topological notions, called a simplicial complex, which is a special form of hypergraph [15].
3.1 Simplicial Complex
Let us introduce and define some basic notions in combinatorial topology. The central notion is the $n$-simplex.
Definition 4. An $n$-simplex is a set of $n + 1$ independent abstract vertices $[v_0, \ldots, v_n]$. An $r$-face of an $n$-simplex $[v_0, \ldots, v_n]$ is an $r$-simplex $[v_{j_0}, \ldots, v_{j_r}]$ whose vertices are a subset of $\{v_0, \ldots, v_n\}$ with cardinality $r + 1$.
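The face relation in Definition 4 can be illustrated with a minimal sketch. It assumes an abstract simplex is represented simply as a collection of vertex labels, and it takes an r-face to be any subset of those vertices with cardinality r + 1, as the definition states. The function name `r_faces` and the labels `t0`, `t1`, `t2` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb


def r_faces(simplex, r):
    """Return all r-faces of an abstract simplex given as vertex labels.

    An r-face is any subset of the vertices with cardinality r + 1.
    """
    vertices = sorted(set(simplex))
    if r < 0 or r + 1 > len(vertices):
        return []
    return [frozenset(c) for c in combinations(vertices, r + 1)]


# A 2-simplex (triangle) on three hypothetical vertex labels.
triangle = ["t0", "t1", "t2"]
points = r_faces(triangle, 0)  # the 0-faces (vertices)
edges = r_faces(triangle, 1)   # the 1-faces (edges)
whole = r_faces(triangle, 2)   # the single 2-face, the simplex itself

# An n-simplex has C(n + 1, r + 1) faces of dimension r.
n = len(triangle) - 1
assert len(edges) == comb(n + 1, 1 + 1)
```

Representing faces as `frozenset` objects makes them hashable, so they can later be collected into a set when several simplexes share faces.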
Geometrically, a 0-simplex is a vertex; a 1-simplex is an open segment $(v_0, v_1)$ that does not include its end points; a 2-simplex is an open triangle $(v_0, v_1, v_2)$ that does not include its edges and vertices; a 3-simplex is an open tetrahedron $(v_0, v_1, v_2, v_3)$ that does not include any of its boundary. For each simplex, none of its proper faces (boundaries) is included. An $n$-simplex is the high-dimensional analogue of those low-dimensional simplexes (segment, triangle, and tetrahedron) in $n$-space. Geometrically, an $n$-simplex uniquely determines a set of $n + 1$ linearly independent vertices, and vice versa. An $n$-simplex is the smallest convex set in a Euclidean space $\mathbb{R}^n$ that contains $n + 1$ points $v_0, \ldots, v_n$ that do not lie in a hyperplane of dimension less than $n$. For example,
there is the standard $n$-simplex
$$\delta^n = \Big\{ (t_0, t_1, \ldots, t_n) \in \mathbb{R}^{n+1} \;\Big|\; \sum_i t_i = 1,\ t_i \ge 0 \Big\}$$
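As a small illustration of the standard simplex, the sketch below checks whether a coordinate tuple satisfies the two defining conditions (non-negative entries that sum to 1) and lists the unit vectors that serve as its vertices. The function names and the tolerance parameter `tol` are assumptions introduced here, not part of the original text; the tolerance only absorbs floating-point rounding.

```python
import math


def in_standard_simplex(t, tol=1e-9):
    """Check whether t = (t_0, ..., t_n) lies in the standard n-simplex.

    All coordinates must be non-negative and sum to 1, up to tol.
    """
    non_negative = all(ti >= -tol for ti in t)
    sums_to_one = math.isclose(sum(t), 1.0, abs_tol=tol)
    return non_negative and sums_to_one


def standard_simplex_vertices(n):
    """Return the n + 1 unit vectors of R^(n+1), the vertices of the standard n-simplex."""
    return [
        tuple(1.0 if i == j else 0.0 for j in range(n + 1))
        for i in range(n + 1)
    ]


inside = in_standard_simplex((0.25, 0.25, 0.5))    # non-negative, sums to 1
negative = in_standard_simplex((0.5, 0.75, -0.25))  # sums to 1 but has a negative entry
short = in_standard_simplex((0.5, 0.25))            # non-negative but sums to 0.75
corners = standard_simplex_vertices(2)              # three unit vectors in R^3
```

Each vertex returned by `standard_simplex_vertices` is itself a point of the simplex, which is a quick sanity check on both functions.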
The convex hull of any $m + 1$ vertices of the $n$-simplex is called an $m$-face. The 0-faces are the vertices, the 1-faces are the edges, the 2-faces are the triangles, and the single $n$-face is the whole $n$-simplex itself. Formally,