nodal numbering orders. Most network problems become computationally intractable
for more than a few hundred nodes.
We are interested in seeking useful metrics (functions of the C matrix) for the
description of the topology of large networks such as sub-nets of the internet which
might have from a hundred to a million nodes, and thus perhaps a trillion connection
matrix values. To be useful, the metrics must be (a) rapidly computable and (b) intuitively meaningful; they should (c) holistically summarize the underlying topology with a few variables and (d) ideally offer meaningful expansions that provide increasing levels of topological detail. Mathematically, they should also be (e) invariant under the permutation group on node numbering. We are specifically
interested in the information flows, i.e., which originating node sends email or data to which destination node; we are not initially interested in the underlying physical connectivity, nor in the path the information traverses. Internet transmissions are
extremely dynamic, and thus to achieve some form of continuity, we envision constructing the C matrix as the summation of information transfers over some time window δ surrounding a time t, giving C(t, δ) and thus representing the time evolution of
the connection matrix. Given the number of connections, this problem resembles the
representation of a physical gas in terms of thermodynamic variables (such as temperature, volume, pressure, heat, and entropy). Generally, in such internet environments there is no meaningful location or position metric, and distance is not usefully defined. As such, pressure and volume do not have a clear meaning without a distance function. Nor is it clear what equilibrium, if any, is being approached, and thus heat and temperature do not offer clear meanings either. However, we suggest that
the concept of both Shannon and generalized Renyi entropies [1, 2] can be well
defined and summarize the order and disorder in the underlying topological structure.
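Both entropies have standard closed forms; a minimal Python sketch (for an explicit probability vector, not yet for the connection matrix) illustrates how each summarizes order and disorder:

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits: -sum(p_i * log2(p_i))."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1):
    (1 / (1 - alpha)) * log2(sum(p_i ** alpha))."""
    return math.log2(sum(x ** alpha for x in p)) / (1 - alpha)

uniform = [0.25] * 4             # maximal disorder
peaked = [1.0, 0.0, 0.0, 0.0]    # all probability in one cell

print(shannon_entropy(uniform))   # 2.0 bits
print(shannon_entropy(peaked))    # 0.0 bits
print(renyi_entropy(uniform, 2))  # 2.0 bits
print(renyi_entropy(peaked, 2))   # 0.0 bits
```

As expected, a uniform distribution over four cells gives the maximum of 2 bits, while a fully concentrated one gives zero, for both the Shannon and the order-2 Renyi cases.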
Initially, how to define entropy on the connection matrix is not clear, since both Shannon and Renyi entropies are defined as the log of a sum of powers of the components of a vector x_i representing probabilities: S = c log_2(b Σ_i x_i^a), where Σ_i x_i = 1 and where a, b, and c are constants. As such, these entropies represent the disorder in
the underlying probability distribution. The disorder is a maximum with an even
probability distribution and is a minimum when all the probability is in one cell with
others having a value of zero. But the connection matrix columns or rows cannot be
used as probability distributions since the diagonal of C is totally arbitrary. Even if we
make some arbitrary choice of the diagonal values of C and normalize the columns, it
is not clear what underlying topological 'disorder' we are measuring. In this work,
we utilize our past work on the decomposition of the general linear group in order to
answer both of these objections and to gain insight into how one might define these
entropy metrics in useful ways that satisfy the requirements a-e above.
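As a concrete illustration of the windowed construction described above, the following Python sketch accumulates transfer records into C(t, δ) over a window surrounding time t. The event format, node identifiers, and transfer sizes are invented for the example, and the (arbitrary) diagonal entries are simply skipped:

```python
from collections import defaultdict

def connection_matrix(events, t, delta):
    """Sum transfer volumes over the time window [t - delta, t + delta].

    events: iterable of (time, source, destination, size) tuples
            (a hypothetical traffic-log format, assumed for illustration).
    Returns a sparse C(t, delta) as a dict keyed by (source, destination);
    diagonal entries (source == destination) are arbitrary and skipped here.
    """
    C = defaultdict(float)
    for when, src, dst, size in events:
        if t - delta <= when <= t + delta and src != dst:
            C[(src, dst)] += size
    return C

# Made-up transfer log: (time, source node, destination node, bytes).
events = [
    (1.0, 0, 1, 100),
    (1.5, 0, 2, 50),
    (2.0, 1, 2, 200),
    (9.0, 2, 0, 400),  # falls outside the window chosen below
]
C = connection_matrix(events, t=1.5, delta=1.0)
print(sorted(C.items()))
```

Sliding t while holding δ fixed yields the time evolution of the connection matrix; the remaining question, addressed next, is how to extract a well-defined entropy from such a matrix.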
2 Background on Markov Lie Groups and Monoids
We had previously shown [3] that the transformations in the general linear group in n
dimensions, that are continuously connected to the identity, can be decomposed into
two Lie groups: (1) an n(n-1) dimensional 'Markov type' Lie group that is defined by
preserving the sum of the components of a vector, and (2) the n dimensional Abelian
Lie group, A(n), of scaling transformations of the coordinates. To construct the
Markov type Lie group, consider the k,l matrix element of a matrix L^ij as a basis for n