Biology Reference
In-Depth Information
a typical node in the network. It has
h
k
i
nearest neighbors,
SUCCESSES AND FAILURES OF THE
ERD ˝ SeR ´ NYI MODEL
Biological Small Worlds
The Erd˝s e R´nyi network model is greatly oversimplified
and therefore overlooks many important features observed
in real biological networks. Nevertheless, this model does
prove successful in predicting the overall connectivity
observed is practically all analyzed biological network.
These networks all feature a giant connected component,
such that almost all pairs of nodes are connected by finite
network paths. Moreover, the average path length is found to
be consistent with Eq. (1) , so that the interacting nodes in the
network are typically just a few steps away fromone another,
meaning that cellular networks, like many other networks in
nature, feature the small world effect. For instance, in
metabolism it was found that most pairs of metabolites can
be linked by paths averaging approximately three edges in
length. These extremely short average path lengths are not
unique to any specific species: they were found in as many as
43 different species, ranging from the evolutionary reduced
metabolic network of parasitic bacteria to the highly
developed networks of large multicellular organisms [31] .
Similar, albeit less dramatic, results apply for protein and
genetic interaction networks, where the average path length
ranges from about four to eight edges [32 e 33] .
all at a distance of l
¼
1 from it; each of these neighbors
has on average
h
k
i
neighbors of its own, so that there are
2 nodes at a distance of l
roughly
2. Following this
logic, we find that the number of nodes within a given
distance from any typical node inflates exponentially,
leading to a logarithmic dependence of the average path
length on the network size, N [27]
h
k
i
¼
l z
ln N
ln
i :
(1)
h
k
This simplified argument overlooks the possibility
that some of the edges might be redundant, i.e., that some
of the links emerging from nodes at a distance l might
link to other nodes which are at the same distance. Still,
for large N it captures the behavior of the network at the
vicinity of a node, where the probability of linking to an
already used node, and thereby creating a loop, is very
small.
In a directed network, network paths are commonly
defined to propagate only in the direction of the edges. This
way the paths reflect the flow of information between the
nodes. The network distance between a pair of nodes is no
longer symmetrical, as displayed in Figure 9.1 (b). The
distance from node 1 to node 8 in the directed network is
l 1 ; 8 ¼
4 in the undirected version).
However, no path exists in the opposite direction, rendering
l 8 ; 1 infinite.
6 (as opposed to l 1 ; 8 ¼
Deviations from the Erd ˝ seR´nyi Model
In most cellular networks a tendency to form cliques is
observed, where the neighbors of one node tend to be
themselves connected. Thus the average clustering coeffi-
cient,
Clustering Coefficient
Certain networks show a tendency to form clusters of
interconnected nodes [28] . In such networks, if the nodes
y and z are both connected to some other node, x, it is likely
that there is also a direct link between y and z themselves.
To quantify this we denote the number of links connecting
x 's nearest neighbors to one another by c x ,. This number
can take values ranging from zero, in the case that no such
links exist, to k x ð
, of most cellular networks is significantly larger
than that of an equivalent Erd ˝ s e R ´ nyi network. By an
equivalent network, we refer to an Erd ˝ s e R ´ nyi network
with the same size and average degree. For instance, pro-
tein e protein interaction networks feature a clustering
coefficient which is typically about an order of magnitude
higher than that observed in their randomly rewired
equivalents [32] . Similar findings also characterize meta-
bolic networks [34 e 35] .
The emergence of high clustering provides the first hint
towards the recognition that the Erd˝s e R´nyi model cannot
account for the topological properties of realistic networks.
However, the most significant indication in that direction
comes from the degree distributions observed in actual
networks. In contrast to the Poisson degree distribution,
which is the fingerprint of the Erd˝s e R´nyi graph, cellular
networks consistently follow a power-law degree distribu-
tion [2,5,31,36 e 45] , predicting that the probability for
a randomly chosen node to have exactly k links is given by
h
C
i
2 in the case that all of the pairs
among x 's k x nearest neighbors are connected. The clus-
tering coefficient
k x
1
2c x
is thus defined as C x ¼
,
k x ð
k x
1
Þ
taking values which are between zero and unity [29 e 30] .
For instance, the clustering of node four in Figure 9.1 is
C 4 ¼
2
2
2. Averaging over all the nodes in the
network gives the average clustering coefficient
4 ¼
0
:
5
.For
an Erd ˝ s e R ´ nyi graph the probability of all pairs of nodes
being linked is uniform, regardless of whether they share
a mutual neighbor or not. The average clustering coefficient
is thus
h
C
i
h
C
i ER ¼
p, which, for a sparse graph, is typically
k g ;
ð
Þ w
(2)
P
k
very small.
Search WWH ::




Custom Search