Information Technology Reference
In-Depth Information
In the case of metabolic networks, an arc can be imagined as an enzyme
catalyzing a chemical reaction that transforms one metabolite into another. The
adjacency matrix formalization has proved very useful in describing many network
structures, and the literature adopting this notation is particularly rich. The
adjacency matrix allows the development of a straightforward metric to compare
different metabolic networks. The same network module (such as glycolysis, purine
metabolism, amino acid biosynthesis...) gives rise to a distinct adjacency matrix
for each organism.
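As a minimal sketch of this representation, the following Python fragment builds a 0/1 adjacency matrix from a list of arcs. The function name adjacency_matrix and the metabolite labels A, B, C are hypothetical placeholders chosen for illustration; they do not come from any specific pathway database.

```python
def adjacency_matrix(arcs, metabolites):
    """Return a 0/1 matrix whose entry [i][j] is 1 when an arc i -> j exists."""
    index = {name: k for k, name in enumerate(metabolites)}
    matrix = [[0] * len(metabolites) for _ in metabolites]
    for substrate, product in arcs:
        # One arc = one enzyme-catalyzed transformation substrate -> product.
        matrix[index[substrate]][index[product]] = 1
    return matrix


# Hypothetical two-step module, for illustration only.
metabolites = ["A", "B", "C"]
arcs = [("A", "B"), ("B", "C")]
matrix = adjacency_matrix(arcs, metabolites)
print(matrix)  # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
```

Rows index the transformed metabolite and columns the resulting one, so the matrix is in general not symmetric.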
All these matrices have the same set of rows and columns corresponding to
the maximal coverage of the whole set of intervening metabolites (it is enough
that a given metabolite is present in a single network to allow inclusion). The
distance between each pair of networks will be simply set to the Hamming
distance between the two networks, i.e. to the number of discrepancies (1 vs. 0 or
0 vs. 1) scored in the corresponding elements of the two adjacency matrices.
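The two steps just described (shared rows and columns, then a count of discrepancies) can be sketched as follows. This is an illustrative fragment under stated assumptions, not the authors' implementation: the helper names and the two toy organisms are hypothetical.

```python
def union_metabolites(*arc_lists):
    """Collect every metabolite that appears in at least one network."""
    names = set()
    for arcs in arc_lists:
        for substrate, product in arcs:
            names.add(substrate)
            names.add(product)
    return sorted(names)


def adjacency_matrix(arcs, metabolites):
    """Build a 0/1 matrix over the shared metabolite list."""
    index = {name: k for k, name in enumerate(metabolites)}
    matrix = [[0] * len(metabolites) for _ in metabolites]
    for substrate, product in arcs:
        matrix[index[substrate]][index[product]] = 1
    return matrix


def hamming_distance(matrix_a, matrix_b):
    """Count the entries where one matrix holds 1 and the other holds 0."""
    return sum(
        a != b
        for row_a, row_b in zip(matrix_a, matrix_b)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical versions of the same module in two organisms.
arcs_org1 = [("A", "B"), ("B", "C")]
arcs_org2 = [("A", "B"), ("B", "D")]

shared = union_metabolites(arcs_org1, arcs_org2)  # ["A", "B", "C", "D"]
distance = hamming_distance(
    adjacency_matrix(arcs_org1, shared),
    adjacency_matrix(arcs_org2, shared),
)
print(distance)  # 2: the arcs B -> C and B -> D each occur in only one network
```

Metabolite D enters the shared set although it is absent from the first organism, which is exactly the inclusion rule stated above.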
In order to make the metric independent of the number of analyzed variables
(metabolites), we divide the sum of the discrepancies by the total number of
variables (maximal attainable distance) and multiply the ratio by 100. Thus, we
obtain a 'percentage of dissimilarity' ranging from 0 (complete equivalence of
the two networks) to 100.
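A short sketch of this normalization is given below. It rests on one assumption that should be read with care: the divisor is taken to be the number of compared matrix entries, since that is the largest Hamming distance two matrices of the same size can reach and it keeps the result between 0 and 100. The function name percent_dissimilarity and the two toy matrices are hypothetical.

```python
def percent_dissimilarity(matrix_a, matrix_b):
    """Hamming distance between two 0/1 matrices, rescaled to 0-100."""
    flat_a = [entry for row in matrix_a for entry in row]
    flat_b = [entry for row in matrix_b for entry in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("matrices must be non-empty and share rows and columns")
    discrepancies = sum(a != b for a, b in zip(flat_a, flat_b))
    # Assumption: normalize by the number of compared entries, i.e. the
    # maximal attainable distance for matrices of this size.
    return 100.0 * discrepancies / len(flat_a)


# Hypothetical 4 x 4 adjacency matrices differing in two entries.
org1 = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
org2 = [[0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
print(percent_dissimilarity(org1, org2))  # 12.5
```

Two identical matrices score 0, and two matrices that disagree in every entry score 100.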
It is worth noting that the application of the Hamming distance is made possible
by the particular character of metabolic networks, in which an edge always has
the same meaning, namely that 'metabolite i can be transformed into
metabolite j'; self-catalytic cycles do not exist, and the great majority of
metabolites are shared among the different organisms.
This kind of metric would be unfeasible for other, more complex networks such as
gene regulation or protein-protein interaction networks. The above metric, when
applied to a set of n different networks, yields a symmetric n x n
dissimilarity matrix conveying all the information about the pairwise
similarities between the corresponding organisms in terms of the metabolic
module analyzed.
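The construction of the n x n matrix can be sketched as follows, assuming each network is given as a set of (substrate, product) arcs. The function name dissimilarity_matrix, the organism labels and the arcs are hypothetical, and the normalization by the number of entries of the shared adjacency matrix is an assumption made so that values stay between 0 and 100.

```python
from itertools import combinations


def dissimilarity_matrix(networks):
    """Return (organism names, symmetric matrix of percentage dissimilarities).

    `networks` maps an organism name to its set of (substrate, product) arcs.
    """
    names = sorted(networks)
    # Shared rows/columns: every metabolite present in at least one network.
    metabolites = {m for arcs in networks.values() for arc in arcs for m in arc}
    # Assumption: normalize by the number of entries of the shared matrix.
    n_entries = len(metabolites) ** 2
    if n_entries == 0:
        raise ValueError("at least one network must contain an arc")
    result = [[0.0] * len(names) for _ in names]
    for (i, first), (j, second) in combinations(enumerate(names), 2):
        # Over a shared metabolite set, the mismatching matrix entries are
        # exactly the arcs present in only one of the two networks.
        discrepancies = len(set(networks[first]) ^ set(networks[second]))
        result[i][j] = result[j][i] = 100.0 * discrepancies / n_entries
    return names, result


# Three hypothetical organisms sharing one module.
networks = {
    "org1": {("A", "B"), ("B", "C")},
    "org2": {("A", "B"), ("B", "D")},
    "org3": {("A", "B"), ("B", "C"), ("C", "D")},
}
names, matrix = dissimilarity_matrix(networks)
for name, row in zip(names, matrix):
    print(name, row)
```

The diagonal is zero by construction and each pair is computed once and mirrored, which makes the symmetry of the matrix explicit.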
Because the dissimilarity matrix is fully quantitative, it can be analyzed by means
of the whole range of multidimensional statistical techniques (multidimensional
scaling, principal component analysis) and can also serve as the basis for the
construction of similarity trees. These can be considered 'classification' trees
analogous to those based upon any other biological character amenable to a given
metric. The idea of metabolic pathway comparison has been exploited by many
authors, from the pioneering work of Dandekar and colleagues to more
recent publications. The method we present here is simpler than the above