Information Technology Reference
In-Depth Information
4
Subgroup Identification
When validating simulated networks generated from agent-based simulations, merely
looking at the average graph properties may be inadequate. Different generative me-
chanisms may lead to varying subgraph characteristics in the simulated network even
if their global characteristics may be similar. A comparison of global and subgraph
network characteristics is necessary when a generative mechanism is introduced into
agent-based simulations. Global properties of a given dynamic network may give
clues about the robustness of the underlying processes, whilst local properties may
show the variability that may occur for different settings for the same processes. Sub-
graph analysis may also be suitable for networks with changing ties and nodes.
The ratio of the number of particular subgroups in the 'real' or simulated network
(compared to their random network counterpart in terms of nodes and edges) is inde-
pendent of the network's size and thus is a good candidate for analyzing dynamical
networks when both the edge and node sets changes over time. The social network
analysis community has long known network substructures as 'fragments', especially,
'triads' for testing hypotheses such as balance, transitivity, and power [37, 15]. Milo
et al. [28], inspired from sociological foundations, introduced the concept of local
structures as statistically significant 'motifs', to study 'building blocks' of complex
networks. The process includes counting the number of occurrences for each
subgraph that appears in the network. This is followed by testing for statistical signi-
ficance for each subgraph by comparison with randomized networks. Faust [17] com-
pared networks of different sizes and domains w.r.t. their subgraph and lower-order
properties, i.e., the nodal degree distributions and dyad census.
Graph clusters, in a social context, are called communities we briefly look at three
approaches to their definition and detection.
Community Detection [31] . This is a technique to identify subgraphs where nodes
are closely linked, i.e., when there are more edges between the nodes of the subgraph
than external links to other nodes. There are many algorithms available to identify
such clusters, see e.g., [8, 9]. To identify how closely connected these communities
are, the concept of modularity was introduced [31], which is the fraction of the edges
that fall within the groups (or communities) minus the expected fraction under the
assumption that the edges were randomly distributed. It ranges from 0 to 1, the higher
the value the more cohesive the community. An average may be used for comparing
networks.
For graphs where some attributes of nodes are known (e.g., demographic data),
several techniques may be used to match the semantic structure of the graphs. These
can be used to identify which parameters may be more important (e.g., [1, 2]). Two
are affinity and the silo index, now discussed.
Affinity [29] . Affinity measures the ratio of the fraction of edges between attribute-
sharing nodes, relative to what would be expected if attributes were randomly distri-
buted. It ranges from 0 to infinity. Values greater than 1 indicate a positive correlation
whereas values between 0 and 1 have a negative correlation. For an attribute, say
Search WWH ::




Custom Search