Despite the simplicity of the data set, a large number of graphs can be
generated from it. The key questions during the original investigation
concerned spatial patterns in species assemblages, and the relationship
of any such patterns to contamination (heavy metal concentrations).
Spatial patterns in species assemblages can be explored using sites as nodes,
and edges generated on the basis of species attribute data. To create this graph,
we needed only to select site name as entities, and species id as attributes
in the graphical interface. Both of these variables were recognised by the data
dictionary as categorical, and so no discretisation was needed. An edge weighting
function suitable for species data was selected. This function is based on the
Bray-Curtis dissimilarity, which is commonly used with ecological data:
\[
w_{ij} = 1 - \frac{\sum_k |x_{ik} - x_{jk}|}{\sum_k (x_{ik} + x_{jk})}, \tag{1}
\]
where $w_{ij}$ denotes the weight of the edge from node $i$ to node $j$, and
$x_{ik}$ denotes the $k$th attribute of node $i$.
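As a rough illustration of Eq. (1), the sketch below computes the edge weight for a pair of nodes, assuming the weight is one minus the Bray-Curtis dissimilarity in its standard ratio-of-sums form. The function name, the zero-denominator convention, and the example vectors are assumptions made for illustration and are not taken from the original software.

```python
from typing import Sequence


def bray_curtis_weight(x_i: Sequence[float], x_j: Sequence[float]) -> float:
    """Edge weight w_ij = 1 - Bray-Curtis dissimilarity between two nodes."""
    if len(x_i) != len(x_j):
        raise ValueError("attribute vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(x_i, x_j))
    denominator = sum(a + b for a, b in zip(x_i, x_j))
    if denominator == 0:
        # Assumption: two nodes with no recorded attributes get weight 0.
        return 0.0
    return 1.0 - numerator / denominator


# Hypothetical presence/absence vectors over five species (made-up data).
site_a = [1, 1, 0, 1, 0]
site_b = [1, 0, 0, 1, 1]
site_c = [0, 0, 1, 0, 0]

w_ab = bray_curtis_weight(site_a, site_b)  # two of three species shared
w_ac = bray_curtis_weight(site_a, site_c)  # no species in common
w_aa = bray_curtis_weight(site_a, site_a)  # identical composition
```

Under this form the weight ranges from 0 (no attributes in common) to 1 (identical attribute vectors), so pruning weak edges removes the least similar pairs first.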
The resultant graph is shown in Fig. 1. Weak edges have been pruned, leaving
a core structure of two distinct clusters of sites: the left-hand cluster corresponds
to sites from O'Brien Bay, and the right-hand cluster to sites from Brown Bay. This
strong clustering suggests that the species assemblages of the two bays are distinct. As well
as this broad two-cluster structure, the graph provides other information about
the species composition of the sites. Each cluster shows spatial autocorrelation;
that is, samples from a given site in a given bay are most similar to other
samples from the same site (e.g. BB3 nodes are generally linked to other BB3
nodes). The colouring of the nodes reflects the number of species within a site
(grey = low, black = high), and indicates that the contaminated Brown Bay sites
have lower species diversity than the uncontaminated O'Brien Bay sites.
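The steps described above (choosing entities and attributes, weighting edges, pruning weak edges, and reading off clusters) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the function names, the threshold values, the site labels other than BB3, and the species ids are all hypothetical, and clusters are approximated here by connected components of the pruned graph.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(records, threshold):
    """Build a pruned graph from (entity, attribute) pairs, e.g. (site, species)."""
    attrs = defaultdict(set)
    for entity, attribute in records:
        attrs[entity].add(attribute)
    edges = {}
    for a, b in combinations(sorted(attrs), 2):
        shared = len(attrs[a] & attrs[b])
        total = len(attrs[a]) + len(attrs[b])
        # For presence/absence data, 1 - Bray-Curtis equals 2 * shared / total.
        weight = 2 * shared / total
        if weight >= threshold:  # prune edges weaker than the threshold
            edges[(a, b)] = weight
    return set(attrs), edges


def clusters(nodes, edges):
    """Return the connected components of the graph as a list of node sets."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


# Made-up records: (site name, species id) pairs.
records = [
    ("OB1", "sp1"), ("OB1", "sp2"), ("OB1", "sp3"),
    ("OB2", "sp1"), ("OB2", "sp2"), ("OB2", "sp4"),
    ("BB1", "sp5"), ("BB1", "sp6"),
    ("BB3", "sp5"), ("BB3", "sp6"), ("BB3", "sp3"),
]

nodes, loose_edges = build_graph(records, threshold=0.2)
_, strong_edges = build_graph(records, threshold=0.5)
loose_clusters = clusters(nodes, loose_edges)
strong_clusters = clusters(nodes, strong_edges)
```

With the lower threshold the single shared species between OB1 and BB3 keeps all four sites in one component; raising the threshold removes that weak edge and separates the two groups of sites.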
An alternative view of the data can be generated by swapping the definitions
for entity and attribute, giving a graph of species id nodes with edges
calculated on the basis of site id attribute data. Fig. 2 shows four snapshots of
this graph, captured during an interactive exploration in which weak edges were
progressively removed. The sequence of graphs shows the emergence of two clusters
of nodes, and confirms the presence of two broad species assemblages. However, the
most commonly observed species (darkest node colours) lie in the centre of the graph,
with two sets of less commonly observed species on the left and right peripheries.
This indicates that the central species are seen across a range of
sites (and hence have links to the majority of species), whereas the species on
the peripheries are seen at restricted sets of sites. This may have
implications if we wish to characterise the environmental niches of species. We
can investigate further by interactively adjusting the visible neighbourhood of
the graph. Fig. 3a shows the same graph as Fig. 2b but focused on the GammIIA
species node, with only the immediate neighbours of that node made
visible. This species has direct links to only four other species, and was seen at
relatively few sites. This suggests that GammIIA might only be present in certain