Geoscience Reference
In-Depth Information
different they are. The four highest-level groups include two clusters of stream samples, a
single cluster of estuary samples, and a small cluster of wastewater treatment plant (WTP)
samples that lie intermediate between the stream and estuary samples. Overall, the dendrogram
suggests that in June 2006, there were greater differences among stream sites than between
estuary and WTP sites. In addition, the analysis distinguishes the sites located at
the head of the Hansted stream system (Figure 10.1), in catchments that are less impacted
by agriculture (R13, R14), from the other sites (Stedmon et al., 2006). Trends in the full
data set will be explored in detail in the remainder of the chapter using further chemometric
techniques.
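
As a rough illustration of the kind of hierarchical cluster analysis that produces such a dendrogram, the sketch below clusters a small synthetic data set with SciPy. The sample names, the intensity values, the choice of Ward linkage on Euclidean distances, and the cut into three clusters are all assumptions made for the example; they are not the data or settings used for the Hansted samples.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical data: each row is a sample, each column a fluorescence
# intensity at one excitation/emission wavelength pair (made-up values).
centers = {
    "stream": np.array([1.0, 0.8, 0.3, 0.2]),
    "estuary": np.array([0.3, 0.2, 0.9, 0.7]),
    "wtp": np.array([0.6, 0.5, 0.6, 1.2]),
}
names, rows = [], []
for group, center in centers.items():
    for i in range(4):
        names.append(f"{group}_{i}")
        rows.append(center + rng.normal(scale=0.02, size=center.size))
X = np.vstack(rows)

# Agglomerative clustering: Ward linkage on Euclidean distances
# (an assumed choice). Z encodes the merge tree behind a dendrogram.
Z = linkage(X, method="ward")

# Cut the tree so that at most three clusters remain.
labels = fcluster(Z, t=3, criterion="maxclust")

for name, label in zip(names, labels):
    print(name, label)
```

Passing `Z` and `names` to `scipy.cluster.hierarchy.dendrogram(Z, labels=names)` draws the tree itself when matplotlib is available. The height at which two branches merge then indicates how dissimilar the corresponding groups of samples are.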
A range of clustering and other exploratory techniques have been used to visualize and
interpret DOM fluorescence data sets. Jiang (2008) used hierarchical cluster analysis to
investigate sources of DOM in the Bohai Sea of China. Nelson (2009) used distance measures,
cluster analysis, and multidimensional scaling coupled with classical statistics to
examine how similarity in DOM compositions in montane lake chains and their connecting
streams related to landscape position and catchment characteristics. Brunsdon and Baker
(2002) presented a new tool for exploring and visualizing EEMs, termed principal filters
analysis (PFA), that identifies periods of high variability in data sets consisting of
fluorescence EEMs represented in a time series. They used this technique to identify three distinct
periods with different fluorescence characteristics in the development of a stalagmite
during the last 10,000 years (Holocene period).
A type of adaptive artificial neural network called a self-organizing map (SOM) has
been used to visualize and identify patterns in fluorescence EEMs from streams,
reservoirs, and wastewaters (Bieroza et al., 2009). The technique may be considered a nonlinear
extension of PCA in which the loadings of the principal components are generalized from
straight lines to curves, which may be useful for data that have a strongly nonlinear
underlying distribution. The data set is reduced to a series of low-dimensional maps in which
similar objects are clustered close together and dissimilar objects grouped further apart,
with distances between objects depicted qualitatively using a graduated color scheme (e.g.,
the “U-Map” method). Extensions to SOM, such as visualization-induced self-organizing
map (ViSOM), have been proposed to simplify visualization relative to traditional SOMs
(Yin, 2002).
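
To make the mapping idea concrete, the following is a minimal NumPy sketch of a generic SOM, not the implementation used in the cited studies. The grid size, learning-rate and neighborhood schedules, the synthetic two-group data, and the function names `train_som`, `best_matching_unit`, and `u_matrix` are all assumptions chosen for illustration. The last function computes, for each map node, the mean distance to its grid neighbors, which is one common way to obtain the values that a graduated color scheme would display.

```python
import numpy as np

def train_som(data, grid=(5, 5), n_iter=2000, lr0=0.5,
              sigma0=2.0, sigma_end=0.5, seed=0):
    """Train a small rectangular SOM; returns (weights, grid coordinates)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = grid
    # Grid position of every node, row-major, shape (n_nodes, 2).
    coords = np.array([(r, c) for r in range(n_rows)
                       for c in range(n_cols)], dtype=float)
    # Initialise node weights from randomly chosen samples.
    weights = data[rng.integers(0, len(data), size=len(coords))].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                        # decaying learning rate
        sigma = sigma0 + (sigma_end - sigma0) * frac   # shrinking neighborhood
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: the node whose weights are closest to x.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighborhood measured on the grid, centred on the BMU.
        d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights, coords

def best_matching_unit(weights, x):
    """Index of the node closest to sample x."""
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))

def u_matrix(weights, grid):
    """Mean weight-space distance from each node to its 4 grid neighbors."""
    n_rows, n_cols = grid
    w = weights.reshape(n_rows, n_cols, -1)
    u = np.zeros((n_rows, n_cols))
    for r in range(n_rows):
        for c in range(n_cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    dists.append(np.linalg.norm(w[r, c] - w[rr, cc]))
            u[r, c] = np.mean(dists)
    return u

# Synthetic stand-in for unfolded EEMs: two well-separated sample groups.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.0, scale=0.01, size=(20, 3))
group_b = rng.normal(loc=1.0, scale=0.01, size=(20, 3))
data = np.vstack([group_a, group_b])

grid = (5, 5)
weights, coords = train_som(data, grid=grid)
bmus_a = {best_matching_unit(weights, x) for x in group_a}
bmus_b = {best_matching_unit(weights, x) for x in group_b}
umat = u_matrix(weights, grid)
```

In this sketch, samples from the two groups land on different map nodes, and large values in `umat` mark the boundary between the regions of the map that each group occupies. For real EEMs, each sample would first be unfolded into a single vector of intensities.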
Data sets consisting of fixed-wavelength scans or fixed-offset synchronous scans, with
each sample represented by a vector of data, have been examined using multivariate curve
resolution (MCR; Tauler et al., 1995; Antunes and Esteves Da Silva, 2005; Abbas et al.,
2006). The technique is also applicable to, for example, sets of emission spectra or sets of
excitation spectra. The MCR technique attempts to explicitly recover the pure response
profiles explaining the chemical variance observed in multivariate matrices, and can
therefore provide more physically interpretable results than some other exploratory methods,
such as PCA. As in PCA, the MCR model provides scores and loadings, but in MCR the
score and loading vectors are not required to be orthogonal. Instead, they are often
constrained to be nonnegative. This yields scores that resemble estimates of concentrations
and loadings that resemble estimates of spectra. However, it is important to be careful not to