Biomedical Engineering Reference
are spiked into the solvent used to dissolve freeze-dried plant material. The
reason for this is to ensure there is a constant amount of standard in each
sample so that instrumental response may be normalised. This avoids
amplitude-based errors such as instrument drift, sample dilution or
concentration. A KNIME workflow (Figure 4.14) can be created that
identifies internal standards, averages the values and then divides each row
in the original data by the internal standard. Although KNIME has many
nodes for data manipulation, as yet there are none that allow mathematical
functions to be applied to rows or columns within a data set, so a custom
R node (labelled 'R-Snippet') can be used to do the division.
In cases where internal standards are not available, several other
methods are possible, one of the most common being total signal
normalisation, where each observation is divided by the total signal for
that observation. In this way dilution effects may be eliminated. As with
all normalisation methods, it is helpful to study replicate or pooled
samples to see the effect of normalisation. If correctly normalised, these
samples should cluster into a tight group.
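Total signal normalisation can be sketched in a few lines of base R. This is a minimal illustration rather than a node from the workflow: the data frame `dat`, its column names and its values are made up, and it assumes one column per observation (sample) and one row per measured variable.

```r
# Toy data: one ID column followed by three sample columns (values are made up).
dat <- data.frame(
  id = c("met1", "met2", "met3"),
  s1 = c(10, 30, 60),
  s2 = c(20, 60, 120),   # same profile as s1 at twice the concentration
  s3 = c(5, 5, 10)
)

mdata  <- as.matrix(dat[, 2:4])   # numerical part of the data frame
totals <- colSums(mdata)          # total signal for each observation (column)

# Divide every column (MARGIN = 2) by its own total signal
normalised <- sweep(mdata, 2, totals, "/")

# Recombine the ID column with the normalised data
result <- cbind(dat[1], normalised)
print(result)
```

In this toy case `s1` and `s2` differ only by a dilution factor, so they become identical after normalisation, which is the behaviour expected of replicate or pooled samples.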
The R code for the Internal Standard node is shown below.
intstd <- R[28, 8:35]   # get int_std row (row 28, data columns 8 to 35)
mdata <- R[, 8:35]      # get numerical part of data frame
# divide each column of mdata by the matching int_std value
normalised <- sweep(as.matrix(mdata), 2, as.matrix(intstd), '/')
R <- cbind(R[1:7], normalised)   # recombine IDs with data and output
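The code above takes the internal standard from a single fixed row (row 28). Where several internal standards are present and their values are to be averaged first, one possible reading of that step is sketched below. The `type` flag column, the compound names and all values are hypothetical and serve only to make the example self-contained.

```r
# Toy table: rows are compounds, columns s1..s3 are samples (values are made up).
R <- data.frame(
  name = c("met1", "met2", "istd_a", "istd_b"),
  type = c("analyte", "analyte", "int_std", "int_std"),  # hypothetical flag column
  s1 = c(10, 40, 1, 3),
  s2 = c(30, 60, 4, 6),
  s3 = c(5, 20, 2, 2)
)

is_std <- R$type == "int_std"     # identify the internal standard rows
mdata  <- as.matrix(R[, 3:5])     # numerical part of the data frame

# Average the internal standards within each sample (column)
intstd <- colMeans(mdata[is_std, , drop = FALSE])

# Divide each row element-wise by the averaged internal standard
normalised <- sweep(mdata, 2, intstd, "/")

R <- cbind(R[, 1:2], normalised)  # recombine IDs with data
print(R)
```

Selecting the standards by a flag rather than by row number keeps the snippet valid if rows are added or reordered, at the cost of requiring such a column in the input table.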
4.7 Open source software for multivariate analysis
Metabolomics data consist of very large numbers of variables and
relatively few observations. Such data are inherently co-linear, which leads
to the use of chemometric techniques that can handle highly correlated
data by using latent variable methods [34]. These methods [35] include
principal components analysis (PCA), principal components regression
(PCR), projection to latent structures (PLS), PLS discriminant analysis
(PLS-DA), orthogonal PLS (OPLS®) [36, 43, 44], orthogonal PLS
discriminant analysis (OPLS-DA®) [37] and kernel OPLS (K-OPLS) [38].
Once the data have been formatted and normalised, they are commonly
analysed interactively in a commercial multivariate analysis package.
However, the open source world does offer some multivariate
tools, mainly in the R language. There are several chemometrics packages
for R.
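As a simple starting point, PCA can be run with `prcomp` from base R's stats package, without any of the dedicated chemometrics packages. The sketch below uses simulated data (few observations, many variables) purely for illustration; the matrix `X`, its dimensions and the artificial group shift are all assumptions, not data from the text.

```r
set.seed(1)                        # reproducible toy data
n_obs <- 6                         # few observations ...
n_var <- 50                        # ... and many variables
X <- matrix(rnorm(n_obs * n_var), nrow = n_obs)
X[4:6, 1:10] <- X[4:6, 1:10] + 3   # shift ten variables in a second group
rownames(X) <- paste0("sample", 1:n_obs)

# Mean-centre the variables and compute the principal components
pca <- prcomp(X, center = TRUE, scale. = FALSE)

scores <- pca$x[, 1:2]                          # scores on the first two components
var_explained <- pca$sdev^2 / sum(pca$sdev^2)   # fraction of variance per component

print(round(var_explained[1:2], 3))
print(scores)
```

Plotting `scores` (for example with `plot(scores)`) gives the usual scores plot, in which well-normalised replicate or pooled samples would be expected to lie close together.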
 