A typical chemometric problem is to define a relationship between a
property of interest, which may be difficult to measure or estimate, and
other properties that are easily obtained and that affect the property of
interest (Otto, 1998). To obtain this kind of relationship, sets of
experiments are usually designed to cover the space of the property or
process being analyzed. The next step is to build and validate a model
using multivariate regression or multivariate classification methods,
depending on the purpose of the model (Lopes et al., 2004). As with most
empirical modeling techniques, chemometric models need to be fed with
large amounts of good data. The use of multiple response variables to
build models can result in the temptation to overfit models and thus obtain
artificially optimistic results (Miller, 2005). Chemometrics includes several
topics, such as design of experiments (DoE) and information extraction
methods (modeling, classification, and test of assumptions) (Roggo et al.,
2007). Many reviews and textbooks on chemometrics are available
(Lavine, 2000; Otto, 1998; Brereton, 2003; Massart et al., 2003).
Conventional regression methods include multiple linear regression
(MLR), principal component regression (PCR), and partial least squares
(PLS) (Martens and Naes, 1996; Martens and Martens, 2001).
Classification methods include linear discriminant analysis, principal
component analysis (PCA), factor analysis (FA), and cluster analysis
(CA) (Jolliffe, 1986). Non-linear techniques, such as neural networks and
other artificial intelligence methods, are also used for this purpose.
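As a rough illustration of how two of the regression methods named above differ, the following minimal sketch fits MLR and PCR to the same data using NumPy. The data are synthetic and invented for this example (three nearly collinear "easily obtained" measurements driven by one hidden factor), and the function names `fit_mlr`, `fit_pcr`, `predict_mlr`, and `predict_pcr` are hypothetical helpers, not part of any chemometrics package.

```python
import numpy as np

def fit_mlr(X, y):
    """Multiple linear regression: least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the slopes

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def fit_pcr(X, y, n_components):
    """Principal component regression: regress y on the leading PCA scores of X."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    # Rows of Vt are the principal directions (loadings) of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T      # shape (n_variables, n_components)
    scores = Xc @ loadings              # shape (n_samples, n_components)
    coef = fit_mlr(scores, y)
    return x_mean, loadings, coef

def predict_pcr(model, X):
    x_mean, loadings, coef = model
    return predict_mlr(coef, (X - x_mean) @ loadings)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic, made-up data (an assumption for illustration only).
rng = np.random.default_rng(0)
latent = rng.normal(size=40)
X = np.column_stack([latent + 0.01 * rng.normal(size=40) for _ in range(3)])
y = 2.0 * latent + 0.1 * rng.normal(size=40)

mlr_coef = fit_mlr(X, y)
pcr_model = fit_pcr(X, y, n_components=1)

rmse_mlr = rmse(y, predict_mlr(mlr_coef, X))
rmse_pcr = rmse(y, predict_pcr(pcr_model, X))
print(f"training RMSE  MLR: {rmse_mlr:.3f}  PCR (1 component): {rmse_pcr:.3f}")
```

On the training data MLR can never fit worse than PCR, because PCR restricts the regression to a subspace of the same variables. The errors printed here are training errors only, so they say nothing about which model predicts better on new samples; that question has to be answered with data that were not used for fitting.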
4.2.1 Classification methods
Classification methods are usually connected with qualitative analysis
(e.g. classification of samples according to their spectra). Classification
can be unsupervised or supervised. Unsupervised classification of the
data is performed with no a priori knowledge of their properties. Data
are grouped into clusters, which then need to be explained. In supervised
classification, a model is first developed using a set of data with known
categories and then validated by comparing its classification predictions
with the true categories of a data subset that was previously omitted.
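The supervised workflow described above (develop a model on data with known categories, then check its predictions against an omitted subset) can be sketched in a few lines of plain Python. The classifier used here, a nearest-centroid rule, is chosen only because it is short; it is not a method named in the text. The two-category data set is synthetic, and all function and variable names are hypothetical.

```python
import random

def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_nearest_centroid(samples, labels):
    """Build the model: one centroid per known category."""
    by_class = {}
    for x, label in zip(samples, labels):
        by_class.setdefault(label, []).append(x)
    return {label: centroid(rows) for label, rows in by_class.items()}

def predict(model, x):
    """Assign x to the category with the closest centroid (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sq_dist(model[label]))

def make_class(center, label, n=30):
    """Draw n synthetic samples scattered around a category center."""
    return [([random.gauss(m, 0.5) for m in center], label) for _ in range(n)]

# Synthetic, made-up data (an assumption for illustration only).
random.seed(0)
data = make_class([0.0, 0.0, 0.0], "A") + make_class([3.0, 3.0, 3.0], "B")
random.shuffle(data)

# Omit 30% of the samples before building the model.
split = len(data) * 7 // 10
train, held_out = data[:split], data[split:]

model = fit_nearest_centroid([x for x, _ in train], [lab for _, lab in train])

# Validate: compare predictions with the true categories of the omitted subset.
predictions = [predict(model, x) for x, _ in held_out]
truth = [lab for _, lab in held_out]
accuracy = sum(p == t for p, t in zip(predictions, truth)) / len(truth)
print(f"held-out accuracy: {accuracy:.2f}")
```

The key design point is that the held-out samples play no part in computing the centroids, so the reported accuracy is not inflated by the model having already seen them.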
Unsupervised classification methods
One of the basic unsupervised multivariate data treatment methods is
PCA. It is a feature reduction method that is especially useful due to its
data visualization ability. PCA reduces the number of variables in an
 