methods are target projection (TP), orthogonal PLS, etc. (Rajalahti and
Kvalheim, 2011).
Support Vector Machines
Support Vector Machines (SVM) are a group of supervised learning algorithms that can be used for classification or regression purposes. The SVM algorithm is based upon statistical learning theory and the Vapnik-Chervonenkis (VC) dimension (Vapnik and Chervonenkis, 1974). Standard SVM is a binary classifier that separates inputs into two possible outputs (classes). In contrast to the previously described FA methods, where dimensionality reduction enables finding of LVs, SVM algorithms map the samples into a space of higher (even infinite) dimensionality, in which a separating hyperplane is defined. For classification purposes, good separation is achieved when the distance between samples belonging to different classes in this space is large. Samples that were not separable in the original space may then be distinguished in the newly created space (Roggo et al., 2010).
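The effect of such a mapping can be illustrated with a small numerical sketch (the data and variable names below are illustrative assumptions, not taken from the source): points inside and outside a circle cannot be separated by a straight line in two dimensions, but adding a third coordinate equal to the squared distance from the origin makes the two groups separable by a flat plane.

```python
# Minimal sketch: a 2-D problem that is not linearly separable becomes
# separable after an explicit mapping into a higher-dimensional space.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))             # original 2-D samples
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)   # class: outside/inside a circle

# Explicit mapping (x1, x2) -> (x1, x2, x1^2 + x2^2)
Z = np.column_stack([X, X[:, 0] ** 2 + X[:, 1] ** 2])

# In the mapped space the two classes lie on opposite sides of the plane z = 0.5
separable = np.all((Z[:, 2] > 0.5) == (y == 1))
print("Linearly separable in the mapped space:", separable)
```

In practice the SVM does not construct this mapping explicitly; the kernel function described next provides the inner products in the higher-dimensional space directly.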
Construction of the higher-dimensional space by SVM is based upon the definition of a kernel function K(x, y), which is applied to the data in the original space (Press et al., 2007). Commonly used kernel functions are linear, polynomial, radial basis function (RBF), and sigmoidal, where the latter makes the SVM algorithm equivalent to a two-layer perceptron neural network (Section 5.1.2.1). RBF is the most often used kernel function, since it can handle cases where the relation between the class labels (the target values) and the attributes (the features of the training set) is nonlinear:
K(x_i, x_j) = exp(−γ ||x_i − x_j||²)   [4.18]
with γ being a parameter that controls the width of the kernel function, and x_i and x_j being the vectors of the i-th and the j-th training samples, respectively.
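The following minimal sketch evaluates Eq. [4.18] for a pair of sample vectors; the sample values and the choices of γ are illustrative assumptions, not taken from the source. It also shows how γ controls the kernel width: larger values make the similarity decay faster with distance.

```python
# Sketch of the RBF kernel of Eq. [4.18] for two sample vectors.
import numpy as np

def rbf_kernel(x_i, x_j, gamma=1.0):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    diff = x_i - x_j
    return np.exp(-gamma * np.dot(diff, diff))

x_i = np.array([1.0, 2.0, 3.0])
x_j = np.array([1.5, 1.8, 2.9])

# Larger gamma -> narrower kernel -> similarity drops off faster with distance
for gamma in (0.1, 1.0, 10.0):
    print(gamma, rbf_kernel(x_i, x_j, gamma))
```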
SVMs are similar to neural networks, with the main difference being the way in which the weights are adjusted during training. In SVMs, weights are adjusted by solving a quadratic programming problem with linear constraints. Independent (predictor) variables are denoted as attributes, whereas a transformed attribute that is used to define the hyperplane is called a feature. The task of choosing the most suitable representation is known as feature selection. A set of features that describes one sample (i.e. a row of independent, predictor values) is
called a vector. Therefore, the goal of the SVM algorithm is to find the optimal hyperplane that separates clusters of vectors in such a way that cases with one category of the target variable are on one side of the plane and cases with the other category are on the other side of the plane.
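As a rough illustration of this idea, the sketch below fits a binary RBF-kernel SVM on synthetic data using scikit-learn; the library, the synthetic data, and the parameter values (gamma, C) are assumptions chosen for demonstration and are not described in the source text.

```python
# Hedged sketch: binary SVM classification with an RBF kernel (scikit-learn).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Each row is a feature vector (one sample); y holds the two class labels.
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 4)),
               rng.normal(2.0, 1.0, size=(50, 4))])
y = np.array([0] * 50 + [1] * 50)

# gamma controls the width of the RBF kernel (Eq. [4.18]); C penalises
# training errors when no perfectly separating hyperplane exists.
model = SVC(kernel="rbf", gamma=0.5, C=1.0)
model.fit(X, y)

print("Support vectors per class:", model.n_support_)
print("Training accuracy:", model.score(X, y))
```

In practice, γ and the penalty parameter C are usually tuned by cross-validation, since together they determine how flexible the resulting decision boundary is.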