MLR is used to estimate the regression vector b:

b = (X^T X)^{-1} X^T y    [4.9]
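As a concrete illustration of equation [4.9], the sketch below computes b for a small synthetic data set with NumPy; the data, the variable names, and the comparison against a least-squares solver are assumptions added here, not part of the original text.

```python
import numpy as np

# Synthetic example: 10 samples, 3 x-variables (illustrative data only)
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=10)

# Equation [4.9]: b = (X^T X)^-1 X^T y
b = np.linalg.inv(X.T @ X) @ X.T @ y

# In practice a least-squares solver (or pseudo-inverse) is numerically safer
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b)
print(b_lstsq)
```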
If all x-variables are controlled, then discrete levels of each x-variable can be selected so as to enforce orthogonality between them and their derived interaction and squared terms. The matrix X^T X then becomes a diagonal matrix and b is easily calculated. When the x-variables are not controlled, or when the number of x-variables exceeds the number of experiments, collinearity arises between the x-variables. The reader is advised to compare this with the data analysis techniques described in Chapter 3 on DoE (Section 3.2.4).
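To make the orthogonality argument tangible, the following sketch (an illustration on assumed data, not taken from the source) builds a two-level full factorial design in coded units, appends the interaction column, and shows that X^T X is diagonal, in which case equation [4.9] reduces to element-wise division.

```python
import numpy as np
from itertools import product

# Two-level full factorial design in coded units (-1, +1) for two factors
levels = np.array(list(product([-1, 1], repeat=2)), dtype=float)
x1, x2 = levels[:, 0], levels[:, 1]

# Model matrix: intercept, main effects, and the x1*x2 interaction
# (squared terms would need a design with more than two levels)
X = np.column_stack([np.ones(len(levels)), x1, x2, x1 * x2])

print(X.T @ X)   # diagonal: all cross-products between distinct columns are zero

# With a diagonal X^T X, equation [4.9] becomes element-wise division
y = np.array([3.0, 5.0, 4.0, 8.0])   # illustrative responses
b = (X.T @ y) / np.diag(X.T @ X)
print(b)
```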
Developed models are usually estimated by least squares, whereby the sum of squared differences between the actual and model-predicted values for each sample in the data set is minimized:

e_i = y_i - \hat{y}_i    [4.10]

where the residual error e_i is the difference between the observed value y_i and the predicted value \hat{y}_i. The regression equation is estimated such that the total sum of squares (SST) can be partitioned into components due to regression (SSR) and residuals (SSE):
SST = \sum_i (y_i - \bar{y})^2    [4.11]

SSR = \sum_i (\hat{y}_i - \bar{y})^2    [4.12]

SSE = \sum_i (y_i - \hat{y}_i)^2    [4.13]

SST = SSR + SSE    [4.14]

where \bar{y} is the mean of the observed y-values.
The explanatory power of the regression is summarized by the coefficient of determination R^2, calculated from the sum of squares terms:

R^2 = SSR / SST = 1 - SSE / SST    [4.15]
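The following sketch gives a short numeric check of equations [4.10] to [4.15]; the synthetic data and the intercept-containing least-squares fit are assumptions introduced here for illustration.

```python
import numpy as np

# Illustrative data and an intercept-containing least-squares fit
rng = np.random.default_rng(1)
x = rng.normal(size=(20, 2))
X = np.column_stack([np.ones(len(x)), x])            # intercept + 2 x-variables
y = X @ np.array([0.8, 1.5, -0.7]) + rng.normal(scale=0.3, size=20)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

e = y - y_hat                            # residuals, equation [4.10]
sst = np.sum((y - y.mean()) ** 2)        # total sum of squares       [4.11]
ssr = np.sum((y_hat - y.mean()) ** 2)    # regression sum of squares  [4.12]
sse = np.sum(e ** 2)                     # residual sum of squares    [4.13]

print(np.isclose(sst, ssr + sse))        # partition SST = SSR + SSE  [4.14]
print(ssr / sst, 1.0 - sse / sst)        # coefficient of determination R^2 [4.15]
```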
The inclusion of variables in a model depends on their predictive ability. Three modes of variable selection are forward, backward, and stepwise selection (a forward-selection sketch follows below). When a variable's correlation reaches a certain threshold, it is retained in the model (Martens and Naes, 1996). The forward stepwise method
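Below is a minimal sketch of one plausible forward-selection loop of the kind referred to above; the R^2-gain stopping rule, its threshold value, and the synthetic data are assumptions and do not come from the original text.

```python
import numpy as np

def forward_selection(X, y, min_r2_gain=0.01):
    """Greedy forward selection: at each step add the x-variable that most
    improves R^2; stop when the best gain falls below min_r2_gain
    (an illustrative stopping rule, not the book's)."""
    n, p = X.shape
    selected, r2_current = [], 0.0
    while len(selected) < p:
        best_gain, best_j = 0.0, None
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            Xj = np.column_stack([np.ones(n), X[:, cols]])   # intercept + chosen columns
            b, *_ = np.linalg.lstsq(Xj, y, rcond=None)
            resid = y - Xj @ b
            r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
            if r2 - r2_current > best_gain:
                best_gain, best_j = r2 - r2_current, j
        if best_j is None or best_gain < min_r2_gain:
            break
        selected.append(best_j)
        r2_current += best_gain
    return selected

# Illustrative use on synthetic data where only the first two variables matter
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=50)
print(forward_selection(X, y))   # expected to pick columns 0 and 1
```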
 