1.3 Partial Least Squares Regression Method and Data Processing
Partial least squares (PLS) is a comparatively recent multivariate statistical
regression method, developed by Herman Wold in 1966 (Li 2006). Compared with other
regression methods (such as PCA regression), PLS has several advantages, especially
in handling mutual influence among variables (multicollinearity). PLS has already
been used to analyze material compositions from laboratory and remote sensing
spectral datasets. Li (2006, 2008) resampled LSCC bidirectional reflectance data to
the spectral resolution of the airborne visible/infrared imaging spectrometer
(AVIRIS) and derived several composition models, for example for iron and TiO2,
with the PLS regression method (Li 2006, 2008, 2011). Li's model was based on
laboratory data and was not applied to remotely sensed data, which makes it
difficult to evaluate how well the model suppresses maturity effects.
As an advanced statistical method, the principle of PLS analysis can be
summarized as PLS = PCA + CCA + MLR (PCA, principal component analysis; CCA,
canonical correlation analysis; MLR, multiple linear regression). The key to PLS
modeling is to determine the number of latent variables (LVs), which are also
called components. The covariance between each pair of corresponding components of
the independent and dependent variables should be maximized; this criterion can be
regarded as a combination of the LV search conditions of PCA and CCA.
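The covariance criterion can be written out explicitly. The following is a sketch in common textbook notation; the weight vectors w and c and the scores t and u for a single component are symbols assumed here for illustration, and X and Y are the standardized predictor and response matrices. For the first component, PLS looks for unit-length weights that maximize the covariance of the two scores:

$$
\max_{\|w\| = \|c\| = 1} \operatorname{cov}(t, u), \qquad t = Xw, \quad u = Yc .
$$

Because

$$
\operatorname{cov}(t, u) = \sqrt{\operatorname{var}(t)}\;\operatorname{corr}(t, u)\;\sqrt{\operatorname{var}(u)},
$$

maximizing the covariance rewards both high-variance scores (the PCA condition) and highly correlated scores (the CCA condition), which is the sense in which PLS combines the two.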
Assume the independent variables form an n × m matrix X and the dependent variables
form an n × p matrix Y. We first standardize X and Y before modeling in order to
obtain a more stable result. Regressing Y on X according to the PLS rules then
yields the relations listed below (Eqs. 1.1 and 1.2). Both X and Y are decomposed
into two parts: a matrix product term and a residual term. The matrix product term
consists of a score matrix and a loading matrix. The score matrices are T for X and
U for Y, and both are n × a matrices, where a is the number of LVs. The loading
matrices are P for X, an m × a matrix, and Q for Y, a p × a matrix. E and F are
residual matrices. The goal of the regression is to find the relation between X and
Y (Eq. 1.3) while keeping the residual matrices E and F as small as possible:
$$
X = TP^{T} + E = \sum_{a} t_{a} p_{a}^{T} + E \tag{1.1}
$$

$$
Y = UQ^{T} + F = \sum_{a} u_{a} q_{a}^{T} + F \tag{1.2}
$$

$$
Y = XB + F \tag{1.3}
$$
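To make the roles of T, U, P, Q, B, E and F concrete, the decomposition can be sketched with a NIPALS-style PLS iteration in NumPy. This is an illustrative sketch, not the procedure used in the text: the function names `standardize` and `pls_nipals`, the weight matrix `W`, the synthetic data, and the choice of three components are all assumptions. This variant deflates Y with the X scores, so in the code the Y decomposition reads Y = T Qᵀ + F, with U holding the Y scores that are paired with T.

```python
import numpy as np

def standardize(M):
    """Center each column and scale it to unit sample standard deviation."""
    return (M - M.mean(axis=0)) / M.std(axis=0, ddof=1)

def pls_nipals(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS-style PLS on already standardized X (n x m) and Y (n x p).

    Returns scores T, U, loadings P, Q, coefficients B, and residuals E, F.
    """
    X = np.array(X, dtype=float)  # working copies, deflated in place
    Y = np.array(Y, dtype=float)
    n, m = X.shape
    p = Y.shape[1]
    a = n_components
    T, U = np.zeros((n, a)), np.zeros((n, a))
    P, W, Q = np.zeros((m, a)), np.zeros((m, a)), np.zeros((p, a))

    for k in range(a):
        u = Y[:, [0]]                       # initial guess for the Y score
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)          # unit-length X weight vector
            t = X @ w                       # X score
            q = Y.T @ t / (t * t).sum()     # Y loading
            u_new = Y @ q / (q * q).sum()   # updated Y score
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p_k = X.T @ t / (t * t).sum()       # X loading
        X -= t @ p_k.T                      # deflate X; ends up as residual E
        Y -= t @ q.T                        # deflate Y; ends up as residual F
        T[:, k], U[:, k] = t[:, 0], u[:, 0]
        P[:, k], W[:, k], Q[:, k] = p_k[:, 0], w[:, 0], q[:, 0]

    # Regression coefficients such that Y is approximated by X @ B
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return T, U, P, Q, B, X, Y

# Synthetic stand-in data (assumption): 30 samples, 8 predictors, 1 response
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(30, 8))
y_raw = X_raw @ rng.normal(size=(8, 1)) + 0.1 * rng.normal(size=(30, 1))

Xs, Ys = standardize(X_raw), standardize(y_raw)
T, U, P, Q, B, E, F = pls_nipals(Xs, Ys, n_components=3)

print(np.allclose(Xs, T @ P.T + E))  # checks the form of Eq. 1.1
print(np.allclose(Ys, Xs @ B + F))   # checks the form of Eq. 1.3
```

Both checks compare the standardized data with the reconstructed product term plus residual, so they should hold up to rounding error for any number of components; what changes with the number of LVs is how small E and F become.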
To obtain a better relationship between the spectral data and iron content, we
first convert the reflectance spectra to effective absorbance spectra. Assuming zero
transmittance, absorbance can be approximated by the logarithm of reflectance,
following Beer's law (Eq. 1.4), where R denotes reflectance and Ǜ is absorbance. The derived absorbance