bold lower-case letters indicate vectors. Any column vector x : p × 1 when presented as
a row vector will be denoted by x′ : 1 × p. The following symbols are used extensively
throughout the text:
n
number of samples
p
number of variables
K
number of groups or classes into which the samples are divided
m
min(p, K − 1)
X : n × p
a data matrix with n samples measured on p variables. Unless stated
otherwise, the matrix X is assumed to be centred to have column
means equal to zero.
G
an indicator matrix, usually with n rows, where each row consists of
zeros except for a one in the column associated with that
particular sample
N
diagonal matrix of the group sizes, N = G′G
n
diag( N )
X̄ : K × p
matrix of group means, X̄ = N⁻¹G′X
I
identity matrix, size determined by context
J : p × p
$$J = \begin{pmatrix} I_r & 0 : r \times (p - r) \\ 0 : (p - r) \times r & 0 : (p - r) \times (p - r) \end{pmatrix}$$
1
column vector of ones, size determined by context
d_ij
the distance between sample i and sample j
δ_ij
the fitted distance between sample i and sample j
D : n × n
a matrix derived from the pairwise distances of all n samples with
ij-th element −½ d_ij². The latter quantities are termed ddistances.
diag ( A : p × p )
the p × p diagonal matrix formed by replacing all the off-diagonal
elements of A with zeros; or, depending on the context, the
p -vector consisting of the diagonal elements of A
diag( a )
a diagonal matrix with the elements of the vector a on the diagonal
R
diagonal matrix of row totals
C
diagonal matrix of column totals
E
R11′C / n
||A||²
tr(AA′)
A B
elementwise multiplication
A/B
elementwise division
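As a rough illustration of the notation above, the following NumPy sketch builds G, N and the matrix of group means for a small made-up data set. The data, the variable names and the reading N = G′G (the product G′G is diagonal with the group sizes on its diagonal) are assumptions made for this example and are not taken from the text.

```python
import numpy as np

# Hypothetical toy data: n = 5 samples, p = 2 variables, K = 2 groups.
X_raw = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0],
                  [7.0, 8.0],
                  [9.0, 0.0]])
groups = np.array([0, 0, 1, 1, 1])      # group label of each sample

n, p = X_raw.shape
K = int(groups.max()) + 1
m = min(p, K - 1)

# Centre X so that every column mean is zero.
X = X_raw - X_raw.mean(axis=0)

# Indicator matrix G : n x K, a single one per row in the sample's group column.
G = np.zeros((n, K))
G[np.arange(n), groups] = 1.0

# G'G is diagonal with the group sizes on its diagonal (assumed reading of N).
N = G.T @ G

# Matrix of group means, Xbar = N^{-1} G'X, of size K x p.
Xbar = np.linalg.solve(N, G.T @ X)

# Squared norm ||A||^2 = tr(AA'), here evaluated for A = X.
norm_sq = np.trace(X @ X.T)

# Elementwise multiplication and division of two matrices of equal size.
prod = X_raw * X_raw
ratio = X_raw / (X_raw + 1.0)
```

Solving the linear system with N avoids forming N⁻¹ explicitly; since N is diagonal, dividing each row of G′X by the corresponding group size would give the same result.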
The notion of distance is discussed in Chapter 5. Here we mention two concepts
which the reader will need throughout the topic. Pythagorean distance is the ordinary
Euclidean distance between two samples x_i and x_j with
$$d_{ij}^2 = \sum_{k=1}^{p} (x_{ik} - x_{jk})^2 .$$
Any distance metric that can be embedded in a Euclidean space is termed Euclidean
embeddable.
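The two concepts can be sketched numerically. The function names, the tolerance and the example data below are hypothetical. The embeddability check relies on the standard classical-scaling criterion, which is an assumption brought in for this example rather than something stated in the text here: a set of distances is Euclidean embeddable exactly when the doubly centred matrix of the values −½ d_ij² has no negative eigenvalues.

```python
import numpy as np

def squared_pythagorean(X):
    """Return the n x n matrix of d_ij^2 = sum_k (x_ik - x_jk)^2."""
    diff = X[:, None, :] - X[None, :, :]    # shape (n, n, p)
    return (diff ** 2).sum(axis=2)

def is_euclidean_embeddable(d2, tol=1e-9):
    """Check embeddability from a matrix of squared distances d2."""
    n = d2.shape[0]
    D = -0.5 * d2                           # matrix of ddistances
    H = np.eye(n) - np.ones((n, n)) / n     # centring matrix I - 11'/n
    B = H @ D @ H                           # doubly centred matrix
    eig = np.linalg.eigvalsh((B + B.T) / 2) # symmetrise against rounding
    scale = max(1.0, float(np.abs(eig).max()))
    return bool(eig.min() >= -tol * scale)

# Hypothetical samples: three points in the plane.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])
d2 = squared_pythagorean(X)
embeddable = is_euclidean_embeddable(d2)

# Distances that violate the triangle inequality (1 + 1 < 5).
bad = np.array([[0.0, 1.0, 5.0],
                [1.0, 0.0, 1.0],
                [5.0, 1.0, 0.0]])
bad_embeddable = is_euclidean_embeddable(bad ** 2)
```

The tolerance guards against tiny negative eigenvalues caused by floating-point rounding; its value is a judgment call and may need adjusting for data on a very different scale.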