The techniques in this chapter are developed for random vectors using ensemble
averages. This assumes that the necessary second-order information is available, namely,
the correlation matrices of x and y and their cross-correlation matrix are known. However,
it is straightforward to apply these techniques to sample data, using sample correlation
matrices. Then $M$ independent snapshots of $(x, y)$ would be assembled into matrices
$X = [x_1, \ldots, x_M]$ and $Y = [y_1, \ldots, y_M]$, and the correlation matrices would be
estimated as $S_{xx} = M^{-1} X X^H$, $S_{xy} = M^{-1} X Y^H$, and $S_{yy} = M^{-1} Y Y^H$.
The structure of this chapter is as follows. In Section 4.1, we look at the foundations
for measuring multivariate association between a pair of complex vectors, which
will lead to the introduction of the three correlation analysis techniques CCA, MLR,
and PLS. In Section 4.2, we discuss their invariance properties. In particular, we show
that the diagonal cross-correlations $\{k_i\}$ produced by CCA, MLR, and PLS are maximal
invariants under linear/linear, unitary/linear, and unitary/unitary transformation,
respectively, of x and y. In Section 4.3, we introduce a few scalar-valued correlation
coefficients as different functions of the diagonal cross-correlations $\{k_i\}$, and show how
these coefficients can be interpreted.
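The sample estimates mentioned earlier in this overview (correlation matrices formed from $M$ snapshots) can be illustrated with a minimal sketch. It uses only the Python standard library and stores each matrix as a list of rows. The helper names `scaled_product`, `sample_correlations`, and `cgauss`, as well as the synthetic data, are assumptions made for illustration and do not come from the text.

```python
import random

def scaled_product(A, B):
    """Return M^{-1} A B^H, where A (n x M) and B (m x M) are lists of rows
    and column k of each matrix holds snapshot k."""
    M = len(A[0])
    return [[sum(a[k] * b[k].conjugate() for k in range(M)) / M for b in B]
            for a in A]

def sample_correlations(X, Y):
    """Estimate S_xx, S_xy, and S_yy from the snapshot matrices X and Y."""
    return scaled_product(X, X), scaled_product(X, Y), scaled_product(Y, Y)

def cgauss():
    """Hypothetical helper: one complex sample with independent Gaussian parts."""
    return complex(random.gauss(0, 1), random.gauss(0, 1))

# Synthetic example (an assumption): x is 2-dimensional, y is 1-dimensional,
# and y depends only on the first component of x plus noise.
random.seed(0)
M = 500
X = [[cgauss() for _ in range(M)] for _ in range(2)]
Y = [[X[0][k] + 0.5 * cgauss() for k in range(M)]]

S_xx, S_xy, S_yy = sample_correlations(X, Y)
```

In this synthetic setup, the entry of `S_xy` that pairs the first component of x with y should dominate the one that pairs the second component with y, because y was constructed from the first component only.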
An important feature of CCA, MLR, and PLS is that they all produce diagonal
cross-correlations that have maximum spread in the sense of majorization (see
Appendix 3 for background on majorization). Therefore, any correlation coefficient that
is an increasing and Schur-convex function of $\{k_i\}$ is maximized, for arbitrary rank
$r$. This allows assessment of correlation in a lower-dimensional subspace of dimension
$r$. In Section 4.4, we introduce the correlation spread as a measure that indicates
how much of the overall correlation can be compressed into a lower-dimensional
subspace.
Finally, in Section 4.5, we present several generalized likelihood-ratio tests for the
correlation structure of complex Gaussian data, such as sphericity, independence within
one data set, and independence between two data sets. All these tests have natural
invariance properties, and the generalized likelihood ratio is a function of an appropriate
maximal invariant.
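The majorization property mentioned in this overview can be made concrete with a small sketch. It assumes the standard definition of weak majorization (partial sums of the entries sorted in decreasing order) and uses the sum of the $r$ largest squared entries as one example of an increasing, Schur-convex function of nonnegative values. The function names and the two example vectors are hypothetical and are not taken from the text.

```python
from itertools import accumulate

def weakly_majorizes(a, b, tol=1e-12):
    """True if a weakly majorizes b: each partial sum of the entries of a,
    sorted in decreasing order, is at least the matching partial sum for b."""
    a_sums = accumulate(sorted(a, reverse=True))
    b_sums = accumulate(sorted(b, reverse=True))
    return all(sa >= sb - tol for sa, sb in zip(a_sums, b_sums))

def rank_r_sum_of_squares(k, r):
    """Sum of the r largest squared entries of k. For nonnegative entries this
    is one example of an increasing, Schur-convex function."""
    return sum(x * x for x in sorted(k, reverse=True)[:r])

# Hypothetical cross-correlations with the same total but different spread.
spread = [0.9, 0.5, 0.1]
even = [0.5, 0.5, 0.5]

spread_dominates = weakly_majorizes(spread, even)
even_dominates = weakly_majorizes(even, spread)
```

Here `spread` weakly majorizes `even` but not the other way round, and `rank_r_sum_of_squares(spread, r)` is at least `rank_r_sum_of_squares(even, r)` for every rank `r` from 1 to 3.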
4.1 Foundations for measuring multivariate association between two complex random vectors
The correlation coefficient between two scalar real zero-mean random variables $u$ and $v$
is defined as
$$\rho_{uv} = \frac{E[uv]}{\sqrt{E[u^2]\,E[v^2]}} = \frac{R_{uv}}{\sqrt{R_{uu} R_{vv}}}. \qquad (4.1)$$
The correlation coefficient is a convenient measure for how closely $u$ and $v$ are related.
It satisfies $-1 \le \rho_{uv} \le 1$. If $\rho_{uv} = 0$, then $u$ and $v$ are uncorrelated. If
$|\rho_{uv}| = 1$, then $u$ is a linear function of $v$, or vice versa, with probability 1:
$$u = \frac{R_{uv}}{R_{vv}}\, v. \qquad (4.2)$$
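As an illustration of (4.1) and (4.2), the following sketch replaces the expectations with sample averages over synthetic data. The data are treated as zero-mean, as in the definition, so no sample mean is subtracted. The function names, the random seed, and the test signals are assumptions made for this example.

```python
import math
import random

def correlation_coefficient(u, v):
    """Sample version of (4.1) for real zero-mean data: R_uv / sqrt(R_uu R_vv)."""
    n = len(u)
    R_uv = sum(a * b for a, b in zip(u, v)) / n
    R_uu = sum(a * a for a in u) / n
    R_vv = sum(b * b for b in v) / n
    return R_uv / math.sqrt(R_uu * R_vv)

def linear_gain(u, v):
    """Sample version of the factor R_uv / R_vv that appears in (4.2)."""
    R_uv = sum(a * b for a, b in zip(u, v))
    R_vv = sum(b * b for b in v)
    return R_uv / R_vv

random.seed(1)
v = [random.gauss(0, 1) for _ in range(1000)]
u_linear = [-3.0 * b for b in v]                  # u is exactly linear in v
u_noisy = [b + random.gauss(0, 1) for b in v]     # u is only partly related to v

rho_linear = correlation_coefficient(u_linear, v)
rho_noisy = correlation_coefficient(u_noisy, v)
gain = linear_gain(u_linear, v)
```

For the exactly linear case, `rho_linear` has magnitude 1 up to rounding and `gain` recovers the factor -3 used to build `u_linear`. For the noisy case, `rho_noisy` falls strictly between 0 and 1.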
 