Correlation analysis - Statistical Signal Processing of Complex-Valued Data


The techniques in this chapter are developed for random vectors using ensemble averages. This assumes that the necessary second-order information is available, namely, the correlation matrices of $x$ and $y$ and their cross-correlation matrix are known. However, it is straightforward to apply these techniques to sample data, using sample correlation matrices. Then $M$ independent snapshots of $(x, y)$ would be assembled into matrices $X = [x_1, \ldots, x_M]$ and $Y = [y_1, \ldots, y_M]$, and the correlation matrices would be estimated as $S_{xx} = M^{-1} X X^H$, $S_{xy} = M^{-1} X Y^H$, and $S_{yy} = M^{-1} Y Y^H$.

The structure of this chapter is as follows. In Section 4.1, we look at the foundations for measuring multivariate association between a pair of complex vectors, which will lead to the introduction of the three correlation analysis techniques CCA, MLR, and PLS. In Section 4.2, we discuss their invariance properties. In particular, we show that the diagonal cross-correlations $\{k_i\}$ produced by CCA, MLR, and PLS are maximal invariants under linear/linear, unitary/linear, and unitary/unitary transformation, respectively, of $x$ and $y$. In Section 4.3, we introduce a few scalar-valued correlation coefficients as different functions of the diagonal cross-correlations $\{k_i\}$, and show how these coefficients can be interpreted.
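
The sample estimates mentioned at the start of the chapter can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the book: the function name `sample_correlation_matrices`, the dimensions, and the synthetic complex Gaussian snapshots are all assumptions made for the example. Each column of `X` and `Y` holds one snapshot, and the Hermitian transpose is written as `.conj().T`.

```python
import numpy as np

def sample_correlation_matrices(X, Y):
    """Estimate S_xx, S_xy, S_yy from snapshot matrices.

    X is n x M and Y is m x M; column i holds the i-th snapshot.
    (Hypothetical helper name, chosen for this illustration.)
    """
    X = np.asarray(X)
    Y = np.asarray(Y)
    M = X.shape[1]
    if Y.shape[1] != M:
        raise ValueError("X and Y must contain the same number of snapshots")
    S_xx = X @ X.conj().T / M  # M^{-1} X X^H
    S_xy = X @ Y.conj().T / M  # M^{-1} X Y^H
    S_yy = Y @ Y.conj().T / M  # M^{-1} Y Y^H
    return S_xx, S_xy, S_yy

# Synthetic data, assumed only for demonstration.
rng = np.random.default_rng(0)
n, m, M = 3, 2, 1000
X = rng.standard_normal((n, M)) + 1j * rng.standard_normal((n, M))
Y = rng.standard_normal((m, M)) + 1j * rng.standard_normal((m, M))

S_xx, S_xy, S_yy = sample_correlation_matrices(X, Y)
print(S_xx.shape, S_xy.shape, S_yy.shape)
```

By construction `S_xx` and `S_yy` are Hermitian, while `S_xy` is in general a rectangular matrix of size n x m.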

An important feature of CCA, MLR, and PLS is that they all produce diagonal cross-correlations that have maximum spread in the sense of majorization (see Appendix 3 for background on majorization). Therefore, any correlation coefficient that is an increasing and Schur-convex function of $\{k_i\}$ is maximized, for arbitrary rank $r$. This allows assessment of correlation in a lower-dimensional subspace of dimension $r$. In Section 4.4, we introduce the correlation spread as a measure that indicates how much of the overall correlation can be compressed into a lower-dimensional subspace.
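
As a rough illustration of the majorization idea, the sketch below checks whether one nonnegative vector majorizes another by comparing partial sums of their entries sorted in decreasing order, and then evaluates a simple Schur-convex function (the sum of squares) on both. The helper names `majorizes` and `sum_of_squares` and the two example vectors are assumptions for this example. The vectors are not the output of CCA, MLR, or PLS, and the precise definitions used in the book are those of its Appendix 3.

```python
import numpy as np

def majorizes(a, b, tol=1e-12):
    """Return True if a majorizes b.

    Both vectors are sorted in decreasing order; every partial sum of a
    must be at least the matching partial sum of b, and the totals must
    agree. (Hypothetical helper, written for this illustration.)
    """
    a_sorted = np.sort(np.asarray(a, dtype=float))[::-1]
    b_sorted = np.sort(np.asarray(b, dtype=float))[::-1]
    if a_sorted.shape != b_sorted.shape:
        raise ValueError("a and b must have the same length")
    partial_a = np.cumsum(a_sorted)  # entry r-1 sums the r largest values
    partial_b = np.cumsum(b_sorted)
    partial_ok = np.all(partial_a >= partial_b - tol)
    total_ok = abs(partial_a[-1] - partial_b[-1]) <= tol
    return bool(partial_ok and total_ok)

def sum_of_squares(k):
    """A simple Schur-convex function of the entries of k."""
    k = np.asarray(k, dtype=float)
    return float(np.sum(k ** 2))

# Made-up values with the same total but different spread.
k_spread = [0.9, 0.5, 0.1]
k_flat = [0.6, 0.5, 0.4]

print(majorizes(k_spread, k_flat))
print(sum_of_squares(k_spread), sum_of_squares(k_flat))
```

The more spread-out vector majorizes the flatter one, and the Schur-convex function takes the larger value on it. The r-th partial sum is the quantity that matters when only the r largest values are retained.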

Finally, in Section 4.5, we present several generalized likelihood-ratio tests for the correlation structure of complex Gaussian data, such as sphericity, independence within one data set, and independence between two data sets. All these tests have natural invariance properties, and the generalized likelihood ratio is a function of an appropriate maximal invariant.

4.1 Foundations for measuring multivariate association between two complex random vectors

The correlation coefficient between two scalar real zero-mean random variables $u$ and $v$ is defined as

$$\rho_{uv} = \frac{E(uv)}{\sqrt{E u^2}\,\sqrt{E v^2}} = \frac{R_{uv}}{\sqrt{R_{uu}}\,\sqrt{R_{vv}}}. \tag{4.1}$$

The correlation coefficient is a convenient measure for how closely $u$ and $v$ are related. It satisfies $-1 \le \rho_{uv} \le 1$. If $\rho_{uv} = 0$, then $u$ and $v$ are uncorrelated. If $|\rho_{uv}| = 1$, then $u$ is a linear function of $v$, or vice versa, with probability 1:

$$u = \frac{R_{uv}}{R_{vv}}\, v. \tag{4.2}$$
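
A small numerical sketch may help make (4.1) and (4.2) concrete. It replaces the expectations by sample averages over synthetic real zero-mean data. The function name `correlation_coefficient`, the sample size, and the two test signals are assumptions made for this illustration only.

```python
import numpy as np

def correlation_coefficient(u, v):
    """Sample version of (4.1) for real, zero-mean data."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    R_uv = np.mean(u * v)
    R_uu = np.mean(u * u)
    R_vv = np.mean(v * v)
    return R_uv / np.sqrt(R_uu * R_vv)

# Synthetic signals, assumed only for demonstration.
rng = np.random.default_rng(1)
v = rng.standard_normal(10_000)
noise = rng.standard_normal(10_000)

u_linear = -2.5 * v  # exact linear function of v
u_noisy = v + noise  # only partially related to v

rho_linear = correlation_coefficient(u_linear, v)  # -1 up to rounding
rho_noisy = correlation_coefficient(u_noisy, v)    # population value 1/sqrt(2)

# When |rho| = 1, (4.2) recovers u from v.
R_uv = np.mean(u_linear * v)
R_vv = np.mean(v * v)
u_rebuilt = (R_uv / R_vv) * v

print(rho_linear, rho_noisy)
print(np.allclose(u_rebuilt, u_linear))
```

In the noisy case the ratio $R_{uv}/R_{vv}$ still defines a linear estimate of $u$ from $v$, but it no longer reproduces $u$ exactly, which is reflected in $|\rho_{uv}| < 1$.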
