Properties of Differential Entropy - Minimum Error Entropy Classification


Appendix B

Properties of Differential Entropy

B.1 Shannon's Entropy

$$H_S(X) = -\int f(x) \ln f(x)\, dx = -E[\ln f(X)]. \tag{B.1}$$
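As an illustration of (B.1), the following minimal sketch evaluates both forms of the definition for a Gaussian density: the integral (by a midpoint rule) and the expectation (by a sample average). The helper names `gaussian_pdf`, `entropy_by_integration` and `entropy_by_sampling`, the choice of σ = 2, and the integration range are assumptions made for this example only; the reference value uses the standard closed form for the Gaussian, $\frac{1}{2}\ln(2\pi e \sigma^2)$.

```python
import math
import random


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of a univariate Gaussian with mean mu and std. deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def entropy_by_integration(pdf, lo, hi, steps=200_000):
    """Midpoint-rule approximation of -integral f(x) ln f(x) dx over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        fx = pdf(lo + (k + 0.5) * dx)
        if fx > 0.0:  # f ln f tends to 0 as f tends to 0, so zero-density points are skipped
            total -= fx * math.log(fx) * dx
    return total


def entropy_by_sampling(pdf, sampler, n=200_000, seed=0):
    """Sample-average approximation of -E[ln f(X)] with X drawn by `sampler`."""
    rng = random.Random(seed)
    return -sum(math.log(pdf(sampler(rng))) for _ in range(n)) / n


sigma = 2.0  # illustrative value
pdf = lambda x: gaussian_pdf(x, 0.0, sigma)

closed_form = 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)
h_int = entropy_by_integration(pdf, -12.0 * sigma, 12.0 * sigma)
h_mc = entropy_by_sampling(pdf, lambda rng: rng.gauss(0.0, sigma))

print("closed form      :", closed_form)
print("integral form    :", h_int)
print("expectation form :", h_mc)
```

The integral form should agree with the closed form to many decimal places, while the sample average fluctuates around it with an error that shrinks roughly as $1/\sqrt{n}$.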

A list of important properties of Shannon's entropy [48, 184, 62] is:

1. $H_S(X) \in\ ]-\infty, \ln|X|]$, where $|X|$ is the support length of $X$. The equality $H_S(X) = \ln|X|$ holds for a uniform distribution in a bounded support. The minimum value ($-\infty$) corresponds to a sequence of continuous Dirac-$\delta$ functions (Dirac-$\delta$ comb).

2. Invariance to translations: $H_S(X + c) = H_S(X)$ for a constant $c$.

3. Change of scale: $H_S(aX) = H_S(X) + \ln|a|$ for a constant $a$. If $\mathbf{X}$ is a random vector, $H_S(\mathbf{A}\mathbf{X}) = H_S(\mathbf{X}) + \ln|\det \mathbf{A}|$.

4. Conditional entropy: $H_S(X \mid Y) = -E[\ln f(X \mid Y)] \le H_S(X)$, with equality iff $X$ and $Y$ are independent.

5. Sub-additivity for joint distributions: $H_S(X_1, \ldots, X_n) = \sum_{i=1}^{n} H_S(X_i \mid X_1, \ldots, X_{i-1}) \le \sum_{i=1}^{n} H_S(X_i)$, with equality (additivity) only if the r.v.s are independent.

6. Bijective transformation $\mathbf{Y} = \varphi(\mathbf{X})$: $H_S(\mathbf{Y}) = H_S(\mathbf{X}) - E_X[\ln|J_\varphi(\mathbf{Y})|]$, where $J_\varphi(\mathbf{Y}) = \left|\frac{\partial \varphi^{-1}(y_i)}{\partial y_k}\right|$, $i, k = 1, \ldots, d$, is the Jacobian of the transformation. Note that this implies properties 2 and 3, and also the invariance under an orthonormal transformation $\mathbf{Y} = \mathbf{A}\mathbf{X}$, with $|\mathbf{A}| = 1$.

7. If a random vector $\mathbf{X}$ with support in $\mathbb{R}^n$ has covariance matrix $\Sigma$, then $H_S(\mathbf{X}) \le \frac{1}{2}\ln[(2\pi e)^n |\Sigma|]$. The equality holds for the multivariate Gaussian distribution.

8. Let $X_1, \ldots, X_n$ be independent r.v.s with densities and finite variances. Then,

$$e^{2H_S(X_1 + \cdots + X_n)} \ge \sum_{i=1}^{n} e^{2H_S(X_i)}. \tag{B.2}$$
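Some of the properties listed above can be checked numerically on simple densities. The sketch below is only an illustration under stated assumptions: the helper names (`integrate_entropy`, `uniform_pdf`, `triangular_pdf`) and the numeric values (`width = 3.0`, `a = 2.5`) are chosen for this example. It uses uniform densities on $[0, \text{width}]$ for properties 1, 3 and 7, and the triangular density of the sum of two independent Uniform(0, 1) variables for inequality (B.2).

```python
import math


def integrate_entropy(pdf, lo, hi, steps=100_000):
    """Midpoint-rule approximation of -integral f(x) ln f(x) dx over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        fx = pdf(lo + (k + 0.5) * dx)
        if fx > 0.0:
            total -= fx * math.log(fx) * dx
    return total


def uniform_pdf(width):
    """Density of a uniform distribution on [0, width]."""
    return lambda x: 1.0 / width if 0.0 <= x <= width else 0.0


def triangular_pdf(x):
    """Density of X1 + X2 for independent X1, X2 ~ Uniform(0, 1)."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0


width = 3.0  # illustrative support length
a = 2.5      # illustrative scale factor

# Property 1: a uniform density attains the upper bound ln(support length).
h_uniform = integrate_entropy(uniform_pdf(width), 0.0, width)

# Property 3: scaling by a turns Uniform(0, width) into Uniform(0, a * width).
h_scaled = integrate_entropy(uniform_pdf(a * width), 0.0, a * width)

# Property 7 (n = 1): Gaussian bound for the uniform's variance width^2 / 12.
gauss_bound = 0.5 * math.log(2.0 * math.pi * math.e * width ** 2 / 12.0)

# Property 8, inequality (B.2), for two independent Uniform(0, 1) variables.
h_unit = integrate_entropy(uniform_pdf(1.0), 0.0, 1.0)
h_sum = integrate_entropy(triangular_pdf, 0.0, 2.0)
epi_lhs = math.exp(2.0 * h_sum)
epi_rhs = 2.0 * math.exp(2.0 * h_unit)

print("property 1:", h_uniform, "vs ln(width) =", math.log(width))
print("property 3:", h_scaled, "vs H + ln|a| =", h_uniform + math.log(a))
print("property 7:", h_uniform, "<=", gauss_bound)
print("property 8:", epi_lhs, ">=", epi_rhs)
```

Numerical integration is used instead of closed-form entropies so that the same helper applies to any one-dimensional density with bounded support. For Gaussian variables, (B.2) holds with equality, since $e^{2H_S}$ is then proportional to the variance and variances of independent variables add.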
