Digital Signal Processing Reference
In-Depth Information
M
q i
n i !
P M (
n 1 ,
n 2 ,...,
n M /
q 1 ,...,
q M ) =
N
!
with q i
priors
,
i = 1
M
n i
N
n i
=
N
and p i
=
(9.1)
i
=
1
· 2
n n
e n
Stirling formula gives n
!≈
·
π ·
n when n
→+∞
. We could then
observe that it converges to discrete version of Kullback-Leibler:
p i log p i
q i
M
1
N
Lim
log [ P M ]
=
=
K
(
p
,
q
)
(9.2)
N
→+∞
i
=
1
Based on variational approach, Donsker and Varadhan gave variational definition of
Kullback divergence:
Sup E p (φ)
e φ )
K
(
p
,
q
) =
log E q (
(9.3)
Consider:
log p
(ω)
φ(ω) =
q
(ω)
log
ω
log p
log E q (
e φ ) =
(ω)
p
(ω)
E p (φ)
p
(ω)
q
(ω)
q
(ω)
q
(ω)
ω
=
K
(
p
,
q
)
log
(
1
) =
K
(
p
,
q
)
This proves that the supremum over all
φ
is no smaller than the divergence.
E p log e φ
E q (
log q φ (ω)
q
log E q (
e φ ) =
E p (φ)
=
p
(ω)
e φ )
(ω)
ω
) E p (φ)
log E q (
e φ ) =
ω
e φ(ω)
q
(ω)
with q φ (ω) =
e φ(θ)
K
(
p
,
q
p
(ω)
θ
q
(θ)
log p (ω)
q φ (ω)
0 using the divergence inequality.
In the same context, link with “Large Deviation Theory” and Fenchel-Legendre
transform which gives that logarithm of generating function are dual to Kullback
Divergence. This relation is given by:
 
Search WWH ::




Custom Search