Fig. 9.6 SAR-HMM as DBN structure (switching states $s_{t-3}, \ldots, s_t$ and observed samples $v_{t-3}, \ldots, v_t$)
$$v_t = -\sum_{r=1}^{R} c_r(s_t)\, v_{t-r} + \eta(s_t), \qquad \text{with } \eta \sim \mathcal{N}\big(\eta;\, 0,\, \sigma^2(s_t)\big). \tag{9.26}$$
There, $\eta(s_t)$ models variations from pure autoregression rather than an independent additive noise process. The joint probability of a sequence of length $T$ is
$$p(s_{1:T}, v_{1:T}) = p(v_1 \mid s_1)\, p(s_1) \prod_{t=2}^{T} p(v_t \mid v_{t-R:t-1}, s_t)\, p(s_t \mid s_{t-1}). \tag{9.27}$$
Figure 9.6 visualises the SAR-HMM as a DBN structure. Switching between the different AR models is deliberately 'slowed down' by introducing a constant $K$: the model must then remain in a state for an integer multiple of $K$ time steps. This is needed since considerably more sample values usually exist on the sample level than features on the frame level.
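As an illustration, the following minimal Python sketch samples from a SAR-HMM according to Eq. (9.26), including the slowed-down switching via $K$. All parameter values and variable names (coeffs, sigma2, trans) are illustrative assumptions, not parameters of the original model:

import numpy as np

rng = np.random.default_rng(0)

S, R, K, T = 2, 2, 4, 40              # states, AR order, switching constant, length
coeffs = rng.normal(0, 0.3, (S, R))   # c_r(s_t): per-state AR coefficients
sigma2 = np.array([0.1, 0.5])         # sigma^2(s_t): per-state innovation variance
trans = np.array([[0.9, 0.1],         # p(s_t | s_{t-1})
                  [0.1, 0.9]])

v = np.zeros(T + R)                   # R leading zeros serve as initial AR context
s = 0
for t in range(T):
    if t > 0 and t % K == 0:          # the state may only switch every K time steps
        s = rng.choice(S, p=trans[s])
    past = v[t:t + R][::-1]           # [v_{t-1}, v_{t-2}, ..., v_{t-R}]
    v[t + R] = -coeffs[s] @ past + rng.normal(0.0, np.sqrt(sigma2[s]))  # Eq. (9.26)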
The EM algorithm can be used to learn the AR parameters. Based on the forward-backward algorithm (cf. Sect. 7.3.1), the distributions $p(s_t \mid v_{1:T})$ are learnt. The fact that an observation $v_t$ depends on $R$ predecessors makes the backward pass more complicated than in the case of an HMM. A 'correction smoother' [53] can thus be applied, such that the backward pass computes the posterior $p(s_t \mid v_{1:T})$ by 'correcting' the output of the forward pass.
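The recursion below sketches such a forward pass, reusing the toy parameterisation from the previous listing; the emission term conditions on up to $R$ previous samples, and per-step normalisation (a simple stand-in for log-domain arithmetic) yields the filtered posteriors $p(s_t \mid v_{1:t})$, which the correction smoother would subsequently turn into $p(s_t \mid v_{1:T})$. Function and argument names are hypothetical:

import numpy as np
from scipy.stats import norm

def forward(v, coeffs, sigma2, trans, prior):
    # Alpha recursion as for an HMM, except that the emission term
    # p(v_t | v_{t-R:t-1}, s_t) uses the per-state AR prediction (Eqs. 9.26/9.27).
    T = len(v)
    S, R = coeffs.shape
    alpha = np.zeros((T, S))
    for t in range(T):
        past = v[max(t - R, 0):t][::-1]               # available context v_{t-1}, ...
        mean = -coeffs[:, :len(past)] @ past          # per-state AR prediction
        emis = norm.pdf(v[t], mean, np.sqrt(sigma2))  # p(v_t | v_{t-R:t-1}, s_t)
        alpha[t] = emis * (prior if t == 0 else trans.T @ alpha[t - 1])
        alpha[t] /= alpha[t].sum()                    # normalise against underflow
    return alpha                                      # filtered p(s_t | v_{1:t})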
9.3.3.2 Autoregressive Switching Linear Dynamical Systems
With the extension of the SAR-HMM to an AR-SLDS, a noise process can be modelled explicitly [17]. The observed audio sample $v_t$ of interest is then modelled as a noisy version of a hidden clean sample, obtained from the projection of a hidden vector $h_t$ with the dynamic properties of an LDS:
$$h_t = A(s_t)\, h_{t-1} + \eta_t, \qquad \text{with } \eta_t \sim \mathcal{N}\big(\eta_t;\, 0,\, \Sigma_H(s_t)\big). \tag{9.28}$$
The transition matrix $A(s_t)$ describes the dynamics of the hidden variable and depends on the state $s_t$ at time step $t$. A Gaussian distributed hidden 'innovation' variable $\eta_t$ models variations from 'pure' linear state dynamics, just as $\eta(s_t)$ in Eq. (9.26) models variations from pure autoregression.
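For concreteness, the following sketch performs one AR-SLDS step according to Eq. (9.28), using a companion-matrix construction in which $h_t$ stacks the $R$ most recent clean samples and the observation projects out the first component plus noise. This construction and all names are illustrative assumptions, not necessarily the exact formulation of [17]:

import numpy as np

def companion(c):
    # Companion matrix A(s): the first row applies the AR recursion of
    # Eq. (9.26) to the stored clean samples, the remaining rows shift
    # those samples down by one position.
    R = len(c)
    A = np.zeros((R, R))
    A[0] = -c
    A[1:, :-1] = np.eye(R - 1)
    return A

def ar_slds_step(h, s, coeffs, Sigma_H, sigma_v, rng):
    # One time step: hidden linear dynamics (Eq. 9.28) followed by a noisy
    # observation of the clean sample, i.e. the projection of h_t.
    A = companion(coeffs[s])
    h_new = A @ h + rng.multivariate_normal(np.zeros(len(h)), Sigma_H[s])
    v = h_new[0] + rng.normal(0.0, sigma_v)   # observed noisy sample v_t
    return h_new, v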