Digital Signal Processing Reference
In-Depth Information
non-linguistic vocalisations [51] or the segmentation of meeting speech [52]. A
particular strength is that arbitrary functions of the observations can be used
without complicating the parameter learning.
The HCRF models the conditional probability of a class $c$, given the sequence of
observations $X = x_1, x_2, \ldots, x_T$:

$$p(c \mid X, \lambda) = \frac{1}{z(X, \lambda)} \sum_{\mathrm{Seq} \in c} e^{\lambda f(c, \mathrm{Seq}, X)}, \tag{9.24}$$
where $\lambda$ is the parameter vector, $f$ is the 'vector of sufficient statistics', and
$\mathrm{Seq} = s_1, s_2, \ldots, s_T$ is the hidden state sequence run through during the
computation of this conditional probability. The 'partition function' $z(X, \lambda)$
ensures a properly normalised probability [15]:
$$z(X, \lambda) = \sum_{c} \sum_{\mathrm{Seq} \in c} e^{\lambda f(c, \mathrm{Seq}, X)}. \tag{9.25}$$
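As a minimal sketch of how Eqs. (9.24) and (9.25) fit together, the class posterior can be evaluated by brute-force enumeration of all hidden state sequences. The function `toy_features`, the parameter values, and the assumption that every class admits all sequences over the same small state set are hypothetical and chosen for illustration only; enumeration is exponential in $T$ and only feasible for toy inputs.

```python
import itertools
import math


def toy_features(c, seq, X):
    """Hypothetical 'vector of sufficient statistics' f(c, Seq, X).

    Feature 0: signed sum of the observations (class-dependent).
    Feature 1: number of state changes in Seq (class-independent).
    """
    sign = 1.0 if c == 1 else -1.0
    class_evidence = sign * sum(X)
    n_changes = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return [class_evidence, float(n_changes)]


def hcrf_posterior(X, classes, n_states, lam, f):
    """Evaluate p(c | X, lambda) by enumerating every hidden state sequence."""
    T = len(X)
    unnormalised = {}
    for c in classes:
        total = 0.0
        # Assumption: each class allows all n_states**T sequences.
        for seq in itertools.product(range(n_states), repeat=T):
            feats = f(c, seq, X)
            total += math.exp(sum(l * v for l, v in zip(lam, feats)))
        unnormalised[c] = total  # inner sum of Eq. (9.24)
    z = sum(unnormalised.values())  # partition function, Eq. (9.25)
    return {c: s / z for c, s in unnormalised.items()}


posterior = hcrf_posterior(
    X=[0.5, 1.0, -0.2],
    classes=[0, 1],
    n_states=2,
    lam=[1.0, -0.5],
    f=toy_features,
)
print(posterior)
```

Because the second toy feature does not depend on the class, it cancels in the ratio, and the posterior of class 1 reduces to a logistic function of the summed observations.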
The vector $f$ determines the probability to model. With a suitable $f$, a left-right
HMM can be imitated [15]. Let us now restrict the HCRF to a Markov chain,
but without requiring the transition probabilities to sum to one or the
observations to be modelled by real probability densities. In analogy to an HMM, a
parametrisation by transition scores $a_{i,j}$ and observation scores $b_j(x_t)$ can then
be reached with the parameters $\lambda$, where $i$ and $j$ are states of the model
(cf. Sect. 7.3.2). Forward and backward recursions (cf. Sect. 7.3.1) as for an HMM
can then further be used.
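The forward recursion for such a Markov-chain HCRF can be sketched as follows, working with log-scores so that neither transitions nor observation scores need to be normalised. The data layout (one transition score matrix and one observation score function per class), the zero initial-state scores, and all numeric values are assumptions made for this illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def class_log_score(X, trans, obs_score):
    """Log of the sum over all state sequences of exp(total score).

    trans[i][j] is the (unnormalised) log transition score from state i to j,
    obs_score(j, x) the (unnormalised) log observation score of x in state j.
    Assumption: all initial states carry a score of zero.
    """
    n = len(trans)
    alpha = [obs_score(j, X[0]) for j in range(n)]
    for x in X[1:]:
        alpha = [
            logsumexp([alpha[i] + trans[i][j] for i in range(n)]) + obs_score(j, x)
            for j in range(n)
        ]
    return logsumexp(alpha)


def hcrf_posterior_forward(X, models):
    """Class posterior from per-class forward scores, normalised over classes."""
    log_scores = {
        c: class_log_score(X, trans, obs) for c, (trans, obs) in models.items()
    }
    log_z = logsumexp(list(log_scores.values()))  # log partition function
    return {c: math.exp(s - log_z) for c, s in log_scores.items()}


# Hypothetical two-class example with two states per class.
models = {
    "a": ([[0.0, -1.0], [-2.0, 0.5]], lambda j, x: (0.5, -0.5)[j] * x),
    "b": ([[0.2, 0.0], [0.0, 0.2]], lambda j, x: (-1.0, 1.0)[j] * x),
}
posterior = hcrf_posterior_forward([0.2, -0.4, 1.0], models)
print(posterior)
```

The recursion costs on the order of $T$ times the squared number of states per class, in contrast to the exponential cost of enumerating all sequences.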
9.3.3 Audio Modelling in the Time Domain
Modelling of the raw signal in the time domain is a rarely pursued option, but it can
offer easy explicit noise modelling [16]. We will look at SAR-HMMs to this end
first, and then at the extension to SLDS.
9.3.3.1 Switching Autoregressive Hidden Markov Models
The SAR-HMM models the audio signal of interest as an autoregressive (AR)
process. The non-stationarity is realised by switching between different AR
parameter sets [17] by means of a discrete switch variable $s_t$, similar to the HMM
states. At a time step $t$ (referring to the sample level in this case), exactly one
out of $S$ states is occupied. The state at time step $t$ depends exclusively on its
predecessor, with the transition probability $p(s_t \mid s_{t-1})$. The sample $v_t$ at this
time step is assumed to be a linear combination of its $R$ preceding samples
superposed by a Gaussian distributed 'innovation' $\eta(s_t)$. The innovation $\eta(s_t)$
and the AR weights $c_r(s_t)$ are the parameter set given by the state $s_t$:
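The generative process described above can be illustrated with a short sampling sketch. It assumes a positive sign convention, $v_t = \sum_{r=1}^{R} c_r(s_t)\, v_{t-r} + \eta(s_t)$, a zero-mean innovation with a state-dependent standard deviation, a uniformly drawn initial state, and zero-valued samples before the start of the signal; these choices and all parameter values are illustrative assumptions and may differ from the convention used in the text.

```python
import random


def sample_sar_hmm(T, trans, ar_weights, sigmas, seed=0):
    """Draw T samples from a switching AR process.

    trans[i][j]    transition probability p(s_t = j | s_{t-1} = i)
    ar_weights[s]  the R AR weights c_1(s), ..., c_R(s) of state s
    sigmas[s]      assumed standard deviation of the innovation in state s
    """
    rng = random.Random(seed)
    S = len(trans)
    R = len(ar_weights[0])
    states, samples = [], []
    s = rng.randrange(S)  # assumption: uniform initial state
    for t in range(T):
        if t > 0:
            # The switch variable depends only on its predecessor.
            s = rng.choices(range(S), weights=trans[s])[0]
        # R preceding samples; values before the signal start are taken as 0.
        past = [samples[t - r] if t - r >= 0 else 0.0 for r in range(1, R + 1)]
        innovation = rng.gauss(0.0, sigmas[s])
        v = sum(c * p for c, p in zip(ar_weights[s], past)) + innovation
        states.append(s)
        samples.append(v)
    return states, samples


# Hypothetical two-state model with AR order R = 2.
states, samples = sample_sar_hmm(
    T=200,
    trans=[[0.99, 0.01], [0.02, 0.98]],
    ar_weights=[[1.2, -0.5], [0.3, 0.1]],
    sigmas=[0.1, 0.5],
)
print(states[:10], samples[:3])
```

Since $t$ indexes individual samples here, transition probabilities close to one on the diagonal keep each AR parameter set active over many consecutive samples.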
 
 