Figure 2. Schematic view of the proposed energy
envelope-based intranote segmentation
Figure 3. Original and smoothed envelopes of a sax note for e_th = 0.05 (top figure, solid and dashed thin lines, respectively); the selected characteristic points, taken from the extremes of the second derivative of the smoothed envelope, are marked with squares
and env_m is the smoothed envelope at step m.
$$ e_m = \frac{1}{N} \sum_{k=1}^{N} \frac{\left| \mathrm{env}(k) - \mathrm{env}_m(k) \right|}{\mathrm{env}(k)} \qquad (1) $$
Starting from a low cut-off frequency f_{f0,init}, this frequency is increased at each smoothing step until the error e_m falls below a threshold e_th, selected empirically. Then, we compute the first three derivatives of the last smoothed envelope. The frame positions and corresponding y-values of the extremes of the second derivative (that is, the zero-crossings of the third derivative) are stored. Afterwards, these characteristic points are sorted by the modulus of the second derivative, and the n highest-ranked positions are selected to build the set of characteristic points F. When the total number of third-derivative zero-crossings is less than n, the set F is shortened accordingly.
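The smoothing loop and the selection of characteristic points can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the zero-phase one-pole low-pass filter, the fixed frequency increment `f_step`, the mean relative absolute error used for e_m, the Nyquist safety stop, and all function and parameter names are assumptions made for the example. Derivatives are approximated by finite differences.

```python
import math


def lowpass(env, cutoff_hz, frame_rate):
    """Zero-phase one-pole low-pass filter (assumed filter type)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / frame_rate)

    def run(seq):
        out, y = [], seq[0]
        for v in seq:
            y = (1.0 - a) * v + a * y
            out.append(y)
        return out

    # Forward pass, then backward pass to cancel the phase delay.
    return run(run(env)[::-1])[::-1]


def smoothing_error(env, env_m, eps=1e-12):
    """Assumed form of e_m: mean relative absolute error over the N frames."""
    # eps is an added safeguard against division by zero.
    total = sum(abs(e - s) / max(abs(e), eps) for e, s in zip(env, env_m))
    return total / len(env)


def smooth_envelope(env, frame_rate, f_init=1.0, f_step=1.0, e_th=0.05):
    """Raise the cut-off frequency until e_m drops below e_th."""
    cutoff = f_init
    while True:
        env_m = lowpass(env, cutoff, frame_rate)
        # Stop at the threshold, or just below Nyquist as a safety net.
        if smoothing_error(env, env_m) < e_th or cutoff + f_step >= frame_rate / 2:
            return env_m, cutoff
        cutoff += f_step


def characteristic_points(env_m, n):
    """Frames of the n largest-modulus extremes of the second derivative."""
    N = len(env_m)
    d2 = [0.0] * N
    for k in range(1, N - 1):
        d2[k] = env_m[k + 1] - 2.0 * env_m[k] + env_m[k - 1]
    extremes = []
    for k in range(2, N - 2):
        # A sign change of the third difference marks an extreme of d2.
        if (d2[k] - d2[k - 1]) * (d2[k + 1] - d2[k]) < 0.0:
            extremes.append((k, d2[k]))
    extremes.sort(key=lambda p: abs(p[1]), reverse=True)
    # The result is shorter than n when fewer extremes exist.
    return sorted(k for k, _ in extremes[:n])


# Toy trapezoidal envelope: the two corners are the only extremes of d2.
env = [0, 1, 2, 3, 3, 3, 3, 2, 1, 0]
print(characteristic_points(env, n=5))  # [3, 6]
```

In the toy example only two extremes exist, so asking for n = 5 returns a shortened set, mirroring the shortening of F described above.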
Both the note onset and the note offset are added as characteristic points to the set F. The slope defined by each pair of consecutive characteristic points on the envelope is computed (2), where i and j denote frame positions. A minimum slope duration ∆fr (measured in frames) is defined as five per cent of the note length N, in order to exclude the excessively steep slopes that may occur near the note limits.
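As an illustration of the slope computation, the sketch below adds the onset and offset frames to F, derives ∆fr as five per cent of the note length, and keeps only the slopes between consecutive points that are at least ∆fr frames apart. The function name `consecutive_slopes`, the rounding of ∆fr with `ceil`, and the lower bound of one frame are assumptions made for the example.

```python
import math


def consecutive_slopes(env_m, F, min_fraction=0.05):
    """Slopes s[i, j] between consecutive characteristic points.

    env_m is the smoothed envelope (one value per frame) and F holds the
    frame positions of the characteristic points.
    """
    N = len(env_m)
    # Minimum slope duration in frames (rounding rule is an assumption).
    delta_fr = max(1, math.ceil(min_fraction * N))
    # Note onset (frame 0) and offset (frame N - 1) join the set F.
    points = sorted(set(F) | {0, N - 1})
    slopes = {}
    for i, j in zip(points, points[1:]):
        if i + delta_fr <= j:  # discard slopes shorter than delta_fr
            slopes[(i, j)] = (env_m[j] - env_m[i]) / (j - i)
    return slopes


# Toy 100-frame envelope: linear attack, near-flat sustain, linear release.
env_m = [min(k, 30, 99 - k) / 30 for k in range(100)]
for pair, s in consecutive_slopes(env_m, [2, 30, 70, 97]).items():
    print(pair, round(s, 4))
```

In this toy case the pairs (0, 2) and (97, 99) span fewer than ∆fr = 5 frames and are dropped, which is the intended effect of the minimum duration near the note limits.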
Finally, from the slopes remaining after this discarding step, the two pairs of points that define the most positive and the most negative slope values, respectively, are extracted. The end of the attack segment, f_AE, is defined as the frame position of the second point of the maximum slope, while the start of the release segment, f_RB, is defined as the frame position of the first point of the minimum slope. This is stated in (3) and (4) and depicted in Figure 3.
$$ f_{AE} = j_M, \quad s_M = s_{i_M, j_M} = \max(s_{i,j}) \qquad (3) $$

$$ f_{RB} = i_m, \quad s_m = s_{i_m, j_m} = \min(s_{i,j}) \qquad (4) $$
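Equations (3) and (4) amount to an arg-max and an arg-min over the retained slopes. A minimal sketch, assuming the slopes are stored in a dictionary keyed by the frame pair (i, j); the function name and the slope values are hypothetical:

```python
def attack_end_and_release_start(slopes):
    """slopes maps (i, j) frame pairs to slope values s[i, j]."""
    i_M, j_M = max(slopes, key=slopes.get)  # pair with the most positive slope
    i_m, j_m = min(slopes, key=slopes.get)  # pair with the most negative slope
    f_AE = j_M  # second point of the maximum slope, as in (3)
    f_RB = i_m  # first point of the minimum slope, as in (4)
    return f_AE, f_RB


# Hypothetical slope values for a 100-frame note.
slopes = {(2, 30): 0.033, (30, 70): -0.001, (70, 97): -0.033}
print(attack_end_and_release_start(slopes))  # (30, 70)
```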
$$ s_{i,j} = \frac{\mathrm{env}_m(j) - \mathrm{env}_m(i)}{j - i} \quad \text{such that } i, j \in F,\ i + \Delta fr \le j \qquad (2) $$

The attack is defined as the segment between the note onset and the end of the most positive of the computed slopes, while the release segment