Figure 2. Schematic view of the proposed energy envelope-based intranote segmentation

Figure 3. Original and smoothed envelopes of a sax note for a value of $e_{th} = 0.05$ (top figure, solid and dashed thin lines, respectively); selected characteristic points are denoted with a square within extremes of the second derivative of the smoothed envelope
and $\mathrm{env}_m$ is the smoothed envelope at step $m$.

$$e_m = \frac{1}{N} \sum_{k=1}^{N} \frac{\mathrm{env}(k) - \mathrm{env}_m(k)}{\mathrm{env}} \qquad (1)$$
Starting from a low initial cut-off frequency $f_{0init}$, the cut-off is increased at each smoothing step until the error $e_m$ falls below a threshold $e_{th}$, which is selected empirically. Then, we compute the first three derivatives of the last smoothed envelope. The frame positions and corresponding y-values of the second-derivative extremes (that is, the zero-crossings of the third derivative) are stored. Afterwards, these candidate points are sorted by the modulus of the second derivative, and the $n$ positions with the highest modulus are selected to build the set of characteristic points $F$. When the total number of third-derivative zero-crossings is less than $n$, the set $F$ is correspondingly shorter.
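The smoothing loop and the selection of characteristic points described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the low-pass filter, so a forward-backward one-pole filter is swapped in; the error is taken here as the mean absolute deviation normalised by the envelope peak, which is one possible reading of (1); and all function names, `f_init`, `f_step` and `frame_rate_hz` are hypothetical.

```python
import math


def lowpass(env, cutoff_hz, frame_rate_hz):
    """One-pole low-pass run forward and then backward to avoid a net delay."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / frame_rate_hz)

    def one_pass(x):
        state, out = x[0], []
        for v in x:
            state += alpha * (v - state)
            out.append(state)
        return out

    return one_pass(one_pass(env)[::-1])[::-1]


def smoothing_error(env, smooth):
    """Mean absolute deviation, normalised by the envelope peak (assumed form)."""
    peak = max(abs(v) for v in env) or 1.0
    return sum(abs(a - b) for a, b in zip(env, smooth)) / (len(env) * peak)


def smooth_until(env, frame_rate_hz, f_init=1.0, f_step=1.0, e_th=0.05):
    """Raise the cut-off step by step until the error drops below e_th."""
    cutoff = f_init
    smooth = lowpass(env, cutoff, frame_rate_hz)
    while smoothing_error(env, smooth) >= e_th and cutoff < frame_rate_hz / 2:
        cutoff += f_step
        smooth = lowpass(env, cutoff, frame_rate_hz)
    return smooth, cutoff


def characteristic_points(smooth, n):
    """Frames of the n second-difference extremes with the largest modulus."""
    d1 = [b - a for a, b in zip(smooth, smooth[1:])]
    d2 = [b - a for a, b in zip(d1, d1[1:])]  # d2[q] is centred on frame q + 1
    candidates = []
    for q in range(1, len(d2) - 1):
        # A sign change of the third difference marks an extreme of d2.
        if (d2[q] - d2[q - 1]) * (d2[q + 1] - d2[q]) < 0:
            candidates.append((abs(d2[q]), q + 1))
    strongest = sorted(candidates, reverse=True)[:n]
    return sorted(frame for _, frame in strongest)
```

If fewer than `n` extremes exist, `characteristic_points` simply returns a shorter list, mirroring the shortened set $F$ mentioned in the text.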
Both the note onset and the note offset are added as characteristic points to the set $F$. The slope defined by each pair of consecutive characteristic points on the envelope is computed as in (2), where $i$ and $j$ denote frame positions. A minimum slope duration $\Delta fr$ (measured in frames) is defined as five per cent of the note length $N$, in order to exclude the excessively steep slopes that may appear near the note limits.
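As a rough sketch of this step, the function below computes the slope between consecutive characteristic points and drops any pair closer together than the minimum duration. The function name, the argument names and the rounding of the five per cent rule to whole frames are assumptions made for illustration; `smooth` stands for the last smoothed envelope.

```python
def slopes_between_points(smooth, points, onset, offset, min_fraction=0.05):
    """Map each kept pair (i, j) of consecutive points to its slope."""
    # Onset and offset are always part of the set of characteristic points.
    frames = sorted(set(points) | {onset, offset})
    note_length = offset - onset + 1
    min_frames = max(1, round(min_fraction * note_length))  # assumed rounding
    slopes = {}
    for i, j in zip(frames, frames[1:]):
        if j - i >= min_frames:  # discard segments shorter than the minimum
            slopes[(i, j)] = (smooth[j] - smooth[i]) / (j - i)
    return slopes
```

Keying the result by the frame pair keeps both end points available for the later choice of attack end and release start.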
Finally, from the slopes that remain after discarding, the two pairs of points defining the most positive and the most negative slope values are extracted. The end of the attack segment, $f_{AE}$, is defined as the frame position of the second point of the maximum slope, while the start of the release segment, $f_{RB}$, is defined as the first point of the minimum slope. This is stated in (3) and (4) and depicted in Figure 3.
$$f_{AE} = j_M, \quad s_M = s_{i_M, j_M} = \max(s_{i,j}) \qquad (3)$$

$$f_{RB} = i_m, \quad s_m = s_{i_m, j_m} = \min(s_{i,j}) \qquad (4)$$
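The selection in (3) and (4) amounts to an arg-max and an arg-min over the remaining slopes. A minimal sketch, assuming the slopes are held in a dictionary keyed by the frame pair $(i, j)$ (the function name and this data layout are illustrative, not taken from the text):

```python
def attack_end_and_release_start(slopes):
    """Return (f_AE, f_RB) from a mapping {(i, j): slope}."""
    if not slopes:
        raise ValueError("no slopes left after discarding short segments")
    _, j_max = max(slopes, key=slopes.get)  # second point of the steepest rise
    i_min, _ = min(slopes, key=slopes.get)  # first point of the steepest fall
    return j_max, i_min


slopes = {(0, 10): 0.1, (12, 50): 0.0, (50, 90): -0.002, (90, 99): -0.11}
print(attack_end_and_release_start(slopes))  # prints (10, 90)
```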
The attack is defined as the segment between
the note onset and the end of the most positive of
the computed slopes, while the release segment
$$s_{i,j} = \frac{\mathrm{env}_m(j) - \mathrm{env}_m(i)}{j - i}, \quad \forall\, i, j \in F \ \text{such that}\ i + \Delta fr \le j \qquad (2)$$