Specific Topics
In this section we discuss in more detail some specific topics that we
believe are particularly interesting to the designer of audio processing
frameworks.

Implicit Patching

The basic idea behind Implicit Patching (Bray & Tzanetakis, 2005) is to use
object composition rather than explicitly specifying connections between
input and output ports in order to construct the dataflow network. For
example, the pseudo-code in Figure 3 illustrates the difference between
Explicit and Implicit Patching in a simple playback network.

The idea of Implicit Patching evolved from the integration of three
different ideas that were developed independently in previous versions of
MARSYAS. These three ideas and how they are integrated are described below.
In addition, examples illustrating the expressive power of Implicit Patching
are presented.

The first idea originated from the desire not to be constrained to fixed
buffer sizes and to have proper semantics for spectral data. The majority of
existing audio processing environments require that all processing objects
in a flow network or visual patch process fixed-size buffers of audio
samples (typical sizes are 64 and 128 samples). Having fixed buffers
simplifies memory management and patching, as all connections are treated
the same way. However, some applications, such as audio feature extraction,
require a variety of different buffer sizes to flow through the network (for
example, feature vectors typically have much lower dimensionality than audio
data). Even though it is possible to have dynamic buffer sizes in Explicit
Patching, it is complex to implement and frequently requires a lot of work
from the programmer to set the connections appropriately. In addition, these
fixed-size buffers are reused for holding spectral data, and it is up to the
programmer to correctly connect the spectral data to objects that process
such data. The result is that the exact details of the Short-Time Fourier
Transform are encapsulated as a black box and the programmer has little
control over the process. Our proposed solution to these two problems is to
extend the semantics of the data that is processed. In MARSYAS, processing
objects (MarSystems) operate on chunks of data called Slices. Slices are
matrices of floating point numbers characterized by three parameters: number
of samples (things that are “measured” at different instances in time),
number of observations (things that are “measured” at the same time
instance) and sampling rate. This approach is similar to the Sound
Description Interchange Format (SDIF) (Schwarz & Wright, 1997).

Figure 4 shows a MarSystem for spectral processing that converts an incoming
audio buffer of 512 samples of 1 observation at a sampling rate of 22050 Hz
to 1 sample of 512 observations (the FFT bins) at a lower sampling rate of
22050/512 Hz (about 43 Hz). By propagating information about the sampling
rate and number of observations through the dataflow network, the use of
Slices provides more correct and flexible semantics for spectral processing
and feature extraction. MarSystems are designed so that they can handle
Slices with arbitrary dimensions with one important constraint: they need to
be able to calculate their output Slice parameters from their input Slice
parameters. For example it is possible to change the input number of samples
Figure 3. Explicit and implicit patching
# EXPLICIT PATCHING
create source, gain, dest
# connect the appropriate in/out ports
connect(source.out1, gain.in1);
connect(gain.out1, dest.in1);

# IMPLICIT PATCHING
create source, gain, dest
# create a composite that is the network
create series(source, gain, dest)
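
The two ideas discussed in this section, composition instead of explicit connections and Slices that carry their own parameters, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MARSYAS API: the names SliceParams, System, Gain, Spectrum, Series, output_params and process are hypothetical, data is a plain list of rows (one row per observation), and a naive DFT stands in for the FFT to keep the example short.

```python
import cmath
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceParams:
    """The three parameters that characterize a Slice (hypothetical type)."""
    samples: int        # values measured at different time instances
    observations: int   # values measured at the same time instance
    rate: float         # sampling rate in Hz


class System:
    """Hypothetical stand-in for a MarSystem."""

    def output_params(self, p):
        return p  # default: output has the same shape and rate as the input

    def process(self, data):
        return data  # data is a list of rows, one row per observation


class Gain(System):
    def __init__(self, factor):
        self.factor = factor

    def process(self, data):
        return [[x * self.factor for x in row] for row in data]


class Spectrum(System):
    """Turns N samples of 1 observation into 1 sample of N observations."""

    def output_params(self, p):
        return SliceParams(samples=1, observations=p.samples,
                           rate=p.rate / p.samples)

    def process(self, data):
        x = data[0]
        n = len(x)
        # Naive O(n^2) DFT magnitudes, used here instead of a real FFT.
        return [[abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                         for t in range(n)))]
                for k in range(n)]


class Series(System):
    """Composite: connections are implied by the order of the children."""

    def __init__(self, *children):
        self.children = children

    def output_params(self, p):
        # Each child derives its output parameters from its input parameters.
        for child in self.children:
            p = child.output_params(p)
        return p

    def process(self, data):
        for child in self.children:
            data = child.process(data)
        return data


# No connect() calls: the network is the composite itself.
net = Series(Gain(0.5), Spectrum())
params_out = net.output_params(SliceParams(512, 1, 22050.0))
print(params_out)
```

Because every object can compute its output parameters from its input parameters, the composite can work out the shape and rate of the data at each stage without the programmer wiring or sizing any buffers by hand.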