all three of our subject groups, not (as one might
guess) from only the NONM group. While the
number of subjects in each group is not large
enough for statistically significant generalizations,
this result does confirm that musical training is
no guarantee of good performance in reproducing
pitch intervals.
Perhaps most importantly, this experiment
has shown that musicians and nonmusicians
enlarge small pitch intervals and compress large
ones to a similar degree, whether the intervals
are moving upward or downward. It also
showed that the accuracy of subjects' humming,
even when measured by a method as imprecise
as ternary pitch contour, was lower than reported
in other studies. It must be remembered, though,
that these results were obtained for unfamiliar
melodic phrases, and informal inspection of the
input queries for the first two experiments shows
better pitch contour performance than was seen in
this experiment. In addition, some of the stimulus
phrases used both in Lindsay's study and in our
own contain dissonant intervals, which occur
only rarely in popular vocal melodies and are
more difficult to sing or hum.
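
As an illustration of what measuring accuracy by ternary pitch contour involves, the following is a minimal sketch, not the procedure used in the study. It assumes pitches are given in semitones (for example, MIDI note numbers); the function names and the `tolerance` parameter are hypothetical and do not come from the text.

```python
def ternary_contour(pitches, tolerance=0.5):
    """Reduce a pitch sequence (in semitones) to a ternary contour string.

    Each pair of consecutive notes becomes U (up), D (down), or S (same).
    `tolerance` is an assumed threshold, in semitones, below which two
    pitches are treated as the same.
    """
    symbols = []
    for prev, curr in zip(pitches, pitches[1:]):
        delta = curr - prev
        if delta > tolerance:
            symbols.append("U")
        elif delta < -tolerance:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def contour_accuracy(target, hummed):
    """Fraction of positions where two equal-length contours agree."""
    if len(target) != len(hummed):
        raise ValueError("contours must have the same length")
    if not target:
        return 1.0
    matches = sum(1 for t, h in zip(target, hummed) if t == h)
    return matches / len(target)


if __name__ == "__main__":
    stimulus = ternary_contour([60, 62, 62, 59])
    response = ternary_contour([60, 63, 63.2, 64])
    print(stimulus, response, contour_accuracy(stimulus, response))
```

Because the contour discards interval size, a response can be badly out of tune and still score perfectly, which is why low accuracy on this measure is a notable result.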
consistent choices in selecting a starting pitch
when humming a given song. We hypothesized
that variation would be reduced when a song
contained a large pitch range, but our data did
not support this hypothesis. An examination of the data for
individual subjects showed that musicians chose
a wider variety of starting pitches for different
songs than the other two groups.
Musicians were more likely than the other
groups to match the performance key of a song
they had just heard when humming it, unless the
song was beyond their comfortable vocal range.
Nonmusicians were nearly as adept as musicians
when choosing a starting pitch for humming
a song having a wide pitch range.
A significant difference was found between
the duration a note was voiced and the inter-note
onset time (INOT) corresponding to it. The difference
increased as INOT increased. Musicians
had a higher duration/INOT ratio than the other
groups, but the improvement was seen mostly with
notes longer than one-half second. INOT values
are a more reliable representation of note timings
for vocal input than note duration.
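
The relationship between voiced duration and INOT can be sketched as follows. This is an illustrative example only: it assumes each note is described by an onset and an offset time in seconds, and the function names are hypothetical rather than taken from the study.

```python
def inter_note_onset_times(onsets):
    """INOT for each note: time from its onset to the next note's onset.

    The last note has no following onset, so it gets no INOT value.
    """
    return [nxt - cur for cur, nxt in zip(onsets, onsets[1:])]


def duration_inot_ratios(onsets, offsets):
    """Ratio of voiced duration to INOT for every note except the last.

    A ratio near 1.0 means the note was held until the next one began;
    smaller ratios mean a larger gap between voicing and the next onset.
    """
    if len(onsets) != len(offsets):
        raise ValueError("onsets and offsets must have the same length")
    ratios = []
    for i, inot in enumerate(inter_note_onset_times(onsets)):
        duration = offsets[i] - onsets[i]
        ratios.append(duration / inot)
    return ratios


if __name__ == "__main__":
    onsets = [0.0, 0.5, 1.5]    # seconds
    offsets = [0.4, 1.0, 2.0]   # seconds
    print(inter_note_onset_times(onsets))
    print(duration_inot_ratios(onsets, offsets))
```

In this made-up example the longer second note is voiced for only half of its INOT, the pattern the text describes: the gap between duration and INOT grows as INOT grows, so onset-to-onset timing is the safer basis for rhythm matching.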
Pitch interval and pitch contour reproduction
skills for musicians on unfamiliar musical
phrases were lower than reported in an earlier
study (Lindsay, 1996). Nonmusicians performed
substantially worse than musicians in both
studies. All subject groups tended to exaggerate the
size of small pitch intervals and to compress
large intervals, as reported informally by McNab
et al. (1996). We further showed that subjects
with musical training generally compress large
intervals less than those without, and that for all
subjects upward intervals are compressed more
than downward intervals.
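
One way to quantify this enlarging and compressing tendency is to compare the size of each sung interval with the size of the target interval, grouped by whether the target is small or large. The sketch below is an assumption-laden illustration, not the analysis used in the study: intervals are signed semitone values, and the `small_max` boundary between small and large intervals is a hypothetical parameter that the text does not specify.

```python
def mean_size_error(target_intervals, sung_intervals, small_max=2.0):
    """Mean error in interval size for small and large target intervals.

    Error is |sung| - |target| in semitones: positive values mean the
    interval was enlarged, negative values mean it was compressed.
    `small_max` (assumed) is the largest target size counted as small.
    Unison targets (0 semitones) are skipped.
    """
    if len(target_intervals) != len(sung_intervals):
        raise ValueError("interval lists must have the same length")
    errors = {"small": [], "large": []}
    for target, sung in zip(target_intervals, sung_intervals):
        if target == 0:
            continue
        key = "small" if abs(target) <= small_max else "large"
        errors[key].append(abs(sung) - abs(target))
    return {
        key: (sum(vals) / len(vals) if vals else None)
        for key, vals in errors.items()
    }


if __name__ == "__main__":
    target = [1, -2, 7, -9]       # semitones; sign gives direction
    sung = [1.5, -2.5, 6, -8]     # invented values for illustration
    print(mean_size_error(target, sung))
```

Splitting the same calculation by the sign of the target interval would give the upward versus downward comparison mentioned above.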
If the musically untrained cannot reliably
reproduce even the simple pitch contour of a
musical phrase, then representations using more
complex contour schemes such as the five-category
MPEG-7 Melody Contour description (which distinguishes,
e.g., a lot higher from a little higher) will not be effective for
vocal input. Similarly, MIR systems expecting
correct or nearly correct contour information will
Study Conclusions
Our experiments yielded several insights into
characteristics which influenced our own development
of algorithms for matching hummed input
to a music database.
In almost every measured statistic, the relative
performance of subjects with 2-5 years of musical
experience as a group was indistinguishable from
that of those with less than two years of experience. On
the other hand, in most cases those with more
than five years of musical training performed
significantly better than the other two groups.
Of course, individual cases vary widely from the
group norms; natural musical ability can make
up for a lack of formal training.
We confirmed the results of Halpern (1989)
and of Levitin (1994) that subjects make fairly