Table 2. A list of songs of varying genres used to determine the presence of vocal segments

| Artist | Song | Genre |
|---|---|---|
| Fiona Apple | Get Gone | Pop/Rock |
| Ben Folds Five | Narcolepsy | Pop/Rock |
| The Dave Matthews Band | Lie In Our Graves | Pop/Rock |
| Ronnie James Dio | Holy Diver | Classic Rock, Heavy Metal |
| Astrud Gilberto and Stan Getz | The Girl from Ipanema | Jazz, World Music |
| Al Green | Not Tonight | R & B |
| Hall & Oates | Private Eyes | Pop/Rock |
| Hall & Oates | You Make My Dreams | Pop/Rock |
| Jimi Hendrix | All Along the Watchtower | Classic Rock |
| Michael Jackson | Rock With You | Pop, R & B |
| Jane's Addiction | Been Caught Stealing | Alternative Rock |
| Led Zeppelin | The Song Remains the Same | Classic Rock |
| Bob Marley | Three Little Birds | Reggae |
| Juana Molina | Martin Fierro | World Music, Electronic, Pop |
| Nirvana | Drain You | Alternative Rock |
| Nirvana | Heart Shaped Box | Alternative Rock |
| Jim O'Rourke | Memory Lame | Pop/Rock |
| Pearl Jam | Alive | Alternative Rock |
| Pearl Jam | Last Exit | Alternative Rock |
| Pink Floyd | Time | Classic Rock |
| Radiohead | The Tourist | Alternative Rock |
| Lionel Richie | All Night Long | Pop, R & B |
| Stereolab | Diagonals | Pop/Rock, Electronic |
| U2 | Where the Streets Have No Name | Pop/Rock |
| Wilco | Hummingbird | Alternative Rock |
analysis frequencies, ω, associated with frequency sub-band k.

$$\mathrm{LFPC}(k) = 10 \log_{10}\left(\frac{S(k)}{M}\right) \tag{4}$$

In Figure 5, we show the mean LFPCs for both vocal and instrumental segments and the difference between the two. The figure supports Nwe and Wang's claim that vocal segments tend to have more high-frequency energy than purely instrumental segments; however, the difference found here is considerably more subtle than the one Nwe and Wang (2004) presented. We notice the greatest LFPC difference in the 11th, 12th, and 13th sub-bands (roughly 5-16 kHz), where we see an average difference of about 4 dB.

Since volumes tend to swell and drop over the duration of the song, this difference must be determined locally, across 25-30 second segments (about five or six slices in Figure 5). Simple k-means clustering can be used to identify the onset of vocal or nonvocal segments within a localized slice. We find this technique to be effective for MusicStory's purpose, which is to say the technique works well on standard rock/pop recordings one might find on an iPod or personal video device. Our end result is a song segmented into vocal and nonvocal partitions (see Figure 6).
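As a rough illustration of how Equation (4) and the local clustering step might fit together, the following Python sketch computes an LFPC value per frame and then runs a two-cluster k-means separately inside each local window. It is a minimal sketch under stated assumptions, not the MusicStory implementation: the function names (`lfpc`, `two_means`, `label_vocals`), the synthetic band powers, the window length, and the rule that the higher-energy cluster is the vocal one are all hypothetical choices made for this example.

```python
import math


def lfpc(band_power, num_bins):
    """Equation (4): LFPC(k) = 10 * log10(S(k) / M).

    band_power plays the role of S(k) and num_bins the role of M.
    """
    return 10.0 * math.log10(band_power / num_bins)


def two_means(values, iterations=20):
    """Plain 1-D k-means with k = 2.

    Returns one label per value: 1 for the higher-mean cluster, 0 otherwise.
    Centroids start at the minimum and maximum of the data (an assumption).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # No spread at all, so there is nothing to split.
        return [0] * len(values)
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        low_group = [v for v, lab in zip(values, labels) if lab == 0]
        high_group = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: recompute both centroids.
        new_lo = sum(low_group) / len(low_group)
        new_hi = sum(high_group) / len(high_group)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


def label_vocals(high_band_lfpc, window):
    """Cluster each local window on its own, so that overall volume
    changes across the song do not dominate the split.
    Assumption: label 1 (higher high-frequency energy) means 'vocal'.
    """
    labels = []
    for start in range(0, len(high_band_lfpc), window):
        labels.extend(two_means(high_band_lfpc[start:start + window]))
    return labels


# Synthetic band powers S(k) for one high sub-band, with M = 8 bins:
# a quiet passage followed by a loud one, each half instrumental, half vocal.
powers = [8.0, 8.0, 20.0, 20.0, 80.0, 80.0, 200.0, 200.0]
lfpcs = [lfpc(p, 8) for p in powers]

print(two_means(lfpcs))        # one global split over the whole song
print(label_vocals(lfpcs, 4))  # one split per local window
```

On this synthetic input the global split separates the quiet passage from the loud one, whereas the windowed version recovers the instrumental/vocal alternation inside each passage. One limitation of the sketch is that a window containing only vocals or only instruments is still divided into two clusters, so a practical version would need some check on how far apart the two cluster means are.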