organize and index their collections according
to artist, album, song as expressed by metadata.
Also, users apply this organization principle.
However, there are far more structuring styles
to be observed. A user study showed that several
categories are used to structure a private collection
(Jones, Cunningham, & Jones, 2004). Another one
observes that hierarchical structures (taxonomies)
are used. Remarkably, the study states that all the
users had a folder for music which does not fit the
structure (Vignoli, 2004). In a student project,
we received 39 different hierarchical organizations
for the same music collection (Homburg,
Mierswa, Möller, Morik, & Wurst, 2005). We also
experienced that there were leftover nodes in the
taxonomies where the students no longer wanted
to annotate the music, or where they could not
make up their mind where to put a song. Beyond
the well-studied classification into genres, there
seems to be a need for automatically classifying
music into individual, hierarchical structures.
Observations of users also indicate a retrieval
task beyond searching for a particular song. Users
search for as yet unknown music which might
be interesting (Aucouturier & Pachet, 2002).
New music which fits a user's taste is most often
recommended by friends, as was clearly stated
by an empirical analysis of (music) information
seeking behavior (Lee & Downie, 2004). Recommending
music is a retrieval task in its own
right with a link to social networks (friend of a
friend) (Celma, Ramirez, & Herrera, 2005), also
described as a sociocultural aspect (Baumann, Pohle,
& Shankar, 2004). For the structuring of music,
the co-occurrence in personal collections was
found to be the most useful ground truth (Berenzweig,
Logan, Ellis, & Whitman, 2003). This, again,
stresses the collaborative nature of building up
private music collections. Another empirical
study shows that, in addition to searching for a
particular song, users want to browse through
collections of other users (Taheri-Panah & MacFarlane,
2004). The Nemoz system allows users to
browse through the collection of another user in
a peer-to-peer network (Aksoy, Burgard, Flasch,
Kaspari, Lüttgens, Martens, & Möller, 2005). In
addition to simply looking at the collection, the
extra service that we call “goggling” allows
users to look at the other collection through their
“own glasses”. This enables users to discover new,
interesting music according to their own taste.
It moves beyond finding yet another song of a
favorite artist. Each user has their own taxonomy
which stores the personal collection. We exploit
the structure of these taxonomies in order to guide
a tour through the media collections of other
users. This ensures that users navigate through
other media collections in the same way as
they navigate through their own collection. In more
technical terms, the learned classifiers used to
automatically sort songs into a user's own taxonomy
are applied to another user's collection.
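
The idea of applying one's own classifiers to another user's collection
can be illustrated with a minimal sketch. It assumes that every song is
already described by a small numeric feature vector and that each
taxonomy node is treated as a class label. The nearest-centroid rule
below is only a simple stand-in for whatever learner a real system would
use; the function names, node names, and feature values are hypothetical
and not taken from Nemoz.

```python
from collections import defaultdict
from math import dist


def train_centroids(own_collection):
    """Compute one mean feature vector (centroid) per taxonomy node.

    own_collection: list of (features, node) pairs, where features is a
    tuple of floats and node is the taxonomy node the user chose.
    """
    grouped = defaultdict(list)
    for features, node in own_collection:
        grouped[node].append(features)
    return {
        node: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for node, vectors in grouped.items()
    }


def goggle(centroids, other_collection):
    """Sort another user's songs into the nodes of one's own taxonomy.

    Each song goes to the node whose centroid is closest in Euclidean
    distance (an assumed, deliberately simple classification rule).
    """
    view = defaultdict(list)
    for title, features in other_collection.items():
        nearest = min(centroids, key=lambda node: dist(features, centroids[node]))
        view[nearest].append(title)
    return dict(view)


# Hypothetical two-dimensional features, rescaled to the range 0..1.
own = [
    ((0.9, 0.8), "party/fast"),
    ((0.8, 0.9), "party/fast"),
    ((0.2, 0.1), "evening/quiet"),
    ((0.1, 0.3), "evening/quiet"),
]
other = {"song_a": (0.85, 0.7), "song_b": (0.15, 0.2), "song_c": (0.3, 0.25)}

centroids = train_centroids(own)
print(goggle(centroids, other))
```

In this toy data, song_a lands under "party/fast" and the other two
songs under "evening/quiet", so the foreign collection is presented in
the browsing user's own structure rather than in that of its owner.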
The service of classifying songs automatically
into the user's taxonomy is rather straightforward
from a user's point of view. An intelligent music
management system takes as input the user's taxonomy
and a set of music files: the not yet tagged ones
from the user's own collection or from that of another
user. The system outputs the set of music pieces
together with tags that correspond to the user's
taxonomy. From a technical point of view, this
is not at all straightforward. The system needs
to learn the implicit classification of the user.
Considering each taxonomy node (i.e., tag) a
class and the songs which are stored at that node
(i.e., labeled by the tag) its members, this
looks like the standard machine learning task of
classification. Each audio file is represented by a
set of random variables, denoted X, that describe
features of this audio file. Y is another random
variable that denotes to which class the audio
file belongs. These obey a fixed but unknown
probability distribution Pr(X, Y). The objective
of classification is to find a function h(X) which
predicts the value of Y. The major challenge in
audio classification, and the reason why the user's