Information Technology Reference
In-Depth Information
points of patterns only, usually conveying more
importance than endings.
Rolland (1999) defined a numerical similarity
distance between sub-sequences based on edit
distance. In order to extract patterns, similarity
distances were computed between all possible
pairs of sub-sequences of a certain range of
lengths, and only similarity exceeding a user-de-
fined arbitrary threshold was selected. From the
resulting similarity graph, patterns are extracted
using the Star-Center categorization algorithm
(detailed in last section). The set of discovered
patterns is reduced even further using pre-defined
constraints such as minimal or maximal length,
maximal length difference between pattern occur-
rences, and minimal number of pieces where the
pattern can be found. The last constraint, called
“quorum”, corresponds to the k factor in Conklin
& Anagnostopoulou (2001).
distinct strategies have been proposed. The fre-
quent pattern mining approach is restricted to
patterns that have a number of occurrences (or sup-
port ) exceeding a given minimum threshold (Lin
et al., 2002). The main drawback of this strategy
comes from the arbitrariness and poor adaptation
of the definition of the threshold value. Another
approach is based on the search for maximal pat-
terns , that is, patterns that are not included in any
other pattern (Zaki, 2005; Agrawal & Srikant,
1995). This heuristic enables a more selective
filtering of redundancy. For instance, in figure
9, the suffix aij can be immediately discarded
following this strategy, since it is included in the
longer pattern abcde : its properties can be directly
induced from the long pattern itself. However,
this approach still leads to an excessive filtering
of important structures. For instance, in Figure
10, the same 3-note pattern aij presents a specific
configuration that cannot be directly deduced from
the longer pattern abcde , for the simple reason that
its support (or number of occurrences) is higher
than the support of abcde . This corresponds to
discussion
As we can see, all approaches 4 in motivic pat-
tern extraction propose diverse ways to solve
the major problem of combinatorial explosion of
redundant structures: the core pattern extraction
process results in a very large set of candidates,
of poor interest to musicology or music informa-
tion retrieval. All approaches therefore include
post-processing algorithms that filter the results
in order to select the most relevant structures. The
main limitation of this strategy comes from the
lack of selectivity of the global criteria. Hence,
by selecting longest patterns, one may discard
frequent short motives (such as the 4-note pattern
from Beethoven's Fifth Symphony ), or nonfrequent
long patterns, that may nevertheless be considered
as highly relevant by listeners.
Figure 9. Occurrences of patterns are extracted
from the score. These occurrences are displayed
below the score, and the patterns are grouped in a
tree above the score. Pattern aij, a suffix of abcde
with identical support (i.e., two occurrences), is
therefore non-closed, and should not be explicitly
represented.
closed pattern mining
The problem of reducing the combinatorial com-
plexity of pattern structure is largely studied in
current research in computer science, and several
Search WWH ::




Custom Search