not retrieve at all the relevant document in 10%
of the queries. This is a drawback of PB: its
units do not overlap, so for short queries it may
happen that none of the note sequences in the query
matches a segmented unit. A similar consideration
also applies to MO, but the effect seems to be
limited by the fact that MO units are shorter.
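The effect of non-overlapping units on short queries can be illustrated with a minimal sketch. The functions `chunk_units` and `overlapping_units` are hypothetical names, and fixed-size chunks are only a stand-in for non-overlapping units: the actual PB boundaries are not reproduced here. The pitch values are MIDI numbers chosen for illustration, not data from the study.

```python
def chunk_units(notes, size):
    """Split a note sequence into consecutive, non-overlapping units.

    Fixed-size chunks are an assumption used as a stand-in for PB units.
    """
    return {tuple(notes[i:i + size]) for i in range(0, len(notes), size)}


def overlapping_units(notes, size):
    """Extract every window of `size` consecutive notes (overlapping units)."""
    return {tuple(notes[i:i + size]) for i in range(len(notes) - size + 1)}


# Illustrative melody as MIDI pitches (assumed values).
document = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77, 79]
# A short query excerpted from the document, starting off a unit boundary.
query = document[2:7]

# Units shared by query and document under each segmentation.
shared_chunks = chunk_units(document, 4) & chunk_units(query, 4)
shared_windows = overlapping_units(document, 4) & overlapping_units(query, 4)

print(len(shared_chunks), len(shared_windows))
```

In this toy case the query is an exact excerpt of the document, yet the non-overlapping units share nothing with it because the query starts between two unit boundaries, while the overlapping windows still match.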
The performance of the different approaches as a
function of the number of errors in the query is
shown on the left of Figure 6, which reports the
average precision of each approach. Apart from FL,
the segmentation techniques showed a clear drop in
performance even when a single error was introduced.
In particular, PB and MO showed a similar negative
trend, almost linear in the number of errors. It is
interesting to note that DD, although its
performance is almost comparable to that of FL for
a correct query, degraded faster. The average
precision as a function of query length is shown on
the right of Figure 6. Similar considerations can
be made about the trends of the different
approaches. PB and MO behaved similarly, and FL
again performed best. When the queries are
moderately shortened, the average precision of FL
and DD remains almost constant; the drop in
performance appears earlier, and is more marked,
for DD than for FL.
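For reference, the standard definition of average precision for a single query can be sketched as follows. The function name and the document identifiers are illustrative assumptions, and the source does not state how its own figures were computed. When a query has exactly one relevant document, the value reduces to the reciprocal of that document's rank, and to zero if the document is not retrieved.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of the precision values at the rank of each relevant document.

    Relevant documents that are never retrieved contribute zero.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


# Hypothetical ranked lists: one relevant document at rank 2, then one not retrieved.
found_at_rank_two = average_precision(["d7", "d3", "d9"], ["d3"])
not_retrieved = average_precision(["d7", "d9"], ["d3"])
print(found_at_rank_two, not_retrieved)
```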
From the analyses, it appears that simple approaches
to segmentation, which carry redundant information
through overlapping units, perform better than
approaches based on music perception or music
theory. Moreover, fixed-length segmentation was more
robust than data-driven segmentation both to errors
in the queries and to short queries. These results
suggest that, for music indexing, an approach that
does not filter out any information improves recall
without degrading precision. The good performance of
FL may also be due to the fact that it has the
shortest average length of index terms (since the
terms are N-grams, the average coincides with the
length of every index term); hence local
perturbations due to errors in the query do not
affect a large number of index terms.
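The locality argument can be checked with a small sketch: among overlapping N-grams, a single wrong note can alter at most N index terms, so shorter terms leave a larger share of the query intact. The helper names, the values of N and the pitch values are illustrative assumptions, not the configuration used in the experiments.

```python
def ngrams(notes, n):
    """Return the list of overlapping N-grams of a note sequence."""
    return [tuple(notes[i:i + n]) for i in range(len(notes) - n + 1)]


# Illustrative query as MIDI pitches (assumed values).
query = [60, 62, 64, 65, 67, 69, 71, 72]
# The same query with a single substitution error on the fifth note.
wrong = list(query)
wrong[4] = 66


def changed_terms(n):
    """Count the N-grams that differ between the correct and the wrong query."""
    return sum(a != b for a, b in zip(ngrams(query, n), ngrams(wrong, n)))


for n in (3, 5):
    print(n, changed_terms(n), "of", len(ngrams(query, n)))
```

With these values, the error changes three of the six trigrams but all four 5-grams, which is consistent with the explanation given above for the robustness of short index terms.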
The fact that a simple approach to melodic
segmentation such as FL outperforms all the others,
which are based on content-specific
characteristics, is somewhat counterintuitive. For
this reason, a number of experiments were carried
out to find the best configuration of the
parameters for each approach. The results reported
in Table 4 and Figure 6 are the best ones achieved
by each approach. It should be noted that the
overall performance is biased by the particular
implementation of the different segmentation
algorithms, and this is particularly true for PB
and MO. The aim of the study was not to establish
which approach is best, but to compare the
experimental results of different implementations
using a common testbed.
In addition to the results of the segmentation
algorithms, Figure 6 also reports the average
precision of a fifth approach, named FUS, which in
this particular setting outperforms all the others
in terms of robustness to errors and to short
queries. This approach is discussed in the next
section.
Parallel Indexes
Up to this point, the discussion has assumed that
only one index is built on a document collection,
possibly using a combination of features. In
general, this is a reasonable approach, because the
creation of an index file is computationally costly
and may require a considerable amount of storage.
On the other hand, different indexes may capture
different characteristics of a document collection,
whose usefulness may depend on the user's
information need, on the way the query is created,
and on the way the user evaluates the retrieved
documents. The presence of a number of alternative
indexing schemes can be exploited by running a
number of parallel retrieval sessions