the name accuracy [Melnik et al. 2002] and is defined by the formula that follows:

    Overall = Recall · (2 - 1/Precision)    (9.1)
Recall and precision are metrics that are presented later and that intuitively evaluate the accuracy of the generated matches. The overall metric evaluates the amount of work an expert must provide to remove irrelevant matches (false positives) and to add relevant ones that were not discovered (false negatives) [Do et al. 2003]. The metric returns a value between -∞ and 1. The greater the overall value, the less effort the designer has to provide. It is generally believed [Do et al. 2003] that a precision below 50% implies that removing the false matches and adding the missing ones requires more effort from the designer than doing the matching manually, which is why such situations yield a negative overall value. A limitation of the overall metric is that it assumes equal effort for removing an irrelevant match and for adding a missing one, which is rarely the case in the real world.
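
To make the computation concrete, here is a minimal Python sketch of precision, recall, and the overall metric of Eq. (9.1); the representation of matches as pairs of schema element names is an assumption made purely for illustration.

    # Minimal sketch: precision, recall, and overall (Eq. 9.1).
    # Matches are assumed to be pairs of schema element names.
    def overall(proposed: set, expected: set) -> float:
        tp = len(proposed & expected)          # true positives
        if tp == 0:
            raise ValueError("overall is undefined when precision is 0")
        precision = tp / len(proposed)
        recall = tp / len(expected)
        return recall * (2 - 1 / precision)

    proposed = {("addr", "address"), ("name", "fullName"), ("id", "phone")}
    expected = {("addr", "address"), ("name", "fullName"), ("id", "custId")}
    print(overall(proposed, expected))         # 2/3 * (2 - 3/2) = 0.33...

If precision drops to 1/3, for instance, the factor 2 - 1/Precision becomes -1 and the overall value turns negative, matching the discussion above.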
Another metric to measure the human effort is the human-spared resources (HSR) [Duchateau 2009]. It counts the number of designer interactions required to correct both precision and recall, i.e., to manually obtain a 100% f-measure, a quality metric that is discussed later. In other words, HSR takes into account not only the effort to validate or invalidate the discovered matches but also the effort to discover those missing. HSR is sufficiently generic, can be expressed in the range [0, 1] or in time units (e.g., seconds), and does not require any input other than
the one for computing precision, recall, f-measure, or overall. The only limitation is
that it does not take into account the fact that some matching tools may return the
top-K matches instead of all of them.
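
The exact HSR formula appears in [Duchateau 2009]; as a loose illustration of the idea only, the sketch below counts one designer interaction per false positive to invalidate and per false negative to add, i.e., the raw work needed to reach a 100% f-measure, and converts it to time units via an assumed average cost per interaction.

    # Loose illustration of the idea behind HSR (not the published formula):
    # count the interactions needed to reach a 100% f-measure.
    def interactions_to_perfect(proposed: set, expected: set) -> int:
        false_positives = proposed - expected   # matches to invalidate
        false_negatives = expected - proposed   # matches still to add
        return len(false_positives) + len(false_negatives)

    proposed = {("addr", "address"), ("id", "phone")}
    expected = {("addr", "address"), ("id", "custId"), ("name", "fullName")}
    print(interactions_to_perfect(proposed, expected))  # 1 removal + 2 additions = 3

    # Expressed in time units, assuming 5 seconds per interaction.
    effort_seconds = interactions_to_perfect(proposed, expected) * 5

In practice, discovering a missing match costs more than invalidating a proposed one, since the designer must search through candidates; this is the kind of asymmetry that the overall metric ignores by assuming equal effort.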
In the schema mapping process, if the mapping specification is provided by the
designer and is not taken from the output of an automatic matching task, the situation
is different. The designer is required to provide input to the mapping tool through
its interface, not only at the beginning but also throughout the mapping genera-
tion process, since the designer will have to continuously verify the tool-generated
mappings and provide the respective modifications. Thus, the effort of the mapping
designer can be measured by the number of inputs the designer provides to the tool.
This evaluation criterion is essentially an evaluation of the graphical interface of
the tool. It is true that the more intelligence a tool incorporates in interpreting the mapping designer's input, the less input effort is required of the designer. However, an interface may be so well designed that, even if the mapping designer must perform many tasks, the human effort is kept to a minimum.
STBenchmark introduces a simple usability (SU) model, intended to provide a first-cut measure of the amount of effort required for a mapping scenario. It quantifies effort by roughly counting mouse clicks and keystrokes. This is important even if the time required for the mapping specification is much smaller than the time needed for the generated mappings to become transformation scripts and be executed. The click log information describing a mapping design for STBenchmark looks like this: a right mouse click to pull up a menu, a left mouse click to select a schema element, typing a function into a box, and so on. Since different actions require different amounts of effort, each type of action can be weighted accordingly.
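
As a rough sketch of how such a log can be turned into a single effort number, the fragment below tallies weighted actions; the action names and weights are illustrative assumptions, not values prescribed by STBenchmark.

    # Rough sketch of SU-style effort counting over a click log.
    # Action names and weights are illustrative assumptions.
    WEIGHTS = {
        "right_click": 1,    # pull up a menu
        "left_click": 1,     # select a schema element
        "type_function": 5,  # typing a function costs more than a click
    }

    def su_effort(click_log: list) -> int:
        return sum(WEIGHTS[action] for action in click_log)

    log = ["right_click", "left_click", "left_click", "type_function"]
    print(su_effort(log))    # 1 + 1 + 1 + 5 = 8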