• < indicted | bombing | CRIME >
• < indicted | 1991 | TIME >
The advantage of Basic Elements is that it features a deeper semantic analysis than simple
n-gram evaluation so that matches need not be superficial, but the disadvantage is that it relies
on parsing and pruning, which can be error-prone for noisy data such as speech and blogs. Like
ROUGE, Basic Elements is not a single evaluation metric. Rather, it consists of numerous modules
relating to three evaluation steps, breaking, matching, and scoring, which correspond to locating the
basic elements, matching similar basic elements, and scoring the summaries, respectively. Basic
Elements can be seen as a generalization of ROUGE, with ROUGE being the special case where
the basic elements are n-grams.
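As a rough illustration of the relationship between the two metrics, the sketch below scores a candidate against a reference with a single clipped-overlap routine, first over bigrams (the ROUGE-style special case) and then over hand-written (head | modifier | relation) triples standing in for parser-derived basic elements. The triples and relation labels are invented for illustration and do not come from the actual Basic Elements package.

```python
# Minimal sketch: score summaries by clipped overlap of "units", where the
# units are either n-grams (ROUGE-style) or basic-element-style triples.
# The triples below are hand-written placeholders, not parser output.
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list, e.g., n = 2 for a ROUGE-2-style unit."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def unit_recall(candidate_units, reference_units):
    """Clipped overlap of candidate units with reference units, divided by
    the number of reference units (a recall-oriented score)."""
    cand, ref = Counter(candidate_units), Counter(reference_units)
    overlap = sum(min(count, cand[u]) for u, count in ref.items())
    return overlap / max(len(reference_units), 1)

candidate = "two libyans were indicted for the lockerbie bombing".split()
reference = "two libyans indicted for 1991 bombing".split()

# With n-grams as the units, this reduces to a ROUGE-style score ...
print(unit_recall(ngrams(candidate, 2), ngrams(reference, 2)))

# ... while with (head | modifier | relation) triples as the units, matches
# like "indicted -- bombing" survive differences in surface word order.
candidate_bes = [("indicted", "libyans", "SUBJ"), ("indicted", "bombing", "CRIME")]
reference_bes = [("indicted", "bombing", "CRIME"), ("indicted", "1991", "TIME")]
print(unit_recall(candidate_bes, reference_bes))
```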
The Pyramid method [Nenkova and Passonneau, 2004] uses variable-length sub-sentential
units for comparing machine summaries to human model summaries. These Semantic Content Units
(SCUs) are derived by having human annotators analyze multiple human model summaries for units
of meaning. Each SCU is roughly equivalent to a concept, though SCU itself is not formally defined.
Each SCU can have many different surface realizations. For example, the following two sentences
relate to the same SCU:
• They decided to use bluetooth.
• The final design included bluetooth.
The label for this SCU might be "The remote control used blue-tooth". Each SCU is associated
with a weight relating to how many model summaries it occurs in. For instance, Figure 2.5 shows
an example in which we have five model summaries and each model summary contains a subset
of six SCUs. In this example, the weight for SCU1 will be 4 (because it appears in four model
summaries), the weight for SCU2 will also be 4, while the weight for SCU3 will be only 3, etc. These
varying weights lend the model the pyramid structure, with a small number of SCUs occurring
in many model summaries and most SCUs appearing in only a few model summaries. Machine
summaries are then annotated for SCUs as well and can be scored based on the sum of SCU weights
compared with the sum of SCU weights for an optimal summary. Figure 2.6 shows a Pyramid for our
example in Figure 2.5, containing two SCUs of weight 4 (SCU1, SCU2) and four SCUs of weight 3
(SCU3, ..., SCU6); two possible optimal summaries containing four SCUs are indicated. These
summaries are optimal because they each contain all of the SCUs of weight 4, the highest weight
level, and the remaining SCUs from weight 3, the next highest level of the Pyramid.
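The bookkeeping behind these weights is simple to reproduce; the sketch below derives SCU weights from a hypothetical annotation of five model summaries (chosen to match the weights quoted above, not the actual content of Figure 2.5) and computes the score of an optimally informative summary of a given size.

```python
# Minimal sketch of Pyramid weight computation. The SCU identifiers and the
# model-summary annotations are hypothetical, chosen so that SCU1 and SCU2
# receive weight 4 and SCU3-SCU6 receive weight 3, as in the running example.
from collections import Counter

model_summaries = [
    {"SCU1", "SCU2", "SCU3", "SCU4"},
    {"SCU1", "SCU2", "SCU4", "SCU5"},
    {"SCU1", "SCU2", "SCU5", "SCU6"},
    {"SCU1", "SCU2", "SCU3", "SCU6"},
    {"SCU3", "SCU4", "SCU5", "SCU6"},
]

# The weight of an SCU is the number of model summaries it occurs in.
weights = Counter(scu for summary in model_summaries for scu in summary)

def optimal_score(num_scus):
    """Sum of the num_scus largest weights: the score of an optimally
    informative summary expressing that many SCUs."""
    return sum(sorted(weights.values(), reverse=True)[:num_scus])

print(weights)           # SCU1 and SCU2 have weight 4, SCU3-SCU6 have weight 3
print(optimal_score(4))  # 4 + 4 + 3 + 3 = 14
```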
Using the SCU annotation, one can calculate both precision-based and recall-based summary
scores for a given machine summary. For instance, a machine summary containing the four SCUs
(SCU1, SCU3, SCU4, SCU6) would have a precision of 13/14, i.e., the sum of the weights of the
SCUs contained in the summary (4 + 3 + 3 + 3), divided by the sum of the weights of an optimal
summary containing the same number of SCUs (4 + 4 + 3 + 3). In contrast, in the recall-based
Pyramid score, instead of comparing the machine summary with the ideal summary containing the
same number of SCUs as the machine summary, the comparison is with an ideal summary containing
the average number of SCUs found in the model summaries.
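Both variants reduce to a ratio of summed SCU weights; the sketch below, using the hypothetical weights from the previous sketch, reproduces the 13/14 precision example and the recall-oriented normalization just described.

```python
# Minimal sketch of the precision- and recall-based Pyramid scores, using the
# hypothetical SCU weights from the running example.
def pyramid_precision(machine_scus, weights):
    """Summed weight of the machine summary's SCUs, divided by the best
    achievable sum for a summary with the same number of SCUs."""
    achieved = sum(weights[scu] for scu in machine_scus)
    ideal = sum(sorted(weights.values(), reverse=True)[:len(machine_scus)])
    return achieved / ideal

def pyramid_recall(machine_scus, weights, avg_model_scus):
    """Recall-oriented variant: normalize by the best achievable sum for a
    summary with the average number of SCUs in the model summaries."""
    achieved = sum(weights[scu] for scu in machine_scus)
    ideal = sum(sorted(weights.values(), reverse=True)[:avg_model_scus])
    return achieved / ideal

weights = {"SCU1": 4, "SCU2": 4, "SCU3": 3, "SCU4": 3, "SCU5": 3, "SCU6": 3}
machine = ["SCU1", "SCU3", "SCU4", "SCU6"]

print(pyramid_precision(machine, weights))  # (4+3+3+3) / (4+4+3+3) = 13/14
print(pyramid_recall(machine, weights, 4))  # also 13/14: the models average 4 SCUs each
```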