Fig. 1.6. Margin points, applied on the units 1-3
Table 1.3. Results of the table algorithm evaluation

Measure                               Value
number of documents                   86
number of tables                      92
number of correctly found tables      51
number of partially found tables      34
number of not found tables            8
number of found tables                236
Recall                                91.73 %
Precision                             43.74 %

To allow a comparison of the results and to further increase recall, it was also
decided to store all terms in a separate index and to show them in addition to the
results from the table search, each together with its nearest table caption.
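
The separate term index described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation that was evaluated: the names `nearest_caption` and `build_term_index` are hypothetical, and "nearest" is assumed here to mean the smallest distance in character offsets within the document, which the text does not specify.

```python
from bisect import bisect_left
from collections import defaultdict


def nearest_caption(position, captions):
    """Return the caption text closest to `position`, or None if there is none.

    `captions` is a list of (offset, text) pairs sorted by offset.
    Assumption: closeness is measured in character offsets.
    """
    if not captions:
        return None
    offsets = [offset for offset, _ in captions]
    i = bisect_left(offsets, position)
    # Only the captions directly before and after the term can be nearest.
    candidates = captions[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda c: abs(c[0] - position))[1]


def build_term_index(terms, captions):
    """Map each lower-cased term to the set of captions nearest to its mentions.

    `terms` is an iterable of (term, offset) pairs.
    """
    index = defaultdict(set)
    for term, position in terms:
        caption = nearest_caption(position, captions)
        if caption is not None:
            index[term.lower()].add(caption)
    return index
```

A lookup in such an index returns captions even for terms that the table search itself missed, which is how a structure of this kind could raise recall.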
1.6.2 Case Study for Table Extraction
With the dawn of high-throughput analysis, biological papers have started to
contain far more information than the actual focus of the paper would suggest.
A simple experiment on diabetic rats, for example, will concentrate on very few
proteins and genes. Dozens of other proteins and genes are involved in the
process as well, but this information usually just ends up in a table in the
appendix. A scientist working on one of these other proteins or genes may never
learn about its effects in diabetic rats, because this information never shows
up in the abstract or via standard search methods. With our approach, we can find
the tables in which the proteins or genes are mentioned and show the user the
table captions, so that they can easily assess the context.
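
The lookup in this case study can be illustrated with a minimal sketch. It assumes that the tables have already been extracted into rows of cell strings; the `ExtractedTable` type and the `captions_mentioning` function are hypothetical names introduced only for this example, and matching whole cells case-insensitively is one possible design choice rather than the method used in the study.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractedTable:
    """Hypothetical container for one extracted table."""
    caption: str
    rows: list = field(default_factory=list)  # list of rows, each a list of cell strings


def captions_mentioning(name, tables):
    """Return the captions of all tables that have a cell equal to `name`.

    Whole cells are compared case-insensitively, so a short gene symbol
    does not match inside a longer one.
    """
    needle = name.strip().casefold()
    return [
        table.caption
        for table in tables
        if any(cell.strip().casefold() == needle
               for row in table.rows
               for cell in row)
    ]
```

Comparing whole cells instead of substrings keeps a short symbol from matching every longer symbol that contains it, at the cost of missing cells that list several names at once.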