Collaborative Cross-media Annotation of Documents - Human-Computer Interaction

classification of tokens was made by a domain expert (a fifth-year computer science student).

We examined three approaches. All of them use semantic knowledge to adapt the dictionary used by the handwriting recognition engine. They are based on the assumption that annotations made on lecture slides frequently contain words that are also written on these slides. In particular, this covers domain-specific words, which are typically not included in the standard dictionary.

Approach 1: Dictionary from all slides. The first experiment adds the tokens from all slides of the given lecture to the dictionary. Hence, the dictionary is the same for all annotations in our corpus. It contains 2283 words taken from the 42 slides of the lecture. We will refer to this dictionary as Dictionary A.

Approach 2: Dictionary from the current slide. The second experiment uses different dictionaries for annotations made on different slides. The dictionary for a specific annotation contains all tokens extracted from the slide on which the annotation is located. We will refer to these dictionaries as Dictionaries B.

Approach 3: Sliding-window dictionary. The third experiment relies on a sliding-window approach. Again, annotations made on different slides have different dictionaries. The dictionary for a specific annotation contains all tokens extracted from the slide on which the annotation is located and all tokens from the preceding five and the following five slides. If the slide is among the first or last five slides, the window is truncated to the preceding or subsequent slides that actually exist. We experimented with different numbers of preceding and subsequent slides and found that, in our case, a window of five slides provides the best results. Naturally, this number depends on the specific slide set. We will refer to these dictionaries as Dictionaries C.
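The three dictionary constructions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the experiments: the slides are assumed to be available as a list of plain-text strings, and the helper names (`tokenize`, `dictionary_a`, `dictionary_b`, `dictionary_c`) as well as the lowercase, letters-only tokenization are hypothetical choices made for this example.

```python
import re


def tokenize(text):
    """Return the set of word tokens on a slide.

    Assumption: tokens are lowercase runs of letters, optionally with
    inner hyphens. The original tokenization is not specified.
    """
    return set(re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower()))


def dictionary_a(slides):
    """Approach 1: one dictionary holding the tokens of all slides."""
    vocabulary = set()
    for text in slides:
        vocabulary |= tokenize(text)
    return vocabulary


def dictionary_b(slides, index):
    """Approach 2: tokens of the slide the annotation is located on."""
    return tokenize(slides[index])


def dictionary_c(slides, index, window=5):
    """Approach 3: current slide plus up to `window` slides on each side.

    Near the start or end of the slide set, the window is truncated to
    the slides that actually exist.
    """
    start = max(0, index - window)
    end = min(len(slides), index + window + 1)
    return dictionary_a(slides[start:end])
```

For an annotation on the slide with zero-based index `i`, `dictionary_b(slides, i)` is always a subset of `dictionary_c(slides, i)`, which in turn is a subset of `dictionary_a(slides)`. The three approaches therefore trade coverage against the number of competing dictionary entries.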

Improved Performance

Table 5.3 gives the recognition results for domain-specific words for all three types of dictionaries and contrasts them with the baseline. Using any of the three dictionary types reduces the word error rate, by between 3.5 and 8.8 percentage points. Dictionaries B (all tokens from the current slide) clearly outperform the other dictionaries. Compared with the baseline without a domain-specific dictionary, they yield a relative word error rate reduction of almost 20 % (from 45.3 % to 36.5 %). It is not surprising that the character error rate changed far less in absolute terms (by at most two percentage points), since the dictionary is not used for recognition on the level of individual characters.

Table 5.3 Performance of the handwriting recognition for domain-specific terms

                                          Word error rate (%)   Character error rate (%)
Baseline: No domain-specific dictionary   45.3                  18.2
Dictionary A (all slides)                 41.2                  16.4
Dictionaries B (current slide)            36.5                  16.2
Dictionaries C (sliding window)           41.8                  17.4
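
The relative word error rate reductions implied by Table 5.3 can be checked with a short calculation. The sketch below only restates the arithmetic on the reported figures; the function name `relative_reduction` is a hypothetical helper introduced for this example.

```python
def relative_reduction(baseline, value):
    """Relative reduction of an error rate with respect to the baseline, in percent."""
    return (baseline - value) / baseline * 100.0


# Word error rates (%) as reported in Table 5.3.
BASELINE_WER = 45.3
WER = {"A": 41.2, "B": 36.5, "C": 41.8}

for name, value in WER.items():
    reduction = relative_reduction(BASELINE_WER, value)
    print(f"Dictionary {name}: {reduction:.1f} % relative reduction")
```

For Dictionaries B this gives (45.3 - 36.5) / 45.3, or about 19.4 %, which matches the "almost 20 %" stated in the text.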
