is assessed by a ratio of the number of words in the explanation to the number in
the target sentence, taking into consideration the length criterion. For example, if
the length of the sentence is 10 words and the length priority is 1, then the required
length of the self-explanation would be 10 words. If the length of the sentence is 30
words and the length priority is 0.5, then the self-explanation would require a min-
imum of 15 words. Relevance is assessed from the number of matches to important
words in the sentence and words in the association lists. Similarity is assessed in
terms of a ratio of the sentence and explanation lengths and the number of matching
important words. If the explanation is close in length to the sentence, with a high
percentage of word overlap, the explanation would be deemed too similar to the tar-
get sentence. If the explanation failed any of these three criteria (Length, Relevance,
and Similarity), the trainee would be given feedback corresponding to the problem
and encouraged to revise the self-explanation.
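The screening stage described above can be sketched in code. The threshold values, the relevance minimum, and the similarity cutoffs below are illustrative assumptions; the text specifies only the length calculation (sentence length times length priority), not iSTART's actual parameters.

```python
# Sketch of the three screening checks (Length, Relevance, Similarity).
# Only the length formula follows the text directly; the other
# thresholds are assumed for illustration.

def required_length(sentence_words: int, length_priority: float) -> int:
    """Minimum explanation length: sentence length scaled by the priority."""
    return round(sentence_words * length_priority)

def screen_explanation(explanation, sentence, important_words, associations,
                       length_priority=0.5, overlap_limit=0.8):
    exp_words = explanation.lower().split()
    sent_words = sentence.lower().split()

    # Length: the explanation must meet the scaled minimum length.
    length_ok = len(exp_words) >= required_length(len(sent_words),
                                                  length_priority)

    # Relevance: matches to important words and their association lists.
    relevant = ({w.lower() for w in important_words}
                | {w.lower() for w in associations})
    matches = sum(1 for w in exp_words if w in relevant)
    relevance_ok = matches >= 1  # assumed minimal threshold

    # Similarity: near-equal length plus high word overlap is "too similar".
    overlap = len(set(exp_words) & set(sent_words)) / max(len(set(sent_words)), 1)
    too_similar = (abs(len(exp_words) - len(sent_words)) <= 2
                   and overlap >= overlap_limit)

    return {"length": length_ok,
            "relevance": relevance_ok,
            "similarity": not too_similar}
```

For instance, a 30-word sentence with a length priority of 0.5 yields `required_length(30, 0.5) == 15`, matching the example in the text; an explanation failing any check would trigger the corresponding feedback.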
Once the explanation passes the above criteria, then it is evaluated in terms of
its overall quality. The three levels of quality that guide feedback to the trainee are
based on two factors: 1) the number of words in the explanation that match either
the important words or association-list words of the target sentence compared to
the number of important words in the sentence and 2) the length of the explanation
in comparison with the length of the target sentence. This algorithm will be referred to as WB-ASSO, which stands for word-based with association list.
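One way to picture the two-factor quality evaluation is to combine a coverage ratio (matched words over important words) with a length ratio (explanation length over sentence length) and bin the result into three levels. Both the combination rule and the cut points below are assumptions for illustration; the text does not give iSTART's actual scoring function.

```python
# Illustrative mapping from the two stated factors to three quality
# levels. The additive combination and the thresholds are assumed.

def quality_level(matching_words, important_in_sentence, exp_len, sent_len):
    """Return an assumed quality level 1 (lowest) to 3 (highest)."""
    coverage = matching_words / max(important_in_sentence, 1)
    length_ratio = exp_len / max(sent_len, 1)
    score = coverage + length_ratio  # assumed combination rule
    if score >= 2.0:
        return 3
    if score >= 1.0:
        return 2
    return 1
```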
This first version of iSTART (word-based system) required a great deal of human
effort per text, because of the need to identify important words and, especially, to
create an association list for each important word. However, because we envisioned
a scaled-up system rapidly adaptable to many texts, we needed a system that re-
quired relatively little manual effort per text. Therefore, WB-ASSO was replaced.
Instead of lists of important and associated words we simply used content words
(nouns, verbs, adjectives, adverbs) taken literally from the sentence and the entire
text. This algorithm is referred to as WB-TT , which stands for word-based with to-
tal text . The content words were identified using algorithms from Coh-Metrix, an
automated tool that yields various measures of cohesion, readability, other charac-
teristics of language [9, 20]. The iSTART system then compares the words in the
self-explanation to the content words from the current sentence, prior sentences,
and subsequent sentences in the target text, and does a word-based match (both lit-
eral and soundex) to determine the number of content words in the self-explanation
from each source in the text. While WB-ASSO is based on a richer corpus of words
than WB-TT, the replacement was successful because the latter was intended for
use together with LSA, which incorporates the richness of a corpus of hundreds of documents. In contrast, WB-ASSO was used on its own.
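The WB-TT matching step can be sketched as follows: explanation words are compared against content words from each source in the text (current, prior, and subsequent sentences), counting both literal and Soundex matches. The Soundex routine below follows the standard four-character coding; the toy word lists stand in for Coh-Metrix's part-of-speech-based content-word extraction, which is not reproduced here.

```python
# Sketch of WB-TT's word-based match (literal and Soundex) against
# content words from each source in the text.

def soundex(word: str) -> str:
    """Standard four-character Soundex code (first letter + 3 digits)."""
    codes = {**dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"), "l": "4",
             **dict.fromkeys("mn", "5"), "r": "6"}
    word = word.lower()
    out = word[0].upper()
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "hw":  # h and w do not reset the previous code
            prev = code
    return (out + "000")[:4]

def match_counts(explanation_words, sources):
    """Count content-word matches (literal or Soundex) per text source.

    sources maps a source name (e.g. "current", "prior", "subsequent")
    to its list of content words.
    """
    counts = {}
    for name, content_words in sources.items():
        literal = set(content_words)
        sounds = {soundex(w) for w in content_words}
        counts[name] = sum(1 for w in explanation_words
                           if w in literal or soundex(w) in sounds)
    return counts
```

The Soundex pass lets near-spellings count as matches (e.g. "powur" matches "power"), which is useful when trainees misspell content words from the text.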
Some hand-coding remained in WB-TT because the length criterion for an expla-
nation was calculated based on the average length of explanations of that sentence
collected from a separate pool of participants and on the importance of the sentence
according to a manual analysis of the text. Besides being relatively subjective, this
process was time consuming because it required an expert in discourse analysis as
well as the collection of self-explanation protocols. Consequently, the hand-coded
length criterion was replaced with one that could be determined automatically from
the number of words and content words in the target sentence (we called this word-
based with total text and automated criteria, or WB2-TT). The change from WB-TT
to WB2-TT affected only the screening process of the length and similarity criteria.
Its lower-bound and upper-bound lengths are entirely based on the target sentence's