evaluation of the explanation as irrelevant or too short; 1, minimally acceptable; 2,
better but including primarily the local textual context; and 3, oriented to a more
global comprehension. Depending on the text, population, and LSA space used, our
results have ranged from 55 to 70 percent agreement with expert evaluations using
that scale. We are currently attempting to improve the effectiveness of our algorithms by incorporating Topic Models (TM), either in place of or in conjunction with LSA, and by using more than one LSA space, each derived from a different genre (science, narrative, and the general TASA corpus). We present some of the results of these efforts in this chapter.
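The chapter does not give a formula for combining evidence from several semantic spaces; the Python sketch below shows one plausible scheme, a weighted average of per-space similarity scores. The Space class, its sim method, and the uniform default weights are hypothetical stand-ins, not the authors' actual method.

```python
from dataclasses import dataclass

@dataclass
class Space:
    """Stand-in for one semantic space (an LSA space or a Topic Model)."""
    name: str

    def sim(self, explanation: str, benchmark: str) -> float:
        # Placeholder: a real space would project both texts and return
        # their similarity; a constant keeps the sketch runnable.
        return 0.5

def combined_score(explanation, benchmark, spaces, weights=None):
    """Hypothetical weighted combination of per-space similarities.
    Uniform weighting is an assumption, not the chapter's formula."""
    weights = weights or [1.0 / len(spaces)] * len(spaces)
    return sum(w * s.sim(explanation, benchmark)
               for w, s in zip(weights, spaces))

spaces = [Space("science"), Space("narrative"), Space("general TASA")]
print(combined_score("student explanation", "benchmark words", spaces))  # ~0.5
```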
Our algorithms are constrained by two major requirements: fast response times and the rapid introduction of new texts. Because the trainer operates in real time, the server that calculates the evaluation must respond within 4 to 5 seconds. Furthermore, the algorithms must not require any significant preparation of new texts, a requirement precisely contrary to our plans when the project began. To accommodate the teachers whose classes use iSTART, the trainer must be able to incorporate, within a day or two, any text a teacher wishes to use for student practice. This time limit precludes us from significantly marking up the text or from gathering related texts to incorporate into an LSA corpus.
In addition to the overall 4-point quality score, we are attempting to expand our evaluation to include an assessment of the presence of various reading strategies in the student's explanation so that we can generate more specific feedback. If the system were able to detect whether an explanation uses paraphrasing, bridging, or elaboration, we could provide more detailed feedback to the students, as well as an individualized curriculum based on a more complete model of the student. For example, if the system assessed that a student only paraphrased sentences while self-explaining, and never used strategies such as making bridging inferences or knowledge-based elaborations, then the student could be given additional training in generating more inference-based explanations.
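As a rough illustration of what such strategy detection might look like, the sketch below compares an explanation's similarity to the current sentence, to prior sentences, and to material beyond the text, and maps the pattern of scores to strategy labels. The function name, its three inputs, and the 0.5 threshold are illustrative assumptions, not the iSTART rules.

```python
def classify_strategies(sim_current, sim_prior, sim_outside, threshold=0.5):
    """Hypothetical rule-based mapping from similarity scores to labels.

    sim_current: similarity to the current-sentence benchmark (paraphrase)
    sim_prior:   similarity to prior-sentence benchmarks (bridging)
    sim_outside: similarity to material beyond the text (elaboration)
    """
    labels = []
    if sim_current >= threshold:
        labels.append("paraphrase")
    if sim_prior >= threshold:
        labels.append("bridging")
    if sim_outside >= threshold:
        labels.append("elaboration")
    return labels or ["irrelevant or too short"]

print(classify_strategies(0.72, 0.61, 0.12))  # ['paraphrase', 'bridging']
```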
This chapter describes how we employ word matching, LSA, and TM in the iSTART feedback systems and reports how well these techniques perform in producing both overall quality and reading strategy scores.
6.2 iSTART: Feedback Systems
iSTART was intended from the outset to employ LSA to determine appropriate
feedback. The initial goal was to develop one or more benchmarks for each of the
SERT strategies relative to each of the sentences in the practice texts and to use
LSA to measure the similarity of a trainee's explanation to each of the benchmarks.
A benchmark is simply a collection of words, in this case, words chosen to represent
each of the strategies (e.g., words that represent the current sentence, words that
represent a bridge to a prior sentence). However, while work toward this goal was progressing, we also developed a preliminary “word-based” (WB) system to supply feedback in the first version of iSTART [19] so that we could offer a complete curriculum for use in experimental situations. The second version of iSTART has
integrated both LSA and WB in the evaluation process; however, the system still
provides only overall quality feedback. Our current investigations aim to provide
feedback based on identifying specific reading strategies.
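As a concrete illustration of the benchmark approach, the sketch below projects an explanation and two benchmarks into a semantic space by summing word vectors and compares them by cosine similarity, a standard way of applying LSA. The 300-dimensional space, its tiny vocabulary, and the random vectors are placeholders for a real LSA space.

```python
import numpy as np

DIM = 300  # dimensionality of the (hypothetical) LSA space

def text_vector(words, space):
    """Sum the vectors of known words: one common way to project a
    word collection (explanation or benchmark) into the space."""
    vecs = [space[w] for w in words if w in space]
    return np.sum(vecs, axis=0) if vecs else np.zeros(DIM)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v) / denom if denom else 0.0

# Random vectors stand in for a real LSA space (word -> vector).
rng = np.random.default_rng(0)
space = {w: rng.standard_normal(DIM)
         for w in ["cell", "membrane", "protein", "transport", "energy"]}

# Benchmarks: word collections chosen to represent each strategy.
benchmarks = {
    "current sentence": ["cell", "membrane"],
    "prior sentence":   ["protein", "transport"],
}

explanation = ["membrane", "protein", "energy"]
e_vec = text_vector(explanation, space)
for name, words in benchmarks.items():
    print(name, round(cosine(e_vec, text_vector(words, space)), 3))
```

In an actual deployment, the word vectors would come from the singular value decomposition of a term-document matrix built over a training corpus such as TASA, rather than from a random generator.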