out and a k-nearest neighbor technique is used to find those essays closest to a sample of human-graded essays; finally, eleven text complexity features are used to assess the style of the essays. Larkey conducted a number of regression trials using different combinations of components. She also used a number of essay sets, including essays on social studies, where content was the primary interest, and essays on general opinion, where style was the main criterion for assessment.
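The k-nearest-neighbor step can be illustrated with a minimal sketch, assuming TF-IDF document vectors and the scikit-learn library; the toy essays, the grades, and the choice of k are invented for illustration and are not Larkey's actual configuration.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neighbors import KNeighborsRegressor

    # Small sample of human-graded essays (texts and grades are illustrative).
    train_essays = [
        "The revolution changed the structure of society ...",
        "Taxes and representation were central to the dispute ...",
        "The harvest festival is celebrated in autumn ...",
    ]
    train_grades = [5.0, 4.0, 2.0]

    # Represent each essay as a TF-IDF vector.
    vec = TfidfVectorizer()
    X_train = vec.fit_transform(train_essays)

    # Grade a new essay by averaging the grades of its k closest
    # human-graded essays under cosine distance.
    knn = KNeighborsRegressor(n_neighbors=2, metric="cosine")
    knn.fit(X_train, train_grades)
    print(knn.predict(vec.transform(["Society was restructured by the revolution ..."])))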
A growing number of statistical learning methods have been applied to the problem of automated text categorization in the last few years, including regression models, nearest-neighbor classifiers, Bayesian belief networks, decision trees, rule learning algorithms, neural networks, and inductive learning systems (Ying, 1997). This growing number of available methods raises the need for cross-method evaluation.
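To make the comparison concrete, here is a minimal sketch of one of the listed methods, a naive Bayes text categorizer, using scikit-learn; the documents and category labels are invented examples.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Tiny labeled corpus (categories and texts are illustrative).
    docs = [
        "the court upheld the ruling on the new law",
        "the team scored in the final minute of the match",
        "parliament passed the law after a long debate",
        "the striker won the match with a late goal",
    ]
    labels = ["law", "sports", "law", "sports"]

    # Train on word counts, then categorize an unseen document.
    vec = CountVectorizer()
    clf = MultinomialNB()
    clf.fit(vec.fit_transform(docs), labels)
    print(clf.predict(vec.transform(["the judge cited the law"])))  # ['law']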
But the most relevant problem in the field of automated essay grading is the difficulty of obtaining a large corpus of essays (Christie, 2003; Larkey, 2003), each with a grade on which experts agree. Such a collection, along with the definition of common performance evaluation criteria, could be used as a test bed for a standardized comparison of different automated grading systems. Moreover, such text sources would make it possible to apply to automated essay grading the machine learning algorithms well known in the NLP research field, which consist of two steps: a training phase, in which the grading rules are acquired using various algorithms, and a testing phase, in which the rules gathered in the first step are used to determine the most probable grade for a particular essay. The weakness of these methods is the lack of a widely available collection of documents, because their performance is strongly affected by the size of the collection. A larger set of documents enables the acquisition of a larger set of rules during the training phase, and thus higher accuracy in grading. A major part of these techniques, training the systems and, at a later stage, enabling them to learn from new essays or from experience, is essentially machine learning.
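The two-phase scheme can be sketched as follows, assuming a linear regression model over TF-IDF features as the form the acquired grading rules take; the split ratio, the model, and the data are illustrative assumptions, not a specific published system.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split

    essays = ["first essay text ...", "second essay text ...",
              "third essay text ...", "fourth essay text ..."]
    grades = [3.0, 5.0, 2.0, 4.0]

    # Training phase: acquire grading rules (here, regression weights).
    X_train, X_test, y_train, y_test = train_test_split(
        essays, grades, test_size=0.25, random_state=0)
    vec = TfidfVectorizer()
    model = Ridge()
    model.fit(vec.fit_transform(X_train), y_train)

    # Testing phase: use the learned rules to predict the most
    # probable grade for held-out essays.
    print(model.predict(vec.transform(X_test)))

With only a handful of essays the learned weights are meaningless; the point of the sketch is the structure of the two phases, and, as noted above, the accuracy of the testing phase grows with the size of the training collection.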
The feature set used with some modern AEG systems includes measures of grammar, usage, mechanics, style, organization, development, lexical complexity, and prompt-specific vocabulary usage. This feature set is based in part on the NLP foundation that provides instructional feedback to students who are writing essays. In some cases a web-based service evaluates a student's writing skill and provides instantaneous score reporting and diagnostic feedback. Score reporting is handled by the score engine, or score reporter (see Figure 2). The diagnostic feedback is based on a suite of programs (writing analysis tools) that identify the essay's discourse structure, recognize undesirable stylistic features, and evaluate and provide feedback on errors in grammar, usage, and mechanics. The writing analysis tools identify five main types of grammar, usage, and mechanics errors: agreement errors, verb formation errors, wrong word use, missing punctuation, and typographical errors. The approach to detecting violations of general English grammar is corpus-based and statistical: the system is trained on a large corpus of edited text.
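One common way to realize this corpus-based idea, sketched below under stated assumptions, is to count word bigrams in the corpus of edited text and flag bigrams in a student sentence that were rarely or never observed; the toy corpus and the threshold are invented, and production systems work with far larger corpora and part-of-speech features rather than raw words.

    from collections import Counter

    # A toy word list stands in for the 'large corpus of edited text'.
    edited_corpus = "he walks to school every day . she walks home .".split()
    bigram_counts = Counter(zip(edited_corpus, edited_corpus[1:]))

    def flag_unusual_bigrams(sentence, min_count=1):
        # Flag word pairs seen fewer than min_count times in edited text.
        tokens = sentence.split()
        return [bg for bg in zip(tokens, tokens[1:])
                if bigram_counts[bg] < min_count]

    # Flags ('he', 'walk') and ('walk', 'to'); the first is an agreement error.
    print(flag_unusual_bigrams("he walk to school"))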
4. Problems with the Present Systems in the Context of English Used by Students Having a Different Mother Tongue
It has been found that most of the popular AEG systems are built to grade English essays, and they are easy to follow. Systems developed for non-English languages are not popular and not understandable for everyone. Our research shows that while grading an English essay, a system treats the influence of local languages as an error. Hence the following two sentences (used