such as these, and the signatures they generate, may come to form a web of soft
constraints that could help us improve the confidence we have that a retrieved text
or textual unit matches a target set of constraints.
If future research continues to yield efficacious signatures, then an array of indices
can be imagined. Once such an array is available, a discriminant analysis between corpora
such as the SP, ST, and DT outlined in this study could be conducted. Such testing would
lend substantial support to a textual signatures approach to text identification.
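
As a rough illustration of what such a discriminant analysis might look like, the
sketch below fits a linear discriminant with a pooled covariance matrix and equal
priors to three groups of texts. Everything in it is an assumption made for
illustration: each text is reduced to two hypothetical signature indices, the numeric
values are invented rather than taken from this study, and the function names
(fit, classify, and the helpers) are not part of any existing tool. Only the corpus
labels SP, ST, and DT come from the study.

```python
# Hypothetical data: each text is a pair of signature indices (index_a, index_b).
# The numbers are invented for illustration and are NOT results from the study.
corpora = {
    "SP": [(0.80, 0.20), (0.75, 0.25), (0.85, 0.22), (0.78, 0.18)],
    "ST": [(0.50, 0.50), (0.55, 0.45), (0.48, 0.52), (0.52, 0.55)],
    "DT": [(0.20, 0.80), (0.25, 0.78), (0.18, 0.85), (0.22, 0.75)],
}


def mean_vector(rows):
    """Return the per-feature mean of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_covariance(groups, means):
    """Pooled within-class covariance matrix (2 x 2) across all groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for label, rows in groups.items():
        m = means[label]
        for r in rows:
            d = (r[0] - m[0], r[1] - m[1])
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        n_total += len(rows)
    dof = n_total - len(groups)  # degrees of freedom: N minus number of classes
    return [[s[i][j] / dof for j in range(2)] for i in range(2)]


def invert_2x2(m):
    """Invert a 2 x 2 matrix; raises ZeroDivisionError if it is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def fit(groups):
    """Estimate class means and the inverse pooled covariance matrix."""
    means = {label: mean_vector(rows) for label, rows in groups.items()}
    cov_inv = invert_2x2(pooled_covariance(groups, means))
    return means, cov_inv


def classify(x, means, cov_inv):
    """Assign x to the class whose mean is nearest in Mahalanobis distance.

    With equal priors and a shared covariance matrix, this is the decision
    rule of linear discriminant analysis.
    """
    def distance(m):
        d = (x[0] - m[0], x[1] - m[1])
        return sum(d[i] * cov_inv[i][j] * d[j]
                   for i in range(2) for j in range(2))
    return min(means, key=lambda label: distance(means[label]))


means, cov_inv = fit(corpora)
print(classify((0.79, 0.21), means, cov_inv))
```

A real analysis would use many more indices and texts, and would report
cross-validated classification accuracy rather than labels for a few hand-picked
points. The two-feature restriction here only keeps the matrix inversion explicit.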
Looking even further ahead, we would also like to extend our signatures research
beyond the type of texts presented in this study. For example, we need to consider
the signatures generated from articles with multiple experiments as well as articles,
essays, and reports from other fields. It is reasonable to expect that any identifiable
genre is composed of elements, and that those elements, when exposed to methods such
as those used in this study, will produce identifiable and therefore distinguishable
signatures.
While a great deal of work remains to be done, we believe that LSA-based
textual signatures contribute to the field by offering a useful and novel approach
for computational research into text mining.
7.13 Acknowledgments
This research was supported by the Institute for Education Sciences (IES
R305G020018-02). Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the
views of the IES. We would also like to thank David Dufty and Mike Rowe for their
contributions to this study.