the “meaning” of a movie or television show for a collaborative filtering system
(see Resnick & Varian 1997) is the set of ratings members of a user community
have assigned to it. Users of such a system can be said to form a group to the
extent that they have given similar ratings to the same items (cf. Lashkari 1995).
For the most part these newer technologies (from sociology and from AI
collaborative filtering research) for understanding stories as locations in and/or
producers of social networks pay scant attention to the form and content of
the stories: from this perspective, stories are mostly “black boxes.”
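As a rough illustration of the collaborative filtering view described above, the
Python sketch below treats an item's “meaning” as the set of ratings it has
received, and measures how far two users form a group by correlating their
ratings on the items they share. The user names, item names, ratings, and the
choice of Pearson correlation are all assumptions made for illustration; the
systems cited above may well use other measures.

```python
from math import sqrt

# Hypothetical data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"film_a": 5, "film_b": 1, "film_c": 4},
    "bob": {"film_a": 4, "film_b": 2, "film_c": 5},
    "cat": {"film_a": 1, "film_b": 5, "film_c": 2},
}


def item_profile(item, ratings):
    """Return the item's 'meaning': the ratings users have assigned to it."""
    return {user: rated[item] for user, rated in ratings.items() if item in rated}


def user_similarity(u, v, ratings):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    xs = [ratings[u][i] for i in common]
    ys = [ratings[v][i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


if __name__ == "__main__":
    print(item_profile("film_a", ratings))
    print(user_similarity("ann", "bob", ratings))  # positive: similar tastes
    print(user_similarity("ann", "cat", ratings))  # negative: opposed tastes
```

Note that nothing in the sketch inspects the films themselves: the form and
content of each story stay inside the “black box,” and only the pattern of
ratings is used.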
While the sociologists and AI collaborative filtering researchers “black
box” the form and content of stories, the corpus-based, computational
linguistics and information retrieval researchers “black box” the social context of
the stories they index (cf. Manning & Schutze 2000). Corpus-based
computational linguistics is most often performed on large corpora described as, for
instance, “10 million words from several volumes of the Wall Street Journal,”
or “1 million words from a wide variety of text genres.” How the authors of
the texts included in the corpora interact with one another or are related to
one another is not factored into the analysis of the corpus. The one exception
to this anonymity of authors is the use of corpus-based techniques for author
identification purposes. But, even in these cases, the task is usually to
determine who, among a small set of possible candidates, is the most likely author
of a given text. The social network that incorporates (or the fact that no known
social network incorporates) the set of candidate authors is rarely taken into
account in the design of the corpus-based, computational linguistic methods
of analysis.
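The author identification task mentioned above can be sketched in Python as
follows. This is a minimal illustration and not a method taken from the cited
literature: it assumes that authors can be told apart by how often they use a
few common function words, and it picks the candidate whose word-frequency
profile is closest (by cosine similarity) to that of the disputed text. The
word list, the function names, and the sample texts are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Assumed marker words; a real study would choose these with care.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "upon", "while"]


def profile(text):
    """Relative frequency of each marker word (whitespace tokens only)."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_likely_author(disputed, candidates):
    """Return the candidate whose known writing best matches the disputed text."""
    target = profile(disputed)
    return max(candidates, key=lambda name: cosine(target, profile(candidates[name])))


if __name__ == "__main__":
    # Hypothetical writing samples for two candidate authors.
    known = {
        "author_a": "upon the hill upon the road upon the sea",
        "author_b": "while the rain fell and while the wind blew",
    }
    print(most_likely_author("upon the bridge and upon the river", known))
```

The candidates enter the computation only as labels attached to text samples.
Whether they knew, cited, or corresponded with one another plays no part, which
is the limitation the paragraph above points out.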
The techniques of corpus-based, computational linguistics are often
technically related to the techniques employed by sociologists, since both sets
of techniques can depend upon similar tools from statistics and information
theory (e.g., measures of mutual information and entropy). But the techniques
are inverses of one another because what the sociologists black-box
in their analyses is almost exactly what the corpus-based linguistics and
information technology researchers do not black-box in their own research,
and vice versa.
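To make the shared toolkit concrete, the Python sketch below computes Shannon
entropy and mutual information from raw observations. The same two functions
can be applied to pairs of adjacent words (the linguist's data) or to pairs of
people who exchange messages (the sociologist's data); only what is counted
changes. The function names and the sample data are assumptions made for
illustration.

```python
from collections import Counter
from math import log2


def entropy(observations):
    """Shannon entropy, in bits, of the empirical distribution of observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(pairs):
    """Mutual information, in bits, between the two positions of observed pairs."""
    pairs = list(pairs)
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over the observed pairs.
    return sum(
        (c / n) * log2((c * n) / (left[x] * right[y]))
        for (x, y), c in joint.items()
    )


if __name__ == "__main__":
    # Hypothetical linguistic data: pairs of adjacent words in a corpus.
    word_pairs = [("wall", "street"), ("new", "york"),
                  ("wall", "street"), ("new", "york")]
    # Hypothetical social data: (sender, recipient) pairs from a message log.
    message_pairs = [("ann", "bob"), ("ann", "cat"),
                     ("dan", "bob"), ("dan", "cat")]
    print(entropy(w for w, _ in word_pairs))
    print(mutual_information(word_pairs))     # first word predicts the second
    print(mutual_information(message_pairs))  # sender tells nothing about recipient
```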
Any significantly new methodology for the development of a technology of
story understanding should involve the combination of these two approaches.
To understand a story both (1) as embedded in and (re)productive of a
network of related stories and other forms of discourse, and (2) as a facilitator
or inhibitor of social networks, it is necessary to explore how social and
semantic networks overlap. This intersection of social network and content analysis
has been envisioned in sociology and linguistics (e.g., Milroy 1978). However,