As currently developed, automata theory and experience of informa-
tion retrieval systems can continue to contribute an understanding of
the effects of different logical combinations on the number of records
retrieved, particularly for searches using combinations of separate words.
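As a minimal sketch of this effect, assuming a tiny invented collection and a hypothetical helper named postings (neither comes from the source), the following Python fragment shows how AND, OR, and NOT change the number of records retrieved for two separate words.

```python
# Hypothetical toy collection: document id -> text (invented for illustration).
DOCS = {
    1: "quiet garden with legumes",
    2: "quiet street at night",
    3: "garden party at night",
    4: "legumes and grains",
}


def postings(word):
    """Return the set of document ids whose text contains the word."""
    return {doc_id for doc_id, text in DOCS.items() if word in text.split()}


quiet = postings("quiet")
garden = postings("garden")

and_hits = quiet & garden   # documents containing both words
or_hits = quiet | garden    # documents containing either word
not_hits = quiet - garden   # documents containing the first word but not the second

print(len(and_hits), len(or_hits), len(not_hits))
```

In this toy case AND can only return as many records as the smaller of the two sets, or fewer, while OR can only return as many as the larger, or more.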
Exemplification and Discussion
We can now reconsider the examples given previously, in chapter 5, for
searching on separate words and on phrases.
Word
The difficulty of finding two uniquely co-occurring words in the documents
indexed by Google (Unquiet cooccurrences / Quotidian metamorphise /
Happiest legumens) can be understood from linguistics,
information theory, and the experience accumulated by using information
retrieval systems (Google 2004). Linguistics, particularly through the idea
of the paradigm as a mentally held network of associations, can indicate
the difficulty of excluding semantic considerations from the choice of
terms for searching for unique co-occurrences, comparable to the difficulty
of humanly formulating a random sequence⁵: the repressed signified can
reemerge. The operational implementation of the word, broadly a sequence
of characters between spaces or between a space and a punctuation mark,
is consistent with the conception derived from information theory (that
is, a cohesive group of letters with strong internal statistical influences).
The rarity with which two such cohesive units, linked by AND, retrieve
a single document can be understood in terms of the wide distribution of
the cohesive units across documents. This rarity is also continuous with,
but still contrasts with, the historical experience of using information
retrieval systems in which documents were recalled by simple Boolean
combinations, particularly the experience of reducing the number of
documents retrieved by combining sets with AND. From a more deliberately
theoretical, rather than directly experiential, perspective, using AND to
combine entities assumed to be independently distributed is analogous to
multiplying odds or fractions. Fuller index descriptions and possibly
transformations of the full texts of documents are now available in
historical (including recent historical) experience.
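The operational definition of the word and the analogy with multiplying fractions can be sketched in Python. This is only an illustration under stated assumptions: the four sample texts are invented, the helper names (words, doc_fraction, expected_and_fraction, observed_and_fraction) are hypothetical, and a word is approximated as a run of ASCII letters, so hyphens and apostrophes also act as boundaries.

```python
import re
from fractions import Fraction

# Hypothetical sample texts, invented for illustration only.
DOCS = [
    "Unquiet minds, happiest days.",
    "Quotidian habits; unquiet nights.",
    "Happiest legumes and quotidian bread.",
    "Plain bread, plain days.",
]


def words(text):
    """Split text into lowercase words: runs of letters bounded by spaces or punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def doc_fraction(word):
    """Fraction of documents that contain the word at least once."""
    hits = sum(1 for text in DOCS if word in words(text))
    return Fraction(hits, len(DOCS))


def expected_and_fraction(a, b):
    """Fraction expected for 'a AND b' if the two words were independently distributed."""
    return doc_fraction(a) * doc_fraction(b)


def observed_and_fraction(a, b):
    """Fraction of documents that actually contain both words."""
    hits = sum(1 for text in DOCS if a in words(text) and b in words(text))
    return Fraction(hits, len(DOCS))


print(words(DOCS[0]))
print(expected_and_fraction("unquiet", "happiest"))
print(observed_and_fraction("unquiet", "happiest"))
```

Here each word occurs in half of the documents, so the independence assumption predicts one quarter for the AND combination, which this toy collection happens to match. In a real collection the observed fraction can depart from the product, since the choice of terms is rarely free of semantic association.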