From the perspective of information theory, therefore, the multiword
sequence of written language can be understood as a linear concatena-
tion of units weakly correlated with one another; the units themselves—
understood as words—are internally cohesive. This conception of the
multiword sequence is consistent with other aspects of the application
of information theory to written language, specifically incorporating
Shannon's definition of the word and building on the implicit assumption
of linearity. Consistency with the definition of the word ensures further
consistency with the conception of the messages for selection, which
were understood as individual characters of the Roman alphabet. This
understanding was incorporated into the definition of the word. Coupled
with effective real-world reference and, in this instance, computational
applicability, internal theoretical consistency provides some assurance of
the validity of the definitions of the word and multiword sequence. The
understanding of the multiword sequence of written language is also simple,
economical, and apparently novel; its simplicity and economy are evident in
the brevity of the account given. This impression of novelty is supported
by its logical dependence on Shannon's conception of the word,
itself not widely adopted despite its potential value.
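
The contrast drawn above, between internally cohesive units and weak
correlation across their boundaries, can be illustrated with a rough
empirical measure. The following is a minimal sketch only, under assumptions
that are not part of the original account: words are approximated by
splitting on whitespace (not by Shannon's definition), cohesion is
approximated by the conditional entropy of one character given the
preceding character, and the function names conditional_entropy and
boundary_vs_internal are hypothetical. Lower entropy within words than
across word boundaries would be consistent with the conception described,
although estimates from small samples are biased and should be read with
caution.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Empirical H(Y | X) in bits for an iterable of (x, y) pairs."""
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        return 0.0
    joint = Counter(pairs)                     # counts of (x, y)
    marginal = Counter(x for x, _ in pairs)    # counts of x
    # H(Y|X) = sum over (x, y) of p(x, y) * log2( p(x) / p(x, y) )
    return sum(n / total * log2(marginal[x] / n) for (x, _), n in joint.items())


def boundary_vs_internal(text):
    """Return (within-word entropy, across-boundary entropy) in bits.

    Assumption: a 'word' is any whitespace-delimited token.
    """
    words = text.lower().split()
    # Adjacent character pairs inside each word.
    internal = [(a, b) for w in words for a, b in zip(w, w[1:])]
    # Last character of one word paired with the first character of the next.
    boundary = [(u[-1], v[0]) for u, v in zip(words, words[1:])]
    return conditional_entropy(internal), conditional_entropy(boundary)


if __name__ == "__main__":
    # Toy input chosen so that the second letter of each word is fully
    # determined by the first, while the word-initial letter is not.
    within, across = boundary_vs_internal("ab cb ab db")
    print(f"within words: {within:.3f} bits, across boundaries: {across:.3f} bits")
```

On a realistic corpus the same two figures would need far more text to be
meaningful, and a fuller treatment would condition on longer contexts than
a single preceding character.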
The conception of the multiword sequence of written language as a
linear concatenation of units weakly correlated with one another has
other conceptual advantages, particularly through its correlation with
other independently knowable features of the development of written
language and related and independent theoretical perspectives. In relation
to the historical development of written language, the space can be
regarded as historically introduced, and as persisting, where transition
possibilities between two adjacent letters would otherwise be extensive and
the potential for predicting the second of the two letters from the first
would be low, particularly for purely syntactically based prediction. After
a space, the initial character of a word likely will be difficult to predict
syntactically and will have low redundancy (some forms of writing—for
instance, Biblical Hebrew—have indicated the presence of vowels at the
beginning of words, but not within them). From the related perspective of
critiques of the analogical application of information theory to semantic
issues, the greatest freedom of choice at a semantic level occurs in the
selection of the next word, where constraints at the syntactic level are
weakest, reinforcing the evidence for the inapplicability of information