Information Technology Reference
In-Depth Information
interpersonal interactions, as it is faster and simpler to call another person to con-
vey a message than to write a traditional letter, an e-mail or a text message. This
form of conveying information is much more dominant in interpersonal contacts
for purely practical reasons, although it has to be noted that the younger genera-
tion - the 'computer' or 'mobile' generation - prefers to transmit information over
phones or modems. Still, the advantages of speech are incomparably greater than
those of written text, even though the advantages of written text should not be
underestimated or belittled in any way.
Speech recognition itself very frequently boils down to recognising sentences
in which the vocabulary, the sequence of words and the dependency on the
speaker play important roles. The vocabulary, as the factor determining the selec-
tion of the speech recognition method, is split into small or large vocabulary sets.
In addition, these sets may be limited or unlimited, which in the case of limited
sets may cause wrong recognitions, and in the case of unlimited ones - recognition
ambiguity. A further difficulty is that the sequence of words may be spoken
with pauses between the words or continuously, without pauses, in which case
the places where one word significant for the sentence being recognised ends
and the next word begins have to be marked independently.
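
The role of pauses in marking word boundaries can be illustrated with a minimal sketch. It assumes the signal has already been reduced to one energy value per frame; the function name `mark_word_boundaries`, the `threshold` and the `min_pause` parameters are hypothetical choices made for this illustration, not part of any method described in the text.

```python
def mark_word_boundaries(energies, threshold=0.1, min_pause=3):
    """Return (start, end) frame ranges of speech separated by pauses.

    `end` is exclusive. A pause is a run of at least `min_pause`
    consecutive frames whose energy is below `threshold`.
    All parameter values are illustrative assumptions.
    """
    segments = []
    start = None   # first frame of the current segment, None while in silence
    quiet = 0      # length of the current run of quiet frames
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                # The segment ended where the quiet run began.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Close a segment that runs to the end of the signal.
        segments.append((start, len(energies) - quiet))
    return segments


# Two words separated by a clear pause.
paused = [0, 0, 0.5, 0.6, 0.4, 0, 0, 0, 0, 0.7, 0.8, 0, 0.9, 0, 0]
print(mark_word_boundaries(paused))

# Continuous speech: no pauses, so the whole stretch comes back as one segment.
continuous = [0.5] * 6
print(mark_word_boundaries(continuous))
```

When the words are spoken without pauses, a sketch like this returns a single segment, which is why the boundaries then have to be marked by other means.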
Another important element in recognising speech is whether it refers to a
specific speaker or not. In this regard, we say that recognition is
speaker-dependent or speaker-independent. At this point, it is also worth
adding that speech (sentences spoken by a specific person) does not always have
significant content, and what is more, is frequently corrected or changed by the
speaker while speaking. For this reason, speech recognition is much more compli-
cated than recognising written text, which, once written, does not change during
the analysis. Spontaneity also frequently slips into our sentences while we
speak, whether we want it or not. This is another source of instability and
volatility in a statement, and further evidence of the difficulties in speech
recognition.
Speech recognition starts when the sentence is spoken. This sentence becomes the
input data for the analysis process, at which point possible interference has
to be accounted for. This interference includes not just interference from the
system itself, but also all kinds of interference coming from the environment.
At this stage of the process, the recognising system treats the input as a
sentence distorted by a noisy channel, and it immediately begins the decoding
stage, which will produce a recognised sentence. The entire recognition
process resembles what happens in a school hallway, where student A, standing
at one end, says sentence x, while student B, standing at the other end, must
understand the sentence spoken to him by student A. During this short exchange
between students A and B, a loud school din prevails in the hallway. Despite
the terrible noise, student B may hear sentence x correctly, but he might also
only guess the content of the sentence said by student A. In that case there are
two possible outcomes: either he successfully guesses the contents of sentence
x, or the sentence is distorted and student B recognises completely different
words and builds sentence y from them. This situation illustrates the speech
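
The hallway scenario can be sketched as a toy noisy-channel decoder. This is only an illustration under stated assumptions: the candidate sentences, their prior probabilities, the word-by-word noise model and the names `channel_log_likelihood`, `decode` and `p_correct` are all hypothetical and do not come from the text.

```python
import math


def channel_log_likelihood(heard, candidate, p_correct=0.8):
    """Log P(heard | candidate) under a toy word-by-word channel.

    Assumption: each word survives the noise with probability
    `p_correct`, and the noise never adds or drops words.
    """
    if len(heard) != len(candidate):
        return float("-inf")
    score = 0.0
    for h, c in zip(heard, candidate):
        score += math.log(p_correct if h == c else 1.0 - p_correct)
    return score


def decode(heard, priors, p_correct=0.8):
    """Pick the sentence maximising P(sentence) * P(heard | sentence)."""
    best, best_score = None, float("-inf")
    for sentence, prior in priors.items():
        score = math.log(prior) + channel_log_likelihood(
            heard, sentence.split(), p_correct
        )
        if score > best_score:
            best, best_score = sentence, score
    return best


# Hypothetical sentences student A might say, with assumed prior probabilities.
priors = {
    "close the door": 0.6,
    "close the drawer": 0.3,
    "clothes to adore": 0.1,
}

# One word is distorted by the din, yet student B still recovers sentence x.
print(decode("close the floor".split(), priors))

# Every word is distorted, so student B builds a different sentence y.
print(decode("clothes to adore".split(), priors))
```

The two calls correspond to the two outcomes in the hallway: with light distortion the decoder still settles on the sentence that was said, while heavy distortion leads it to a different sentence.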