Information Technology Reference
In-Depth Information
the blank tape letter β when the stack is not empty, it enters the accept state f (Rule (g)). If
the PDA encounters an a on its input tape when in state p ,an a has been received after a b
and the input is rejected (Rule (i)). After the PDA enters either the accept or reject states, it
remains there (Rules (j) and (k)).
In Section 4.12 we show that the languages recognized by pushdown automata are exactly
the languages defined by the context-free languages described in the next section.
4.9 Formal Languages
Languages are introduced in Section 1.2.3 .A language is a set of strings over a finite set Σ ,
with
2, called an alphabet . Σ is the language of all strings over Σ including the empty
string , which has zero length. The empty string has the property that for an arbitrary string
w , w = w = w . Σ + is the set Σ without the empty string.
In this section we introduce grammars for languages, rules for rewriting strings through
the substitution of substrings. A grammar consists of alphabets
|
Σ
|≥
T
N
of terminal and
non-terminal symbols, respectively, a designated non-terminal start symbol, plus a set of rules
R
and
for rewriting strings. Below we define four types of language in terms of their grammars:
the phrase-structure, context-sensitive, context-free, and regular grammars.
The role of grammars is best illustrated with an example for a small fragment of English.
Consider a grammar G whose non-terminals
N
contain a start symbol S denoting a generic
sentence and NP and VP denoting generic noun and verb phrases, respectively. In turn, assume
that
N
also contains non-terminals for adjectives and adverbs, namely AJ and AV .Thus,
N
=
{
S , NP , VP , AJ , AV , N , V
}
. We allow the grammar to have the following words as terminals:
T
.Here bob , alice ,and duck are nouns,
big is an adjective, smiles and quacks are verbs, and loudly is an adverb. In our fragment of
English a sentence consists of a noun phrase followed by a verb phrase, which we denote by the
rule S
=
{
bob , alice , duck , big , smiles , quacks , loudly
}
of the grammar are shown below. They include
rules to map non-terminals to terminals, such as N
NP VP . This and the other rules
R
bob
bob
smiles
S
NP VP
N
V
alice
quacks
NP
N
N
V
duck
loudly
NP
AJ N
N
AV
big
VP
V
AJ
VP
VAV
With these rules the following strings (sentences) can be generated: bob smiles ; big duck
quacks loudly ;and alice quacks . The first two sentences are acceptable English sentences,
but the third is not if we interpret alice as a person. This example illustrates the need for rules
that limit the rewriting of non-terminals to an appropriate context of surrounding symbols.
Grammars for formal languages generalize these ideas. Grammars are used to interpret
programming languages. A language is translated and given meaning through a series of steps
the first of which is lexical analysis . In lexical analysis symbols such as a , l , i , c , e are grouped
into tokens such as alice , or some other string denoting alice . This task is typically done with
a finite-state machine. The second step in translation is parsing , a process in which a tokenized
string is associated with a series of derivations or applications of the rules of a grammar. For
example, big duck quacks loudly , can be produced by the following sequence of derivations:
S
NP VP ; NP
AJ N ; AJ
big ; N
duck ; VP
VAV ; V
quacks ; AV
loudly .
Search WWH ::




Custom Search