Information Technology Reference
In-Depth Information
The theory of formal languages studies four main phenomena concerning formal
languages:
1.
equivalence
results, showing that different methods and formal devices define
the same languages. An example of such a kind of results is given in the next
section where an equivalence will be given which implies an equivalence be-
tween regular grammars and the class of finite state automata;
2.
normalization
results, showing that some formal devices defining languages can
be put in some specific formats which allow to simplify the analysis of specific
aspects. For example, any grammar with monotone rules generates the same lan-
guage of a grammar where rules have the
context sensitive form
.
3.
decidability
results, showing that some class of problems about languages can
be solved in an algorithmic way, by reaching the solution in a finite number of
steps. For example, given a grammar
CF
it is possible to decide if it generates a
finite or infinite language;
4.
closure
results showing that some classes of languages are closed with respect to
some language operations. For example, the sum of two
CF
languages is a
CF
language too, but
CF
is not closed with respect to complementation (with respect
to the set of strings over its terminal alphabet), while
REG
and
CS
are closed with
respect to all boolean operations (sum, intersection, and complementation).
6.3
Regular Expressions and Finite Automata
Consider singleton languages, that is, languages constituted by only one string. For
example
. Then, let us construct languages by applying to them the
language operation of sum, concatenation, and Kleene star, with an implicit priority
of star with respect to concatenation, and of concatenation, with respect to sum. For
example:
{
a
},{
b
},{
c
}
}
∗
+
{
}
∗
.
{
a
b
}{
c
Languages built in this manner are called
regular languages
. If we adopt the ab-
breviation of eliminating set parentheses, by assuming that strings specify singleton
languages, then we get
regular expressions
which naturally identify regular lan-
guages. For example, the language above is simply denoted by the following regular
expression (parentheses can be introduced for a better reading of expressions):
a
∗
+
bc
∗
.
Regular expressions are related to finite state automata.
A finite state automaton
M
is a device equipped with: i) an
alphabet
, ii) a finite
set of
internal states
, one of which is the
initial state
, and some of which are the
final states
(see Fig. 6.1), iii) plus a set of
transition rules
which determine the
following behavior. When a string
of an alphabet of the automaton is put on its
tape, then
M
reads it by starting from the first symbol on the left of
α
and in its initial
state. After reading a symbol, in dependence of its transition rules,
M
changes its
state and moves to read the next symbol on the right. When the whole string
α
α
was