Java Reference
In-Depth Information
•
A subset of the states called the accepting ,or final , states
These components of an FA can be represented graphically as shown in Fig-
ure 3.1.
An FA also can be represented graphically using a transition diagram ,
composed of the components shown in Figure 3.1. Given a transition diagram,
we begin at the start state. If the next input character matches the label on a
transition from the current state, then we go to the state to which it points.
If no move is possible, then we stop. If we finish in an accepting state, the
sequence of characters read forms a valid token; otherwise, a valid token has
not been seen. In the transition diagram shown in Figure 3.1, the valid tokens
are the strings described by the regular expression (abc + ) + .
As an abbreviation, a transition may be labeled with more than one char-
acter (for example, Not( c )). The transition may be taken if the current input
character matches any of the characters labeling the transition.
3.4.1 Deterministic Finite Automata
An FA that always has a unique transition (for a given state and character) is
a deterministic finite automaton (DFA). DFAs are simple to program and are
often used to drive a scanner. ADFA is conveniently represented in a computer
by a transition table . A transition table, T , is a two-dimensional array indexed
by a DFA state and a vocabulary symbol. Table entries are either a DFA state
or an error flag (often represented as a blank table entry). If we are in state s
and read character c ,then T [ s , c ] will be the next state we visit, or T [ s , c ] will
contain an error flag indicating that c cannot extend the current token. For
example, the regular expression
//(Not(Eol)) Eol
which defines a Java or C
single-line comment, might be recognized by the
DFA shown in Figure 3.2(a). The corresponding transition table is shown in
Figure 3.2(b).
A full transition table will contain one column for each character. To
save space, table compression is sometimes utilized. In that case, only nonerror
entries are explicitly represented in the table. This is done by using hashing or
linked structures [CLRS01].
Any regular expression can be translated into a DFA that accepts (as valid
tokens) the set of strings denoted by the regular expression. This transla-
tion can be done manually by a programmer or automatically by a scanner
generator.
++
 
 
Search WWH ::




Custom Search