Scanning—Theory and Practice - Crafting a Compiler

Java Reference

In-Depth Information

Figure 3.26: Example FA before merging.

How do we decide what states to merge? We take a greedy approach and

try the most optimistic merger. By definition, accepting and nonaccepting

states are distinct, so we initially try to create only two states: one representing

the merger of all accepting states and the other representing the merger of all

nonaccepting states. Having only two states is almost certainly too optimistic.

In particular, all of the constituents of a merged state must agree on the same

transition for each possible character. That is, for character c all of the merged

states either must have no successor under c or must go to a single (possibly

merged) state. If all constituents of amerged state donot agree on the transition

to follow for some character, then the merged state is split into two or more

smaller states that do agree.

As an example, assumewe start with the FA shown in Figure 3.26. Initially,

we have a merged nonaccepting state

{

}

and a merged accepting state

{

. A merger is legal if, and only if, all constituent states agree on the same

successor state for all characters. For example, states 3 and 6 would go to

an accepting state when given character c ; states 1, 2, and 5 would not, so a

split must occur. We add an error state s E to the original DFA that will be

the successor state under any illegal character. (Thus, reaching s E becomes

equivalent to detecting an illegal token.) s E is not a real state. Rather, it allows

us to assume that every state has a successor under every character. s E is never

merged with any real state.

Algorithm S

}

, shown in Figure 3.27, splits merged states whose con-

stituents do not agree on a single successor state for a particular character.

When S

plit

terminates, we know that the states that remainmerged are equiv-

alent in that they always agree on common successors.

Returning to the example, we initially have states

plit

{

}

and

{

}

Invoking S

, we first observe that states 3 and 6 have a common successor

under c and states 1, 2, and 5 have no successor under c (or, equivalently, they

have the error state s E ). This forces a split that yields

plit

{

}

{

}

,and

{

}

Now, for character b , states 2 and 5 go to the merged state

{

}

,butstate1

does not, so another split occurs. We now have

.At

this point, all constituents of merged states agree on the same successor for

each input symbol, so we are done.

{

}

{

}

{

}

,and

{

}

Crafting a Compiler

Search WWH ::

Custom Search

Home