Information Technology Reference
In-Depth Information
extension and the “local” or “simple” rules (see above). All extensions corre-
sponding to branches to the right of the perfect extension are still possible for
the fragment reached by the perfect extension. Therefore branches to the right
can be pruned immediately without any loss: they cannot contain any closed
fragment, because the perfect extension cannot be done in them without violat-
ing the canonical form, but has to be done in order to reach a closed fragment.
This type of pruning we call partial perfect extension pruning (because it prunes
only part of the branches aside from the perfect extension one). Note that Close-
graph [17] uses only this form of pruning.
How partial perfect extension pruning changes the search tree for the molecules
in Figure 11.1 is shown in Figure 11.6. Note that only non-closed fragments
are removed from the search tree (compare to Figure 11.2, in which the closed
fragments are highlighted). The gains consist in the fact that the two pruned
fragments need not be processed: neither do they have to be checked for canonical
form nor do we have to consider possible extensions of them.
11.6
Full Perfect Extension Pruning
Although partial perfect extension pruning is already highly effective, it is de-
sirable to prune also the search tree branches to the left of the perfect exten-
sion, thus completing partial perfect extension pruning into full perfect extension
pruning . In order to do so, we must not restrict the extensions of the fragment
that resulted from a perfect extension as it would be required by canonical form
pruning (with or without the “local” or “simple” rules). Otherwise we could lose
(closed) frequent fragments, as we demonstrated above. In other words, we would
like to have a search tree like the one shown in Figure 11.7 for the molecules
shown in Figure 11.1.
The core problem with this is how we can avoid that the fragment O-S-C-N is
pruned as non-canonical. The breadth-first search canonical code word for this
fragment is S 0-C1 0-O2 1-N3 . However, with the search tree in Figure 11.7 it
is assigned the code word S 0-C1 1-N2 0-O3 , because this reflects the order in
S
S F
S C
S O
O S C
S C N
S C O
O S C N
S C N
O
S C N C
O S C N
O
S C N C
O
Fig. 11.7. Search tree for the three molecules in Figure 11.1 with full perfect extension
pruning (crossed out branches are pruned)
 
Search WWH ::




Custom Search