Some Refinement by Jantke and Grieser. The Lange/Wiehagen algorithm has been discussed from the viewpoint of knowledge efficiency in [14]. Within the process of learning, there are two out of three cases in which the algorithm loses information. First, when some string longer than the current hypothesis occurs (case [L/W i] in the description of the algorithm above), it is ignored. Second, when some string shorter than the current hypothesis occurs (case [L/W ii] indicated above), the hypothesis kept so far is given up and the shorter string is adopted to form the new hypothesis. Certain refinements seem possible.
In the first case [L/W i], checking whether or not the longer string carries useful information is costly, because it means checking membership: does the current pattern allow for generating this string or not? This decision problem is known to be NP-hard and, thus, this refinement is not taken into account.
In the second case [L/W ii], all information collected so far in the latest hypothesis is thrown away when the newly appearing shorter string takes its place. Is there no way to preserve at least some of the information in the pattern? A refinement proposed by Jantke and Grieser [14] is considered for the purpose of the present investigation.
Assume that the Lange/Wiehagen algorithm as described above is currently keeping some hypothetical pattern p of length |p| = k. Furthermore, assume that some new string s of length |s| = l with l < k is presented to the learning algorithm. Instead of throwing p away and adopting s as it is, the refinement is to calculate the new hypothesis as some least general generalization of p and s.

    p_1     = s_1
    p_{n+1} = s_{n+1}              if |p_n| < |s_{n+1}|      [J/G i]
    p_{n+1} = lgg(p_n, s_{n+1})    if |p_n| >= |s_{n+1}|     [J/G ii]

At first glance, this description seems to be more rough (instead of being more fine) than the algorithm A above, because it has only two cases, whereas the other one has three. However, case [L/W ii] is really refined.
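To make the two cases concrete, the update rule can be sketched in Python. This is a minimal sketch, not the authors' implementation: patterns are modelled as lists of symbols (variables written as strings such as "x1"), and the lgg step is shown only for the equal-length situation, where it reduces to a position-wise comparison; the general case with |p_n| > |s_{n+1}| requires the partition-based procedure described in the remainder of this section.

```python
def lgg(p, s):
    """Position-wise least general generalization of two equal-length
    sequences: matching symbols are kept, mismatching pairs are replaced
    by variables, with identical pairs sharing the same variable."""
    out, table = [], {}
    for a, b in zip(p, s):
        if a == b:
            out.append(a)
        else:
            if (a, b) not in table:
                table[(a, b)] = f"x{len(table) + 1}"
            out.append(table[(a, b)])
    return out

def update_hypothesis(p_n, s_next):
    # [J/G i]: the new string is strictly longer than the hypothesis,
    # so it is adopted verbatim as the next hypothesis
    if len(p_n) < len(s_next):
        return list(s_next)
    # [J/G ii]: otherwise the hypothesis is generalized against the string
    return lgg(p_n, list(s_next))
```

For example, update_hypothesis(list("abab"), "acac") yields ['a', 'x1', 'a', 'x1'], reusing the same variable for the repeated mismatch.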
Conceptually, one may think of this refinement as follows. First, the pattern x_1 ... x_l is taken as a possible generalization, because it definitely generalizes both p_n and s_{n+1}. Next, traversing the pattern x_1 ... x_l symbol by symbol from left to right, it is specialized.
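As a small illustration of this starting point, the fully general pattern x_1 ... x_l can be produced as follows (a trivial sketch; the subsequent left-to-right specialization steps are the cumbersome part and are not reproduced here):

```python
def most_general_pattern(l):
    # the pattern x1 x2 ... xl of l pairwise distinct variables,
    # which generalizes every pattern and every string of length >= l
    return [f"x{i}" for i in range(1, l + 1)]
```

For the worked example below, most_general_pattern(6) gives ['x1', 'x2', 'x3', 'x4', 'x5', 'x6'].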
The algorithm is not of high computational complexity, but it is a bit cumbersome due to the larger number of distinct cases and their dovetailing.
To warm up, an intuitive example is presented in which the references on the right refer to the algorithm to be presented below. The following figure illustrates the annotation process for the pattern p = a x_1 x_2 x_3 b x_4 c c a x_1 x_2 x_3 x_4 c of length k = 14 and the string abacac with l = 6. The goal is to generalize the pattern p by some other pattern of length l = 6. The algorithm runs in turns. In every turn, certain identical substrings are found and indexed.
As a result of the procedure illustrated by means of Figure 5, the pattern is partitioned into l = 6 substrings. The partitioning represented by the last line of the figure is interpreted as the intermediate pattern x_1 x_2 x_3 x_4 x_1 x_5, which generalizes p.
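The interpretation step can be sketched as follows. Since the figure itself is not reproduced here, the partition of p used below is only a plausible reconstruction consistent with the stated result; the variable-assignment rule assumed is that two blocks receive the same variable exactly when the blocks are identical and are aligned with identical symbols of the string abacac.

```python
def variables_for_partition(blocks, s):
    """Assign variables x1, x2, ... (left to right) to the blocks of a
    partitioned pattern; a block aligned with string symbol c reuses an
    earlier variable iff the same (block, c) combination occurred before."""
    table, out = {}, []
    for block, c in zip(blocks, s):
        key = (tuple(block), c)
        if key not in table:
            table[key] = f"x{len(table) + 1}"
        out.append(table[key])
    return out

# A partition of p = a x1 x2 x3 b x4 c c a x1 x2 x3 x4 c into l = 6 blocks
# (hypothetical reconstruction; the original partition is given in Figure 5):
blocks = [("a", "x1", "x2", "x3"), ("b",), ("x4",),
          ("c", "c"), ("a", "x1", "x2", "x3"), ("x4", "c")]
variables_for_partition(blocks, "abacac")
# -> ['x1', 'x2', 'x3', 'x4', 'x1', 'x5']
```

Blocks 1 and 5 are identical and both aligned with the symbol a, so they share the variable x_1, reproducing the intermediate pattern x_1 x_2 x_3 x_4 x_1 x_5 stated above.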