4.2 Pattern Inference: The Problem and Its Solutions
Patterns are usually understood as descriptions of something of general interest: some structure, some behavior, or the like. Such patterns do not show themselves directly; what is perceived are instances of a pattern. The key application problem is to reconstruct an unknown pattern from the instances observed. In formal terms, a pattern p shall be learned from the sequence of all strings in L(p). Even more formally, assume s_1, s_2, s_3, s_4, ... to be any ordering of the elements of some formal language L(p). The question is whether there is an algorithm powerful enough to solve this learning problem for every target pattern, i.e. one that learns them all.
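To make the setting concrete, the relation between a pattern and its instances can be sketched in code. The encoding below (integer variable names, the alphabet {a, b}) is an illustrative assumption, not from the text; it follows the usual definition of Angluin's non-erasing pattern languages, where every variable is replaced by a nonempty string.

```python
from itertools import product

# Assumed encoding: a pattern is a list whose entries are constant
# characters or integer variable names.
ALPHABET = "ab"

def substitute(pattern, assignment):
    """Apply a substitution: every variable is replaced by its string."""
    return "".join(assignment[c] if isinstance(c, int) else c for c in pattern)

def some_instances(pattern, max_len=2):
    """Enumerate the strings of L(pattern) obtained from substitutions
    of length at most max_len per variable (a finite slice of L(pattern))."""
    variables = sorted({c for c in pattern if isinstance(c, int)})
    words = ["".join(w) for n in range(1, max_len + 1)
             for w in product(ALPHABET, repeat=n)]
    return sorted({substitute(pattern, dict(zip(variables, choice)))
                   for choice in product(words, repeat=len(variables))})

# p = a x1 x1 b: both occurrences of x1 must be replaced by the same string.
p = ["a", 1, 1, "b"]
print(some_instances(p, max_len=1))  # ['aaab', 'abbb']
```

The learner below observes such strings without knowing which substitutions produced them.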
Learning is a process which extends over time. Any learning algorithm A, in processing some sample S_n = {s_1, ..., s_n} of observed instances, generates a certain hypothetical pattern A(S_n), called p_n for short. With information growing over time, the learning algorithm generates a sequence p_1, p_2, p_3, p_4, ... of pattern hypotheses. It is said that A solves the learning problem successfully exactly if, at some point k in time, the hypothesis p_k is correct, i.e. L(p) = L(p_k), and this hypothesis is kept even when further instances are observed.
Dana Angluin's Classical Algorithm. Dana Angluin [2] has provided the first solution of the pattern inference problem under consideration. Her approach is intuitive and easy to understand, and even its correctness is quite easy to prove.
It is helpful to introduce the notion of consistency. Given any sample set S, a pattern p is said to be consistent with S if and only if p is able to generate all strings of S, i.e. S ⊆ L(p). Recall that a consistent pattern which fits best, i.e. one such that no other consistent pattern generates a strictly smaller language, is said to be descriptive.
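The consistency condition S ⊆ L(p) can be checked with a membership test for pattern languages. The following is a naive backtracking sketch under the same illustrative encoding as before (integers as variables, nonempty substitutions); note that deciding membership is NP-complete in general.

```python
def matches(pattern, s, env=None):
    """Backtracking test of s ∈ L(pattern): variables (integers) are bound
    to nonempty substrings, repeated occurrences must agree."""
    env = {} if env is None else env
    if not pattern:
        return s == ""
    head, rest = pattern[0], pattern[1:]
    if not isinstance(head, int):            # constant: must match literally
        return s.startswith(head) and matches(rest, s[1:], env)
    if head in env:                          # bound variable: reuse its value
        w = env[head]
        return s.startswith(w) and matches(rest, s[len(w):], env)
    return any(                              # free variable: try each nonempty prefix
        matches(rest, s[k:], {**env, head: s[:k]})
        for k in range(1, len(s) + 1))

def consistent(pattern, sample):
    """p is consistent with S iff S ⊆ L(p)."""
    return all(matches(pattern, s) for s in sample)

p = ["a", 1, 1, "b"]
print(consistent(p, {"aaab", "ababab"}))  # True:  x1 -> "a", x1 -> "ba"
print(consistent(p, {"aaab", "abab"}))    # False: "abab" is not in L(p)
```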
Dana Angluin's idea, summarized in common speech, is roughly as follows. Given some current pattern hypothesis p_n and some current instance s_{n+1}, check whether or not the hypothesis generated before is consistent with the new data. If so, keep the hypothesis. Otherwise, compute any pattern descriptive of all instances seen so far. Based on this astonishingly clear idea, one is able to learn any pattern from its instances.
In more formal terms, the sequence of hypotheses generated by A on some sequence of instances s_1, s_2, ... is defined as follows, where S_n means {s_1, ..., s_n}.

    p_1     = s_1
    p_{n+1} = p_n              if s_{n+1} ∈ L(p_n)    [DA i]
    p_{n+1} = descr(S_{n+1})   if s_{n+1} ∉ L(p_n)    [DA ii]
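The update rule [DA i]/[DA ii] can be sketched as a loop. In this sketch, the membership test and the procedure descr are passed in as parameters; the toy stand-ins used in the demonstration (equal-length strings, positionwise generalization with a fresh variable per mismatch) are simplifying assumptions, not Angluin's full machinery.

```python
def learn(stream, descr, member):
    """Hypothesis update: start with p_1 = s_1; keep the hypothesis while it
    covers the new instance [DA i], otherwise recompute a descriptive
    pattern of the whole sample seen so far [DA ii]."""
    sample, p, hypotheses = [], None, []
    for s in stream:
        sample.append(s)
        if p is None:
            p = list(s)                 # p_1 = s_1, the string as a pattern
        elif not member(p, s):
            p = descr(list(sample))     # [DA ii]
        hypotheses.append(p)            # [DA i]: otherwise p is kept
    return hypotheses

# Toy stand-ins (assumptions): strings of equal length, a distinct
# variable at every position where the sample strings disagree.
def toy_member(p, s):
    return len(p) == len(s) and all(
        isinstance(c, int) or c == ch for c, ch in zip(p, s))

def toy_descr(sample):
    pat, v = [], 0
    for column in zip(*sample):
        if len(set(column)) == 1:
            pat.append(column[0])       # all strings agree: keep the constant
        else:
            v += 1
            pat.append(v)               # disagreement: fresh variable
    return pat

print(learn(["aab", "abb", "aab"], toy_descr, toy_member)[-1])
# ['a', 1, 'b']
```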
The term descr denotes some procedure computing a descriptive pattern of the
argument sample. The paper [2] contains a proof of the algorithm's correctness.
Dana Angluin has also provided an algorithmic principle for implementing descr .
    D = { p | S ⊆ L(p) }

The set of all patterns consistent with some sample is finite up to renaming of variables. With the requirement that variables are canonically numbered from left to right, and restricted to patterns of maximal length,

    D = { p | S ⊆ L(p) ∧ p canonical ∧ p of maximal length }

becomes a finite set. Every pattern in D which is minimal with respect to inclusion of the generated languages is descriptive of S. In this way, the procedure descr is computable. The algorithmic learning problem is solved.
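Following this principle, a brute-force descr can be sketched. The encoding, the alphabet {a, b}, and the helper names are illustrative assumptions; the structural refinement test below (replacing each variable by one fixed symbol) only witnesses language inclusion between equal-length patterns, and the search is exponential, so this is feasible for toy samples only.

```python
from itertools import product

ALPHABET = "ab"

def matches(p, s, env=None):
    """s ∈ L(p)? Backtracking over nonempty variable substitutions."""
    env = {} if env is None else env
    if not p:
        return s == ""
    h, rest = p[0], p[1:]
    if not isinstance(h, int):
        return s.startswith(h) and matches(rest, s[1:], env)
    if h in env:
        return s.startswith(env[h]) and matches(rest, s[len(env[h]):], env)
    return any(matches(rest, s[k:], {**env, h: s[:k]})
               for k in range(1, len(s) + 1))

def consistent(p, sample):
    return all(matches(p, s) for s in sample)

def is_canonical(p):
    """Variables must appear numbered 1, 2, ... from left to right."""
    seen = 0
    for c in p:
        if isinstance(c, int):
            if c > seen + 1:
                return False
            seen = max(seen, c)
    return True

def refines(q, p):
    """q arises from the equal-length pattern p by replacing each variable
    of p with one fixed symbol; then L(q) ⊆ L(p)."""
    sub = {}
    for cq, cp in zip(q, p):
        if isinstance(cp, int):
            if sub.setdefault(cp, cq) != cq:
                return False
        elif cq != cp:
            return False
    return True

def descr(sample):
    """Enumerate the canonical consistent patterns of maximal length
    (= length of the shortest sample string) and return a minimal one."""
    m = min(len(s) for s in sample)
    symbols = list(ALPHABET) + list(range(1, m + 1))
    D = [list(t) for t in product(symbols, repeat=m)
         if is_canonical(t) and consistent(list(t), sample)]
    for p in D:
        if not any(q != p and refines(q, p) for q in D):
            return p

print(descr(["aaab", "ababab"]))  # a consistent pattern of length 4
```

The all-variables pattern of length m is always in D, so D is nonempty and a minimal element exists; descr may return any of the (possibly several) minimal patterns, matching the text's "compute any pattern descriptive of all instances seen so far".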