Information Technology Reference
In-Depth Information
It follows a description of the annotation algorithm which works stage by
stage beginning on stage 1. The input parameters are any
p
∈
P
with
|
p
|
=
k
and any
s
=
l
and
l<k
.
There are at most
l
turns. The current stage number is denoted by
n
.Atthe
beginning of turn
n
, there is a certain number
an
n
of already annotated sections
of the pattern
p
and a certain number
un
n
of not yet annotated sections with
an
n
+
un
n
≤
∈
M
+
with
|
s
|
l
.
1and
un
n
=1,then
p
has the structure
p
=
p
1
p
u
p
2
where
1. If
an
n
=
l
−
p
1
,p
2
∈
X
)
∗
,
p
u
∈
X
)
+
, and exactly the symbols of
p
u
do not
(
M
∪
(
M
∪
yet carry any annotation.
Annotate exactly all symbols of
p
u
by
n.
1;
an
n
+1
:=
l
;
un
n
+1
:= 0; exit the
procedure.
2. If
an
n
=
l
2and
un
n
=2,then
p
has the structure
p
=
p
1
p
1
p
2
p
2
p
3
where
−
X
)
+
, and exactly the symbols of
p
1
and
p
2
do not yet carry any annotation.
(a) If
p
1
=
p
2
, then annotate exactly all symbols of
p
1
by
n.
1andallsymbols
of
p
2
by
n.
2;
an
n
+1
:=
l
;
un
n
+1
:= 0; exit the procedure.
(b) If
p
1
p
1
,p
2
,p
3
∈
X
)
∗
,
p
1
,p
2
∈
(
M
∪
(
M
∪
=
p
2
, then annotate exactly all symbols of
p
1
by
n.
1;
an
n
+1
:=
l
−
1;
un
n
+1
:= 1.
3. Otherwise,
an
n
+
un
n
<l
and
un
n
>
1. In this case, the pattern
p
has an
initial segment
p
1
p
u
p
2
≺
p
where
p
1
,p
2
∈
X
)
∗
,
p
u
∈
X
)
+
,and
(
M
∪
(
M
∪
exactly the symbols of
p
u
do not yet carry any annotation.
Look for some longest initial segment
r
p
u
which occurs repeatedly in
p
u
or in any other following not yet annotated sections (occurrences should be
non-overlapping).
(a) If there is no such
r
which occurs at least twice, take the first letter
z
of
p
u
, i.e.,
z
p
u
, and annotate it by
n.
1;
an
n
+1
:=
an
n
+1;if
z
=
p
u
,then
un
n
+1
:=
un
n
−
∈
M
∪
X
and
z
1 else
un
n
+1
:=
un
n
.
(b) If
r
is found, look for a maximal number
f
of non-overlapping occurrences
in not yet annotated parts such that the following conditions are met,
where
f
denotes the resulting number of newly annotated sections and
g
denotes the resulting number of not yet annotated sections left over.
i. The value
g
does not exceed the value
l
−
an
n
−
f
.
f
does not exceed the remaining number of
symbols without any annotation.
If the conditions are valid, annotate these
f
sections by
n.
1to
n.f
,resp.,
and set
an
n
+1
:=
an
n
+
f
and
un
n
+1
:=
g
, accordingly.
ii. The value
l
−
an
n
−
As a result, the whole pattern
p
is separated into sections being annotated by
terms of the form
u.v
,where
u
is not exceeding
l
. The ultimately resulting
number of sections is exactly
l
.
The resulting annotation is interpreted as some pattern, where each section
annotated by some
u.v
is understood to represent the variable
x
u
. For instance,
in fig. 5 this pattern is
x
1
x
2
x
3
x
4
x
1
x
3
. In figure 6, the pattern is
x
1
x
2
x
3
x
4
x
1
x
3
x
5
.
This pattern does always generalize
p
, but does not necessarily generalize
s
.