Biomedical Engineering Reference
Therefore, a sequence of spaced motifs can be further abstracted into an ordered collection of motifs interleaved by symbols (e.g., short, medium, and large), each representing a range of inter-motif distances. For instance, by considering the closed intervals in Fig. 5.1b, both sequences t_2 and t_3 in Fig. 5.1a are represented by the following sequence of spaced motifs:

\[
\langle x,\ \mathit{short},\ y,\ \mathit{medium},\ y \rangle \tag{5.1}
\]
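The abstraction step can be sketched in Python as follows. This is a minimal illustration, not part of SPADA: the interval bounds are hypothetical placeholders (the actual closed intervals are those of Fig. 5.1b, which is not reproduced here), and the names `INTERVALS`, `distance_symbol`, and `abstract_sequence` are assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical closed intervals; the real bounds are those shown in Fig. 5.1b.
INTERVALS: List[Tuple[str, int, int]] = [
    ("short", 0, 100),
    ("medium", 101, 500),
    ("large", 501, 10_000),
]


def distance_symbol(distance: int) -> str:
    """Return the symbol of the closed interval that contains `distance`."""
    for symbol, low, high in INTERVALS:
        if low <= distance <= high:
            return symbol
    raise ValueError(f"distance {distance} falls outside every interval")


def abstract_sequence(motifs: Sequence[str], distances: Sequence[int]) -> List[str]:
    """Interleave motif names with the symbols of their inter-motif distances."""
    if len(distances) != len(motifs) - 1:
        raise ValueError("need exactly one distance per pair of consecutive motifs")
    result = [motifs[0]]
    for motif, distance in zip(motifs[1:], distances):
        result.append(distance_symbol(distance))
        result.append(motif)
    return result
```

Under these assumed bounds, two sequences with different raw distances, such as `abstract_sequence(["x", "y", "y"], [40, 300])` and `abstract_sequence(["x", "y", "y"], [95, 120])`, collapse to the same abstract sequence, which is the effect described for t_2 and t_3.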
Each sequence of spaced motifs is described in a logic formalism which can be processed by the ILP system SPADA (Spatial Pattern Discovery Algorithm) [24] to generate spatial association rules. More precisely, the whole sequence, the constituent motifs, and the inter-motif distances are represented by distinct constant symbols.³ Some predicate symbols are introduced in order to express both properties and relationships. They are:
sequence(t): t is a sequence of spaced motifs;
part_of(t, m): the sequence t contains an occurrence m of a single motif;
is_a(m, x): the occurrence m is a motif x;
distance(m_1, m_2, d): the distance between the occurrences m_1 and m_2 is d.
A sequence is represented by a set of Datalog⁴ ground atoms, where a Datalog ground atom is an n-ary predicate symbol applied to n constants. For instance, the sequence of spaced motifs in (5.1) is described by the following set of Datalog ground atoms:
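\[
\left\{
\begin{array}{l}
\mathit{sequence}(t_2),\\
\mathit{part\_of}(t_2, m_1),\ \mathit{part\_of}(t_2, m_2),\ \mathit{part\_of}(t_2, m_3),\\
\mathit{is\_a}(m_1, x),\ \mathit{is\_a}(m_2, y),\ \mathit{is\_a}(m_3, y),\\
\mathit{distance}(m_1, m_2, \mathit{short}),\ \mathit{distance}(m_2, m_3, \mathit{medium})
\end{array}
\right\}
\tag{5.2}
\]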
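The translation from a sequence of spaced motifs to its ground atoms can be sketched in Python. This is only an illustration under stated assumptions: atoms are modeled as plain tuples, occurrence constants are named m1, m2, ... in order of appearance, and the helper names `ground_atoms` and `format_atom` are hypothetical. The sketch does not reproduce SPADA's actual input format.

```python
from typing import List, Sequence, Tuple

# An atom is modeled as (predicate, constant_1, ..., constant_n).
Atom = Tuple[str, ...]


def ground_atoms(seq_id: str, spaced_motifs: Sequence[str]) -> List[Atom]:
    """Translate <motif, distance, motif, ...> into Datalog-style ground atoms."""
    if len(spaced_motifs) % 2 == 0:
        raise ValueError("expected motifs interleaved with distance symbols")
    motifs = spaced_motifs[0::2]      # entries at even positions are motifs
    distances = spaced_motifs[1::2]   # entries at odd positions are distance symbols
    # Assumed naming scheme: one constant per motif occurrence, m1, m2, ...
    occurrences = [f"m{i}" for i in range(1, len(motifs) + 1)]

    atoms: List[Atom] = [("sequence", seq_id)]
    atoms += [("part_of", seq_id, m) for m in occurrences]
    atoms += [("is_a", m, x) for m, x in zip(occurrences, motifs)]
    atoms += [
        ("distance", occurrences[i], occurrences[i + 1], d)
        for i, d in enumerate(distances)
    ]
    return atoms


def format_atom(atom: Atom) -> str:
    """Render an atom tuple in the usual predicate(arg, ...) notation."""
    predicate, *args = atom
    return f"{predicate}({', '.join(args)})"
```

Calling `ground_atoms("t2", ["x", "short", "y", "medium", "y"])` yields the nine atoms listed in (5.2), in the same order.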
The set of Datalog ground atoms of all sequences is stored in the extensional part D_E of a deductive database D. The intensional part D_I of the deductive database D includes the definition of the domain knowledge in the form of Datalog rules. An example of Datalog rules is the following:
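\[
\begin{array}{l}
\mathit{short\_medium\_distance}(U, V) \leftarrow \mathit{distance}(U, V, \mathit{short}).\\
\mathit{short\_medium\_distance}(U, V) \leftarrow \mathit{distance}(U, V, \mathit{medium}).
\end{array}
\tag{5.3}
\]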
They state that two motifs⁵ are at a short_medium_distance if they are at either short or medium distance (Fig. 5.1b). Rules in D_I allow additional Datalog
³ We denote constants as strings of lowercase letters possibly followed by subscripts.
⁴ Datalog is a query language for deductive databases [9].
⁵ Variables are denoted by uppercase letters possibly followed by subscripts, such as U and V.
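The effect of the two rules in (5.3) on a set of extensional facts can be sketched in Python. This is a hand-written illustration of those two rules only, not a general Datalog engine and not how SPADA evaluates its intensional part; the tuple encoding of atoms and the function name `derive_short_medium` are assumptions made for this example.

```python
from typing import Set, Tuple

# An atom is modeled as (predicate, constant_1, ..., constant_n).
Atom = Tuple[str, ...]


def derive_short_medium(facts: Set[Atom]) -> Set[Atom]:
    """Apply the two rules of (5.3) once to the given ground facts.

    short_medium_distance(U, V) holds whenever distance(U, V, short)
    or distance(U, V, medium) is among the facts.
    """
    derived: Set[Atom] = set()
    for atom in facts:
        is_distance = atom[0] == "distance" and len(atom) == 4
        if is_distance and atom[3] in ("short", "medium"):
            derived.add(("short_medium_distance", atom[1], atom[2]))
    return derived
```

Because the rule bodies mention only extensional `distance` atoms, a single pass over the facts is enough here; rules that refer to other derived predicates would need repeated application until no new atoms appear.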