Biomedical Engineering Reference
In-Depth Information
Therefore, a sequence of spaced motifs can be further abstracted into an ordered
collection of motifs interleaved by symbols (e.g., short, medium, and large)
representing a range of inter-motif distance. For instance, by considering the closed
intervals in Fig. 5.1b, both sequences t_2 and t_3 in Fig. 5.1a are represented by the
following sequence of spaced motifs:
\[
\langle x, \mathit{short}, y, \mathit{medium}, y \rangle
\tag{5.1}
\]
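The abstraction step can be illustrated with a minimal Python sketch. The interval bounds below are purely hypothetical (the actual closed intervals are those of Fig. 5.1b, which is not reproduced here), and the names `BOUNDS`, `distance_label`, and `abstract_sequence` are introduced only for illustration.

```python
from bisect import bisect_left

# Hypothetical inclusive upper bounds of the distance intervals.
# Assumption: short = [0, 10], medium = [11, 50], large = above 50.
BOUNDS = [10, 50]
LABELS = ["short", "medium", "large"]


def distance_label(d):
    """Map a numeric inter-motif distance to its symbolic range."""
    return LABELS[bisect_left(BOUNDS, d)]


def abstract_sequence(motifs, distances):
    """Interleave motifs with the symbolic labels of the distances between them.

    motifs    -- motif names in order of occurrence, e.g. ["x", "y", "y"]
    distances -- numeric distances between consecutive motifs
    """
    if len(distances) != len(motifs) - 1:
        raise ValueError("need exactly one distance per pair of consecutive motifs")
    result = [motifs[0]]
    for d, m in zip(distances, motifs[1:]):
        result.append(distance_label(d))
        result.append(m)
    return result


# Two sequences with different numeric distances collapse to the same abstraction.
print(abstract_sequence(["x", "y", "y"], [4, 30]))
print(abstract_sequence(["x", "y", "y"], [9, 45]))
```

Under these assumed bounds, both calls yield the ordered collection shown in (5.1), which is the sense in which two distinct sequences share one abstract representation.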
Each sequence of spaced motifs is described in a logic formalism which can be
processed by the ILP system SPADA (Spatial Pattern Discovery Algorithm) [24]
to generate spatial association rules. More precisely, the whole sequence, the
constituent motifs, and the inter-motif distances are represented by distinct constant
symbols.³ Some predicate symbols are introduced in order to express both
properties and relationships. They are:
sequence(t): t is a sequence of spaced motifs;
part_of(t, m): the sequence t contains an occurrence m of a single motif;
is_a(m, x): the occurrence m is a motif x;
distance(m_1, m_2, d): the distance between the occurrences m_1 and m_2 is d.
A sequence is represented by a set of Datalog⁴ ground atoms, where a Datalog
ground atom is an n-ary predicate symbol applied to n constants. For instance, the
sequence of spaced motifs in (5.1) is described by the following set of Datalog ground
atoms:
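\[
\left\{
\begin{array}{l}
\mathit{sequence}(t_2), \\
\mathit{part\_of}(t_2, m_1),\ \mathit{part\_of}(t_2, m_2),\ \mathit{part\_of}(t_2, m_3), \\
\mathit{is\_a}(m_1, x),\ \mathit{is\_a}(m_2, y),\ \mathit{is\_a}(m_3, y), \\
\mathit{distance}(m_1, m_2, \mathit{short}),\ \mathit{distance}(m_2, m_3, \mathit{medium})
\end{array}
\right\}
\tag{5.2}
\]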
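As a rough illustration of this encoding, the following Python sketch turns an abstract sequence of spaced motifs into ground atoms written as strings. It is not SPADA's input routine: the function name `to_ground_atoms` and the occurrence naming scheme (m1, m2, ...) are assumptions made here, and in a database holding several sequences the occurrence constants would need to be unique across sequences.

```python
def to_ground_atoms(seq_id, spaced_motifs):
    """Encode a sequence of spaced motifs as a list of Datalog-style ground atoms.

    seq_id        -- constant naming the sequence, e.g. "t2"
    spaced_motifs -- alternating motifs and distance symbols,
                     e.g. ["x", "short", "y", "medium", "y"]
    """
    motifs = spaced_motifs[0::2]      # items at even positions are motifs
    distances = spaced_motifs[1::2]   # items at odd positions are distance symbols
    # Assumption: occurrences are named m1, m2, ... in order of appearance.
    occurrences = [f"m{i}" for i in range(1, len(motifs) + 1)]

    atoms = [f"sequence({seq_id})"]
    atoms += [f"part_of({seq_id},{m})" for m in occurrences]
    atoms += [f"is_a({m},{x})" for m, x in zip(occurrences, motifs)]
    atoms += [
        f"distance({a},{b},{d})"
        for a, b, d in zip(occurrences, occurrences[1:], distances)
    ]
    return atoms


for atom in to_ground_atoms("t2", ["x", "short", "y", "medium", "y"]):
    print(atom)
```

Applied to the sequence in (5.1), the sketch produces the same nine atoms as (5.2), one per line.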
The set of Datalog ground atoms of all sequences is stored in the extensional part
D_E of a deductive database D. The intensional part D_I of the deductive database
D includes the definition of the domain knowledge in the form of Datalog rules. An
example of Datalog rules is the following:
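\[
\begin{aligned}
\mathit{short\_medium\_distance}(U, V) &\leftarrow \mathit{distance}(U, V, \mathit{short}). \\
\mathit{short\_medium\_distance}(U, V) &\leftarrow \mathit{distance}(U, V, \mathit{medium}).
\end{aligned}
\tag{5.3}
\]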
They state that two motifs⁵ are at a short_medium distance if they are at
either short or medium distance (Fig. 5.1b). Rules in D_I allow additional Datalog
3 We denote constants as strings of lowercase letters possibly followed by subscripts.
4 Datalog is a query language for deductive databases [9].
5 Variables are denoted by uppercase letters possibly followed by subscripts, such as U and V.
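The effect of the two rules in (5.3) can be sketched in Python as a single derivation pass over the extensional facts. This is only an illustration of how the rules read, not a Datalog engine and not SPADA's inference procedure; the tuple encoding of facts and the function name `derive_short_medium` are assumptions made here.

```python
def derive_short_medium(facts):
    """Apply the two rules of (5.3) to a set of ground facts.

    Facts are tuples whose first item is the predicate name, e.g.
    ("distance", "m1", "m2", "short"). Returns the derived facts only.
    """
    derived = set()
    for fact in facts:
        # Each rule fires on a distance/3 fact whose third argument matches.
        if fact[0] == "distance" and fact[3] in ("short", "medium"):
            derived.add(("short_medium_distance", fact[1], fact[2]))
    return derived


# Extensional facts taken from (5.2), plus one hypothetical "large" fact
# (m3, m4) added only to show that it derives nothing.
extensional = {
    ("distance", "m1", "m2", "short"),
    ("distance", "m2", "m3", "medium"),
    ("distance", "m3", "m4", "large"),
}
for fact in sorted(derive_short_medium(extensional)):
    print(fact)
```

Because both rules share the same head, a pair of occurrences is derived once whether it matches the first rule or the second, which is why a set is used for the result.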
 