Database Reference
In-Depth Information
Essentially, this language provides a small set of built-in primitives such as
smiles_file for reading a data file, minimum_frequency for specifying a
minimum support constraint and maximum_frequency for specifying a maximum
support constraint. For each of these primitives, the system knows its
properties, such as (anti-)monotonicity, which ensures that any conjunction
or disjunction of constraints that is written down can be processed by the
system.
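To illustrate why knowing these properties matters, the following is a minimal Python sketch of a level-wise itemset search over a conjunction of a minimum and a maximum frequency constraint. It is not the language or system described above: the function names `frequency` and `mine` and the use of itemsets rather than molecular fragments are assumptions made for illustration. The minimum frequency constraint is anti-monotone, so it can prune the search; the maximum frequency constraint is monotone, so it is only used to filter the output.

```python
from itertools import combinations


def frequency(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine(transactions, min_freq, max_freq):
    """Return {itemset: frequency} with min_freq <= frequency <= max_freq.

    Hypothetical helper; assumes `transactions` is a non-empty list of sets.
    """
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]
    results = {}
    while level:
        frequent = {}
        for candidate in level:
            f = frequency(candidate, transactions)
            # Anti-monotone constraint: if a set is too rare, every superset
            # is too rare as well, so the candidate is dropped for good.
            if f >= min_freq:
                frequent[candidate] = f
        for itemset, f in frequent.items():
            # Monotone constraint: a too-frequent set may still have
            # acceptable supersets, so it only filters the output.
            if f <= max_freq:
                results[itemset] = f
        # Join surviving k-sets into (k+1)-set candidates for the next level.
        keys = list(frequent)
        level = list({a | b for a, b in combinations(keys, 2)
                      if len(a | b) == len(a) + 1})
    return results
```

On a small dataset, an item that is too frequent on its own is excluded from the result, yet it can still appear inside larger itemsets that satisfy both constraints.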
Similar special purpose languages were proposed by several other authors [22, 29];
they differ in the constraints that are supported and the type of patterns that can be
found (itemsets [22, 29], strings [12, 19], ...).
Languages built on SQL. A clear disadvantage of special purpose languages is
that each of them is yet another language that the programmer has to learn. Given that
many datasets are stored in databases, several projects have studied the integration
of constraint-based pattern mining in database systems.
The first class of such methods aims to extend SQL with additional syntax for the
formalization of data mining tasks. One early example is the MINE RULE operator
[21]:
This example mines association rules with minimum support 0.1, confidence 0.2,
limiting the search to items with a price lower than $150, a succinct constraint.
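The task just described can also be sketched in plain Python. This is not MINE RULE syntax and not the statement from [21]; the function `mine_rules`, its parameters and the restriction to rules with a single item in the body and in the head are assumptions made to keep the sketch short. It shows why the price condition is called succinct: it can be decided per item, so the items that violate it are removed before any counting takes place.

```python
from itertools import combinations


def mine_rules(transactions, prices, min_support=0.1, min_confidence=0.2,
               max_price=150):
    """Return (body, head, support, confidence) tuples for one-item rules.

    Hypothetical helper; assumes `transactions` is a non-empty list of sets
    and `prices` maps each item to its price.
    """
    # Succinct constraint: keep only the items that are cheap enough.
    allowed = {item for item, price in prices.items() if price < max_price}
    baskets = [frozenset(t) & allowed for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(itemset <= basket for basket in baskets) / n

    rules = []
    for a, b in combinations(sorted(allowed), 2):
        pair_support = support(frozenset([a, b]))
        # Skip pairs that never occur or that are below the support threshold.
        if pair_support == 0 or pair_support < min_support:
            continue
        for body, head in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset([body]))
            if confidence >= min_confidence:
                rules.append((body, head, pair_support, confidence))
    return rules
```

Because the price filter runs first, an expensive item can never appear in a rule, regardless of how often it is bought.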
Another example is the DMQL language [15]:
In this example we search for association rules related to three specific products, in
those transactions that have a value higher than 100; the parameters of the association
rule discovery process are similar to the previous example. A third example is SPQL
[7].
The advantage of these languages is that well-known syntax can be used for the
expression of constraints. Furthermore, common SQL syntax can be used to specify
the input of the mining task or to process its output further.
At the same time, the programmer still has to learn the additional primitives,
such as the FIND or MINE RULE keywords. An alternative is not to extend the
language at all, but to add mining views to a database [4]. They are virtual