be overwhelming and the result set will probably be full of noise, trivial rules
or otherwise uninteresting associations. Therefore, sophisticated access to
the cache is crucial in order not to overwhelm the analyst.
6.2 Enhancing the Association Mining Framework
Access to the rule cache depends on the actual mining scenario and the concrete
mining questions of the analyst. It therefore must be as flexible as possible in
order to be useful for a wide range of mining problems.
In Section 2 we treated items as atomic literals. This is the common way to
look at items in the context of association rule mining. We found that, although
items are literals from the rule generation point of view, they normally do
have structure, and rule retrieval can benefit greatly from exploiting
this additional information.
Our first step is to break up the atomic items. For example, items in
a supermarket have associated prices and costs. Similarly, production dates,
manufacturers, etc. are assigned to parts of vehicles. These attributes may be
incorporated into the mining run by quantitative association rules [28] or
generalized association rules [18,27], but typically this is not sufficient from
the rule retrieval point of view.
We extend the basic framework of association mining as follows: Let
$I \subseteq \mathbb{N} \times I_{A_1} \times \cdots \times I_{A_m}$ be the
universe of items. Each item is uniquely identified by an ID
$id \in \mathbb{N}$ and described by further attributes
$(a_1, \ldots, a_m) \in I_{A_1} \times \cdots \times I_{A_m}$.
Attributes can be from any desired domain $I_{A_n}$, e.g. prices, costs, dates
or other application-dependent information. We also treat the name of an item
as an attached attribute. Based on this extension, a set of rules is defined as
$R \subseteq \mathcal{P}(I) \times \mathcal{P}(I) \times \mathbb{R} \times \cdots \times \mathbb{R}$.
As usual, in addition to the assumption and the consequence (both subsets of
$I$, i.e. elements of the power set $\mathcal{P}(I)$), each rule is rated by a
constant number of different real-valued quality measures. Adding attributes
to the items in this way does not affect the mining procedure but nevertheless
introduces a new means to formulate practically important mining queries.
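The extended definitions can be illustrated with a minimal Python sketch. The class names Item and Rule, the field names, and the sample attribute values are assumptions made for illustration only; the framework itself prescribes no particular representation.

```python
from dataclasses import dataclass, field
from typing import Any, FrozenSet, Mapping


@dataclass(frozen=True)
class Item:
    """An item: a unique integer ID plus attributes a_1, ..., a_m."""
    id: int
    # Attributes (name, price, cost, ...) describe the item but do not
    # identify it, so they are excluded from equality and hashing.
    attributes: Mapping[str, Any] = field(default_factory=dict, compare=False)


@dataclass(frozen=True)
class Rule:
    """A rule: assumption, consequence and real-valued quality measures."""
    assumption: FrozenSet[Item]
    consequence: FrozenSet[Item]
    measures: Mapping[str, float] = field(default_factory=dict, compare=False)


# Hypothetical supermarket items; all values are made up.
bread = Item(1, {"name": "bread", "price": 2.49, "cost": 1.10})
butter = Item(2, {"name": "butter", "price": 1.99, "cost": 0.95})

rule = Rule(
    assumption=frozenset({bread}),
    consequence=frozenset({butter}),
    measures={"confidence": 0.90, "lift": 16.0},
)
```

Because items compare by ID only, the mining procedure can keep treating them as literals, while rule retrieval can still inspect the attached attributes.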
6.3 Rule Retrieval
In this paper we describe a simple mining query language that
demonstrates the potential of our approach. Our language is useful on its
own but can also be integrated into universal environments as described in [11,
20,21,23].
Each query consists of the keyword SelectRulesFrom followed by the name of
the rule cache to be accessed. The subsequent Where clause specifies
restrictions on the rules to be retrieved. The basic query simply restricts the
result set by thresholds on rule quality measures. For example, we may want the
system to return all rules from the cache rulecache that have confidence above
85% and lift of at least 15.
SelectRulesFrom rulecache
Where confidence > 0.85 and lift >= 15;
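The effect of this basic query can be sketched in Python as a plain filter over the cached rules. The dictionary layout of a rule, the function name select_rules, and the sample values are assumptions made for illustration; they are not part of the query language itself.

```python
from typing import Dict, Iterable, List

# Hypothetical representation: each cached rule is a dictionary that
# holds its quality measures under the keys "confidence" and "lift".
CachedRule = Dict[str, float]


def select_rules(cache: Iterable[CachedRule],
                 confidence_above: float = 0.85,
                 lift_at_least: float = 15.0) -> List[CachedRule]:
    """Return the rules with confidence > threshold and lift >= threshold."""
    return [
        rule for rule in cache
        if rule["confidence"] > confidence_above
        and rule["lift"] >= lift_at_least
    ]


# Made-up cache contents used only to exercise the filter.
rulecache = [
    {"id": 1, "confidence": 0.90, "lift": 15.0},  # satisfies both conditions
    {"id": 2, "confidence": 0.85, "lift": 20.0},  # confidence is not above 0.85
    {"id": 3, "confidence": 0.95, "lift": 14.9},  # lift is below 15
]

selected = select_rules(rulecache)
```

Note the asymmetry taken from the query: the confidence bound is strict (>), whereas the lift bound is inclusive (>=).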