Database Reference
In-Depth Information
Table 1. OPUS IR Filter
Algorithm: OPUS IR Filter(Current, Available, M )
1. SoFar :=
2. FOR EACH P in Available
2.1 New := Current P
2.2 IF New satisfies all the prunable constraints in M except the nontrivial [8]
constraint THEN
2.2.1 IF any direct subset of New has the same coverage as New THEN
New relevant stats is a trivial rule
Any superset of New is trivial, so do not access any children of this node,
go to step 2.
2.2.2 ELSE IF the mean of New relevant stats is significantly higher than all its
direct parents THEN
IF the rule satisfies all the other non-prunable constraints in M
THEN record Rule to the ordered rule list
OPUS IR Filter(New, SoFar, M )
SoFar := SoFar P
2.2.3 END IF
2.3 END IF
3. END FOR
M
: is a set of constraints. There are two types of constraints prunable and
unprunable constraints . Prunable constraints are constraints that you
can derive useful bounds for search space pruning and still ensures the
completeness of information. Examples include the anti-monotone, the
succinct constraints [7], or the convertible constraints [9]. Constraints
which are not prunable are unprunable constraints
λ
{X → Y }×{D}→R
:
is a function from rules and databases to value and
define a interestingness metric such that the greater the value of
X →
Y,D ) the greater the interestingness of this rule given the database.
λ
(
I
: is the set of impact rules that can be derived from
D
, whose antecedents
are conjunctions of one or more conditions in
C
, whose targets are mem-
bers of
T
, and which satisfy the constraints in
M
.
k
: is a user specified integer number denoting the number of rules in the
ultimate solution for this task.
The original algorithm for impact rule discovery with filters are described
in table 1. In this table, current is the set of conditions, whose supersets are
currently being explored. Available is the set of conditions that may be added
to current . By adding every condition in
available
to
current
one by one, we
form the antecedent of the current rule :
New → target
, which will be referred to
later as
current rules
. Rule list is an ordered list of the top-k interesting rules
we have encountered.
5
Ecient Identification of Exploratory Rule Significance
5.1
Deriving Difference Set Statistics Without Data Access
According to the algorithm in table 1 and definition 1, we have to compare the
mean of current rule with the means of all its direct parents' in order to decide
whether a rule is significant or not. The set difference operations necessary for
 
Search WWH ::




Custom Search