Database Reference
In-Depth Information
Fig. 1. The parallel intersection Approach for ABCDE
Note: in this proof, mean ( A → target ) denotes the target mean of the records covered
by rule A → target, variance ( A → target ) denotes the target variance of the records
covered by rule A → target,whilemean ( R ) denotes the target mean of the records in
record set R,andvariance ( R ) represents the target variance of the records in R.
By deriving the difference set statistics from the statistics of the
parent rule
and
in table 1, we are able to save data access and computation
for collecting the statistics for performing the significance test, thus improve the
eciency of the search algorithm.
New → target
5.2
The Circular Intersection Approach
Parallel Intersection Approach. According to the definition of significant
impact rules, we compare the current rule with all its direct parents to identify its
significance. In theoriginalOPUSIR Filter algorithm, the procedure described
in figure 1 is employed to find the
of every direct parent of the current
rule which is being explored. Each arrow in figure 1 represents an intersection
operation. When deciding whether a rule with 5 conditions, namely
coverset
A
,
B
,
C
,
D
and
on the antecedent is significant or not, the algorithm has to go through
16 intersection operations! We refer to this approach as the parallel intersection
approach.
By examining figure 1, we notice that there are considerable overlaps in
the parallel intersection approach . For example, by using the parallel inter-
section approach, we have to do the same intersection of
E
coverset
A
(
)and
coverset
B
coverset
ABCD
coverset
ABCE
(
) three times, when searching for
(
),
(
)
coverset
ABDE
and
(
). There must be a way in which two of these operations
can be omitted.
Circular Intersection Approach. we propose the approach of circular inter-
section which is shown in figure 2 3 . In this approach, intersections are done in
two stages. Firstly, in the forward stage , intersections are done from condition
A to condition E
one at a time, and the results are kept in memory. Then we
3
Each dashed arrow in figure 2 and figure 3 points to the outcome of that specific
intersection operation and does not represent an actual operation.
Search WWH ::




Custom Search