Customers who buy beer and sausage also tend to buy hamburger, with {confidence = 0.7} in {support = 0.2}
Customers who buy strawberries also tend to buy whipped cream, with {confidence = 0.8} in {support = 0.15}
Fig. 3. Association rules
and to distinguish between a cause and a concomitant effect.” The issue of
causal ordering is also often of importance to those modeling causality in data
discovery.
Data mining analyzes previously collected, non-experimental data. There are several different data mining products. The most common are conditional rules and association rules. Conditional rules are most often drawn from induced trees, while association rules are most often learned from tabular data.
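As a minimal sketch of how the support and confidence figures attached to an association rule can be derived from tabular data, the following Python computes both for one rule. The baskets are invented for illustration and do not reproduce the values in Fig. 3. The function names `support` and `confidence` are assumptions of this sketch, and support is taken here to mean the fraction of all transactions that contain every item in the rule.

```python
from typing import FrozenSet, Iterable, Sequence

Basket = FrozenSet[str]


def support(baskets: Sequence[Basket], items: Iterable[str]) -> float:
    """Fraction of baskets that contain every item in `items`."""
    wanted = frozenset(items)
    hits = sum(1 for basket in baskets if wanted <= basket)
    return hits / len(baskets)


def confidence(baskets: Sequence[Basket],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Among baskets containing the antecedent, the fraction that also
    contain the consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    antecedent_support = support(baskets, antecedent)
    if antecedent_support == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(baskets, antecedent | consequent) / antecedent_support


# Hypothetical baskets, made up for this example only.
baskets = [
    frozenset({"beer", "sausage", "hamburger"}),
    frozenset({"beer", "sausage", "hamburger"}),
    frozenset({"beer", "sausage"}),
    frozenset({"beer", "chips"}),
    frozenset({"strawberries", "whipped cream"}),
    frozenset({"milk", "bread"}),
    frozenset({"milk", "bread", "hamburger"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
    frozenset({"chips"}),
]

# Rule: {beer, sausage} -> {hamburger}
rule_support = support(baskets, {"beer", "sausage", "hamburger"})
rule_confidence = confidence(baskets, {"beer", "sausage"}, {"hamburger"})
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Both numbers are simple counts of joint occurrence. Nothing in the computation looks at why the items appear together.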
At first glance, association rules (Fig. 3) seem to imply a causal or cause-effect relationship. That is: A customer's purchase of both sausage and beer causes the customer to also buy hamburger.
But all that is discovered is the existence of a statistical relationship between the items. They have a degree of joint occurrence. The nature of the relationship is not identified. It is not known whether the presence of an item or set of items causes the presence of another item or set of items, or whether some other phenomenon causes them to jointly occur.
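The possibility that some other phenomenon causes items to jointly occur can be illustrated with a small, fully hypothetical table of shopper counts. In this sketch, an unrecorded factor (whether the shopper is planning a barbecue) drives both sausage and hamburger purchases. The factor, the counts, and the helper `hamburger_rate` are all assumptions made up for illustration.

```python
# Keys: (barbecue_planned, bought_sausage, bought_hamburger) -> shopper count.
# The counts are invented so that, within each barbecue group, sausage and
# hamburger purchases are independent of each other.
counts = {
    (True, True, True): 64,
    (True, True, False): 16,
    (True, False, True): 16,
    (True, False, False): 4,
    (False, True, True): 1,
    (False, True, False): 9,
    (False, False, True): 9,
    (False, False, False): 81,
}


def hamburger_rate(counts, sausage=None, barbecue=None):
    """Share of shoppers who bought hamburger among those matching the
    given filters. A filter left as None is not applied."""
    matching = 0
    buyers = 0
    for (bbq, saus, ham), n in counts.items():
        if sausage is not None and saus != sausage:
            continue
        if barbecue is not None and bbq != barbecue:
            continue
        matching += n
        if ham:
            buyers += n
    return buyers / matching


# Ignoring the hidden factor: sausage buyers look far more likely to buy
# hamburger than shoppers in general.
overall_base = hamburger_rate(counts)
overall_given_sausage = hamburger_rate(counts, sausage=True)
print(f"all shoppers: {overall_base:.2f}, "
      f"sausage buyers: {overall_given_sausage:.2f}")

# Holding the hidden factor fixed: buying sausage makes no difference.
for planned in (True, False):
    base = hamburger_rate(counts, barbecue=planned)
    given_sausage = hamburger_rate(counts, sausage=True, barbecue=planned)
    print(f"barbecue={planned}: base {base:.2f}, "
          f"given sausage {given_sausage:.2f}")
```

A rule mined from the pooled data would report a strong sausage-hamburger association, yet in this constructed example a sausage promotion would not change hamburger sales, because the association comes entirely from the shared driver.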
The information does not have good decision value unless the degree of causality is known. Purely accidental relationships do not have the same decision value as causal relationships. For example,
IF it is true that buying both beer and sausage somehow causes someone to buy hamburger,
• Then: A merchant might profitably put beer (or the likewise associated sausage) on sale
• And at the same time: Increase the price of hamburger to compensate for the sale price.
On the other hand, knowing that bread and milk are often purchased together may not be useful information, as both products are commonly purchased on every store visit.
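One way to see why the bread and milk rule carries little information is to compare the rule's confidence with how often milk is bought anyway. The sketch below does this on invented baskets in which both items are very common. The ratio it computes is commonly called lift. That measure is not named in the text above and is introduced here only as an illustrative assumption.

```python
# Hypothetical baskets: bread and milk each appear in 90 of 100 baskets.
baskets = (
    [frozenset({"bread", "milk"})] * 81
    + [frozenset({"bread"})] * 9
    + [frozenset({"milk"})] * 9
    + [frozenset()] * 1
)


def frequency(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    wanted = frozenset(items)
    return sum(1 for basket in baskets if wanted <= basket) / len(baskets)


# Rule: {bread} -> {milk}
rule_support = frequency(baskets, {"bread", "milk"})
rule_confidence = rule_support / frequency(baskets, {"bread"})

# How often milk is bought regardless of bread.
milk_baseline = frequency(baskets, {"milk"})

# A ratio near 1 means that knowing about the bread purchase adds
# essentially nothing to the prediction of a milk purchase.
lift = rule_confidence / milk_baseline

print(f"support = {rule_support:.2f}")
print(f"confidence = {rule_confidence:.2f}")
print(f"milk baseline = {milk_baseline:.2f}")
print(f"confidence / baseline = {lift:.2f}")
```

In this constructed data the rule has both high support and high confidence, but its confidence is no better than the baseline rate of buying milk, so the rule would not help a merchant decide anything.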