The support: This assesses the rule's coverage or ''how many records the rule
constitutes.'' It denotes the percentage of records that match the antecedents.
The confidence: This assesses the strength and the predictive ability of the
rule. It indicates ''how likely the consequent is, given the antecedents.'' It denotes
the consequent percentage or probability, within the records that match the
antecedents.
The lift: This assesses the improvement in the predictive ability when using
the derived rule compared to randomness. It is defined as the ratio of the rule
confidence to the prior confidence of the consequent. The prior confidence is
the overall percentage of the consequent within all the analyzed records.
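The three measures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the five toy baskets are hypothetical and do not come from the text. Support is computed as defined here, as the share of records matching the antecedents; some tools instead define support as the share of records matching both antecedents and consequent.

```python
from typing import List, Set


def support(baskets: List[Set[str]], antecedents: Set[str]) -> float:
    """Share of all baskets that contain every antecedent item."""
    matching = [b for b in baskets if antecedents <= b]
    return len(matching) / len(baskets)


def confidence(baskets: List[Set[str]], antecedents: Set[str], consequent: str) -> float:
    """Share of antecedent-matching baskets that also contain the consequent."""
    matching = [b for b in baskets if antecedents <= b]
    if not matching:
        return 0.0
    return sum(consequent in b for b in matching) / len(matching)


def prior_confidence(baskets: List[Set[str]], consequent: str) -> float:
    """Share of all baskets that contain the consequent."""
    return sum(consequent in b for b in baskets) / len(baskets)


def lift(baskets: List[Set[str]], antecedents: Set[str], consequent: str) -> float:
    """Rule confidence divided by prior confidence.

    Assumes the consequent occurs in at least one basket.
    """
    return confidence(baskets, antecedents, consequent) / prior_confidence(baskets, consequent)


# Hypothetical toy data, invented for illustration only.
baskets = [
    {"tea", "milk"},
    {"tea", "milk"},
    {"tea"},
    {"rice"},
    {"milk", "rice"},
]

print(support(baskets, {"tea"}))             # coverage of the rule tea -> milk
print(confidence(baskets, {"tea"}, "milk"))  # strength of the rule
print(lift(baskets, {"tea"}, "milk"))        # improvement over the prior
```

A lift above 1 means the rule does better than the prior confidence of the consequent; a lift below 1 means it does worse.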
In the presented example, Rule 2 associates product 1 with product 4 at a
confidence of 71.4%. In plain English, it states that 71.4% of the baskets containing
product 1, which is the antecedent, also contain product 4, the consequent.
Additionally, the baskets containing product 1 comprise 77.8% of all the baskets
analyzed. This measure is the support of the rule. Since six out of the nine total
baskets contain product 4, the prior confidence of a basket containing product 4 is
6/9 or 66.7%, slightly lower than the rule confidence. Specifically, Rule 2 outperforms
randomness: its confidence is about 7% higher, in relative terms, than the prior
confidence, giving a lift of 1.07 (71.4% / 66.7%). Thus,
by using the rule, the chances of correctly identifying a product 4 purchase are
improved by 7%.
Rule 4 is more complicated since it contains two antecedents. It has a lower
coverage (44.4%) but yields a higher confidence (75%) and lift (1.13). In plain
English this rule states that baskets with products 1 and 3 present a strong chance
(75%) of also containing product 4. Thus, there is a business opportunity to
promote product 4 to all customers who check out with products 1 and 3 and have
not bought product 4.
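The figures quoted for Rules 2 and 4 can be checked with simple arithmetic. The sketch below assumes basket counts inferred from the stated percentages rather than taken from the original basket table, which is not reproduced in this section: 7 of the 9 baskets contain product 1 and 5 of those also contain product 4; 4 baskets contain products 1 and 3 and 3 of those also contain product 4. The function name `rule_metrics` is hypothetical.

```python
from typing import Tuple


def rule_metrics(n_total: int, n_antecedent: int, n_both: int,
                 n_consequent: int) -> Tuple[float, float, float, float]:
    """Return (support, confidence, prior confidence, lift) from raw counts."""
    support = n_antecedent / n_total        # share of baskets matching the antecedents
    confidence = n_both / n_antecedent      # consequent share within those baskets
    prior = n_consequent / n_total          # consequent share within all baskets
    lift = confidence / prior
    return support, confidence, prior, lift


# Counts inferred from the percentages in the text (an assumption, see above).
rule2 = rule_metrics(n_total=9, n_antecedent=7, n_both=5, n_consequent=6)
rule4 = rule_metrics(n_total=9, n_antecedent=4, n_both=3, n_consequent=6)

for name, (s, c, p, l) in (("Rule 2", rule2), ("Rule 4", rule4)):
    print(f"{name}: support={s:.1%} confidence={c:.1%} prior={p:.1%} lift={l:.3f}")
```

Under these counts the lift of Rule 4 is 0.75 / (6/9) = 1.125, which the text reports rounded as 1.13.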
The rule development procedure can be controlled according to model
parameters that analysts can specify. Specifically, analysts can define in advance
the required threshold values for rule complexity, support, confidence, and lift in
order to guide the rule growth process according to their specific requirements.
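As a rough illustration of such thresholds, the sketch below filters already-derived rules against minimum support, confidence, and lift and a maximum number of antecedents. It is a simplification: real association algorithms typically apply these limits while rules are being grown, not afterwards. The `Rule` class, the `filter_rules` function, and the threshold values are all hypothetical; the two rules carry the figures quoted above for Rules 2 and 4.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Rule:
    antecedents: FrozenSet[str]
    consequent: str
    support: float      # share of records matching the antecedents
    confidence: float
    lift: float


def filter_rules(rules: List[Rule], min_support: float = 0.0,
                 min_confidence: float = 0.0, min_lift: float = 1.0,
                 max_antecedents: int = 2) -> List[Rule]:
    """Keep only rules that satisfy every analyst-defined threshold."""
    return [
        r for r in rules
        if len(r.antecedents) <= max_antecedents   # rule complexity
        and r.support >= min_support
        and r.confidence >= min_confidence
        and r.lift >= min_lift
    ]


rules = [
    Rule(frozenset({"product 1"}), "product 4", 0.778, 0.714, 1.07),               # Rule 2
    Rule(frozenset({"product 1", "product 3"}), "product 4", 0.444, 0.750, 1.13),  # Rule 4
]

# Hypothetical thresholds chosen only to show the effect of the filter.
kept = filter_rules(rules, min_support=0.40, min_confidence=0.72,
                    min_lift=1.0, max_antecedents=2)
for r in kept:
    print(sorted(r.antecedents), "->", r.consequent)
```

With these illustrative thresholds only Rule 4 survives, because Rule 2 falls just below the required confidence of 72%.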
Unlike decision trees, association models generate rules that overlap. Therefore,
multiple rules may apply to each customer. Rules applicable to each customer
are then sorted according to a selected performance measure, for instance lift or
confidence, and a specified number n of rules, for instance the top three,
are retained. The retained rules indicate the top n product suggestions, currently
not in the basket, that best match each customer's profile. In this way, association
models can help in cross-selling activities as they can provide specialized product
recommendations for each customer. As in every data mining task, derived rules
should also be evaluated with respect to their business meaning and ''actionability''
before deployment.
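The scoring step described above, selecting the applicable rules, ranking them, and keeping the top n suggestions not already in the basket, might look like the following minimal sketch. The `Rule` class and `recommend` function are hypothetical names. The first two rules use the figures quoted for Rules 2 and 4; the third rule and its numbers are invented so that the ranking has more than one product to order.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Rule:
    antecedents: FrozenSet[str]
    consequent: str
    confidence: float
    lift: float


def recommend(basket: Set[str], rules: List[Rule], n: int = 3,
              measure: str = "lift") -> List[str]:
    """Return up to n distinct products suggested by the best applicable rules."""
    # A rule applies when all its antecedents are in the basket
    # and its consequent has not been bought yet.
    applicable = [r for r in rules
                  if r.antecedents <= basket and r.consequent not in basket]
    applicable.sort(key=lambda r: getattr(r, measure), reverse=True)

    suggestions: List[str] = []
    for r in applicable:
        if r.consequent not in suggestions:   # overlapping rules may repeat a product
            suggestions.append(r.consequent)
        if len(suggestions) == n:
            break
    return suggestions


rules = [
    Rule(frozenset({"product 1"}), "product 4", confidence=0.714, lift=1.07),               # Rule 2
    Rule(frozenset({"product 1", "product 3"}), "product 4", confidence=0.750, lift=1.13),  # Rule 4
    Rule(frozenset({"product 1"}), "product 2", confidence=0.500, lift=1.20),               # invented rule
]

print(recommend({"product 1", "product 3"}, rules, n=3, measure="lift"))
print(recommend({"product 1", "product 3"}, rules, n=3, measure="confidence"))
```

The choice of measure changes the ordering: ranking by lift favors products that the rule predicts much better than chance, while ranking by confidence favors products that are simply likely to be bought.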