the number of Itemsets in the sample set does the picture become a bit
more clear:
• Milk and no cookies
• Cookies and no milk
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
• Milk and cookies
The increased sample set allows the large number of milk and cookies Itemsets
to demonstrate the true affinity between milk and cookies. In this way, the
milk and no cookies and cookies and no milk Itemsets decrease in significance
as the volume of data grows. Interestingly, they also prevent milk and
cookies from ever achieving a perfect affinity, that is, 100% correlation.
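As a rough illustration of the paragraph above, the following Python sketch scores the thirteen listed Itemsets. The function name `affinity` and the choice of measure (the share of Itemsets containing either object that contain both, i.e., the Jaccard index) are assumptions made for this example, not a measure prescribed by the text. Treating the first three Itemsets as the "small" sample is likewise an assumption.

```python
from fractions import Fraction

# The thirteen Itemsets listed above: one milk-only, one cookies-only,
# and eleven that contain both objects.
itemsets = (
    [{"milk"}]
    + [{"cookies"}]
    + [{"milk", "cookies"}] * 11
)

def affinity(itemsets, a, b):
    """Hypothetical measure: share of Itemsets containing a or b that contain both."""
    both = sum(1 for s in itemsets if a in s and b in s)
    either = sum(1 for s in itemsets if a in s or b in s)
    return Fraction(both, either)

small = affinity(itemsets[:3], "milk", "cookies")  # first three Itemsets only
full = affinity(itemsets, "milk", "cookies")       # all thirteen Itemsets
print(small, full)  # 1/3 11/13
```

Under this measure, the two single-object Itemsets stay in the denominator no matter how many milk and cookies Itemsets are added, so the score approaches but never reaches 1.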
The same illustration can be drawn for two objects that are substitutes.
For example, fifteen Itemsets wherein the two objects occur simultaneously
would seem to indicate that those two objects are complements.
But then another fifteen thousand Itemsets wherein the two objects
occur exclusively of each other indicate that the two objects are actually
substitutes. Likewise, an independent object when investigated myopically
may seem to have an affinity for a specific object. Only when the
scope of that investigation is expanded do you realize that the affinity
you discovered was between the time of day and the second object. The
first object was maintaining its independence while the second object
was increasing its occurrence during that time of day. For the reasons
shown in these examples, a large sample set can be expected to generate
conclusions that are more significant than those generated by a small
sample set.
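The substitutes example can be sketched with the same kind of arithmetic. The helper `co_occurrence_share` is a hypothetical name, and the even split of the fifteen thousand exclusive Itemsets between the two objects is an assumption (the text gives only the total, and the result does not depend on the split).

```python
def co_occurrence_share(both, a_only, b_only):
    """Hypothetical measure: fraction of Itemsets in which both objects appear."""
    total = both + a_only + b_only
    return both / total

# Small sample: only the fifteen Itemsets in which the objects occur together.
small_sample = co_occurrence_share(both=15, a_only=0, b_only=0)

# Larger sample: the same fifteen plus fifteen thousand exclusive Itemsets
# (assumed here to be split evenly between the two objects).
large_sample = co_occurrence_share(both=15, a_only=7500, b_only=7500)

print(f"{small_sample:.4f} {large_sample:.4f}")  # 1.0000 0.0010
```

Viewed alone, the fifteen Itemsets suggest the objects always travel together; within the larger sample they co-occur in roughly one Itemset per thousand, which is consistent with the text's reading of them as substitutes.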
The data gathered for Market Basket Analysis can be “large” in multiple
ways. These manifestations of a large sample set include the following: