An Introduction to Frequent Pattern Mining - Frequent Pattern Mining


patterns. In general, association rules can be considered a “second-stage” output that is derived from frequent patterns. Consider the sets of items U and V. The rule U ⇒ V is considered an association rule at minimum support s and minimum confidence c when the following two conditions hold true:

1. The set U ∪ V is a frequent pattern.

2. The ratio of the support of U ∪ V to that of U is at least c.
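The two conditions can be illustrated with a minimal sketch. It assumes that transactions are stored as Python sets and that support is measured as a fraction of the database. The function names `support` and `is_association_rule`, the toy data, and the thresholds are hypothetical and chosen only for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def is_association_rule(u, v, transactions, min_support, min_confidence):
    """Check both conditions for the rule U => V."""
    sup_uv = support(set(u) | set(v), transactions)
    sup_u = support(u, transactions)
    # Condition 1: U ∪ V must be a frequent pattern.
    if sup_uv < min_support or sup_u == 0:
        return False
    # Condition 2: support(U ∪ V) / support(U) must be at least c.
    return sup_uv / sup_u >= min_confidence


# Hypothetical toy database (not from the source).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

print(support({"bread", "milk"}, transactions))
print(is_association_rule({"bread"}, {"milk"}, transactions, 0.5, 0.6))
```

In this toy database, {bread, milk} appears in two of the four transactions and {bread} in three, so the confidence of bread ⇒ milk is 2/3.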

The minimum confidence c is always a fraction of at most 1, because the support of the set U ∪ V can never exceed that of U: every transaction that contains U ∪ V also contains U. Because the first step of finding frequent patterns is usually the computationally more challenging one, most of the research in this area is focussed on the former. Nevertheless, some computational and modeling issues also arise during the second step, especially when the frequent pattern mining problem is used in the context of other data mining problems such as classification. Therefore, this topic will also discuss various aspects of association rule mining along with that of frequent pattern mining.
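The two-step structure described above can be sketched as follows. This is a brute-force illustration under stated assumptions, not one of the algorithms from the literature: the first step enumerates candidate itemsets level by level and stops once a level contains no frequent itemset, and the second step splits each frequent pattern into U ⇒ V. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: brute-force search for itemsets with support >= min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            sup = sum(1 for t in transactions if cset <= t) / n
            if sup >= min_support:
                frequent[cset] = sup
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return frequent


def rules_from_patterns(frequent, min_confidence):
    """Step 2: derive rules U => V from each frequent pattern U ∪ V."""
    rules = []
    for pattern, sup_uv in frequent.items():
        for k in range(1, len(pattern)):
            for u in combinations(sorted(pattern), k):
                u = frozenset(u)
                # Every subset of a frequent pattern is frequent, so the
                # support of U was already recorded in step 1.
                conf = sup_uv / frequent[u]
                if conf >= min_confidence:
                    rules.append((u, pattern - u, conf))
    return rules


# Hypothetical toy database (not from the source).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

patterns = frequent_itemsets(transactions, min_support=0.5)
rules = rules_from_patterns(patterns, min_confidence=0.9)
print(rules)
```

Note that the second step only reads supports already computed in the first step, which is one reason the first step tends to dominate the cost.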

A related problem is that of sequential pattern mining in which an order is present

in the transactions [ 5 ]. Temporal order is quite natural in many scenarios such as

customer buying behavior, because the items are bought at specific time stamps, and

often follow a natural temporal order. In these cases, the problem is redefined to

that of sequential pattern mining, in which it is desirable to determine relevant and

frequent sequences of items.
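To make the role of order concrete, the following minimal sketch counts the support of an ordered pattern. It makes a simplifying assumption that each customer history is a plain list with one item per time stamp and that gaps between matched items are allowed. The function names and the data are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    it = iter(sequence)
    # Each membership test consumes the iterator up to the match, so the
    # next item of the pattern must appear later in the sequence.
    return all(item in it for item in pattern)


def sequence_support(pattern, sequences):
    """Fraction of sequences that contain pattern as an ordered subsequence."""
    return sum(1 for s in sequences if is_subsequence(pattern, s)) / len(sequences)


# Hypothetical purchase histories, ordered by time stamp (not from the source).
sequences = [
    ["phone", "case", "charger"],
    ["case", "phone"],
    ["phone", "charger"],
    ["charger", "phone", "case"],
]

print(sequence_support(["phone", "case"], sequences))
print(sequence_support(["case", "phone"], sequences))
```

Unlike in the itemset case, the patterns (phone, case) and (case, phone) are counted separately, because the order of purchase matters.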

Some examples of important applications are as follows:

• Customer Transaction Analysis: In this case, the transactions represent sets of items that co-occur in customer buying behavior. It is desirable to determine frequent patterns of buying behavior, because they can be used for making decisions about shelf stocking or recommendations.

• Other Data Mining Problems: Frequent pattern mining can be used to enable other major data mining problems such as classification, clustering and outlier analysis [ 11 , 52 , 73 ]. This is because the use of frequent patterns is so fundamental in the analytical process for a host of data mining problems.

• Web Mining: In this case, the Web logs may be processed in order to determine important patterns in the browsing behavior [ 24 , 63 ]. This information can be used for Web site design, recommendations, or even outlier analysis.

• Software Bug Analysis: Executions of software programs can be represented as graphs with typical patterns. Logical errors in these programs often show up as specific kinds of patterns that can be mined for further analysis [ 41 , 51 ].

• Chemical and Biological Analysis: Chemical and biological data are often represented as graphs and sequences. A number of methods have been proposed in the literature for using the frequent patterns in such graphs for a wide variety of applications in different scenarios [ 8 , 29 , 41 , 42 , 69 - 75 ].

Since the publication of the original article on frequent pattern mining [ 10 ], numerous

techniques have been proposed both for frequent and sequential pattern mining [ 5 ,

4 , 13 , 33 , 62 ]. Furthermore, many variants of frequent pattern mining, such as
