Information Technology Reference
In-Depth Information
The first approach to this problem was given by the KMP string matching
algorithm [14], where the aim is to find a short pattern in a long text. The concept of
meta-pattern was presented in order to enable the representation of a set of basic
patterns in a concise way [22]. The meta-pattern model works in a structured
framework similar to a grammar structure, where the meta-pattern may be composed
of patterns or meta-patterns.
Sequence Mining Discovery involves a sequence of steps in time that can be
exemplified with a customer transaction event log. The frequent temporal pattern
might express buying patterns that many customers exhibit. Most of the Frequent
Sequence Mining algorithms use lattice structures in the search space. These
algorithms include for the breadth-first search, the AprioriAll algorithm [4] and the
GSP (Generalized Sequential Pattern) [19] and for the depth-first search the SPADE
(Sequential PAttern Discovery using Equivalence classes) [23].
Another approach to discover temporal patterns in sequential data is the frequent
episode discovery presented by [15]. In this sequential pattern framework, a set of
events is given and the aim is to discover frequent sequences, or frequent episodes,
that occur in the timeline. The data referred here as an event sequence are denoted by
E={(e(1),t(1)), (e(2),t(2)), …(e(n), t(n))} where e(i) takes values from an event-type
set and t(i) is an integer denoting the time stamp of the i th event.
The Markov chain is an acyclic graph with a set of states associated with a set of
transitions between states. In the market basket each state corresponds to an item and
in the user web navigation each state is a page. Markov models have been used to
represent and analyze users' web navigation data [5]. At each time interval the system
can change from a current state to the next state. The transition between states is
quantified using probabilities. For each state in the transition matrix P(i,j) is the
probability of moving from state i to state j in one step, and the sum of the
probabilities of each state is equal to one. The first-order chain, is usually represented
by a square matrix of the probabilities of the transitions from the current states to the
following states. A second-order Markov chain takes into account the previous and
the current state probability of the transition to the next state. The nth-order matrix,
with n
1, grows into new dimensions ceasing to be square, as the first-order matrix.
The higher-order chains consider sets of previous states leading to ordered sequences
of states, while expanding the dimension of the original matrix and also the
complexity of the problem.
>
2.2
Process Mining
Process Mining [1] is a recent approach that aims to create a bridge between Data
Mining and Business Process Modelling (BPM) by discovering paths in event log
data.
The Petri nets technique is an established tool for modelling processes that is often
used in Process Mining. Petri nets like the Markov chain allows an aggregate view of
the process workflow.
The Process Mining Manifesto [3] is supported by 53 organizations, 77 expert
consultants and is available in 13 languages. The manifesto presents six guiding
principles and eleven challenges, clarifying the aim of this new concept.
Search WWH ::




Custom Search