(2) Fraud Detection — Oxford English Dictionary defines fraud as “An
act or instance of deception, an artifice by which the right or interest
of another is injured, a dishonest trick or stratagem.” Fraud detection
aims to identify fraud as quickly as possible once it has been
perpetrated.
(3) Churn Detection — This application helps sellers to identify customers
with a higher probability of leaving and potentially moving to a
competitor. By identifying these customers in advance, the company
can act to prevent churning (for example, by offering a better deal to the
consumer).
Each application is built by accomplishing one or more machine
learning tasks. The second layer of our four-layer model is dedicated to
machine learning tasks such as classification, clustering, anomaly
detection, and regression. Each machine learning task can be accomplished
by various machine learning models, as indicated in the third layer. For
example, the classification task can be accomplished by either of the
following two models: decision trees or artificial neural networks. In turn,
each model can be induced from the training data using various learning
algorithms. For example, a decision tree can be built using either the C4.5
algorithm or the CART algorithm, both of which are described in the
following chapters.
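The relationship between tasks, models, and learning algorithms can be sketched as a pair of lookup tables. The following is a minimal illustration only: the names TASK_TO_MODELS, MODEL_TO_ALGORITHMS, and learning_options are hypothetical, and the tables contain just the entries named in the text above, so they are far from a complete catalogue.

```python
# Illustrative sketch of the task -> model -> algorithm layers.
# Only the examples mentioned in the text are filled in; the table
# names and the helper function are assumptions made for illustration.
TASK_TO_MODELS = {
    "classification": ["decision tree", "artificial neural network"],
}

MODEL_TO_ALGORITHMS = {
    "decision tree": ["C4.5", "CART"],
}


def learning_options(task):
    """Return (model, algorithm) pairs that could accomplish a task.

    A model with no algorithm listed in MODEL_TO_ALGORITHMS is
    returned as (model, None).
    """
    options = []
    for model in TASK_TO_MODELS.get(task, []):
        algorithms = MODEL_TO_ALGORITHMS.get(model, [])
        if not algorithms:
            options.append((model, None))
        for algorithm in algorithms:
            options.append((model, algorithm))
    return options


print(learning_options("classification"))
```

Each pair returned is one possible route from a task down to a concrete learning algorithm. The artificial neural network entry has no algorithm attached simply because the text above does not name one.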
1.4 Knowledge Discovery in Databases (KDD)
The KDD process was defined by Fayyad et al. (1996) as “the nontrivial process
of identifying valid, novel, potentially useful, and ultimately understandable
patterns in data.” Friedman (1997a) considers the KDD process to be an
automatic exploratory data analysis of large databases. Hand (1998) views
it as a secondary data analysis of large databases. The term “secondary”
emphasizes the fact that the primary purpose of the database was not data
analysis. Data mining can be considered the central step of the overall
KDD process. Because of this centrality, some researchers and practitioners
use the term “data mining” as a synonym for the complete KDD process.
Several researchers, such as Brachman and Anand (1994), Fayyad
et al. (1996), and Reinartz (2002), have proposed different ways of dividing
the KDD process into phases. This topic adopts a hybrid of these
proposals and breaks the KDD process into nine steps, as
presented in Figure 1.2. Note that the process is iterative at each step,
which means that going back to adjust previous steps may be necessary. The