An Overview of Data Mining Techniques - Data Mining Techniques in CRM: Inside Customer Segmentation

in use. The existing list may not cover all types of potential fraud and may
need to be supplemented with the results of random audits. In conclusion, both
the supervised and unsupervised approaches for detecting fraud have pros and
cons. A combined approach usually yields the best results.
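
As a rough illustration of what a combined approach might look like, the sketch below merges two sets of flagged cases: one from known fraud patterns (standing in for the output of a supervised model) and one from a simple unsupervised outlier check. The case data, the known_pattern field, the function names, and the z-score threshold of 2 are all hypothetical and chosen only for illustration; real projects would normally use proper classification and clustering or anomaly detection models.

```python
from statistics import mean, stdev

# Hypothetical cases; the amounts and the known_pattern marks are invented.
# known_pattern stands in for the verdict of a supervised model or rule list.
cases = [
    {"id": 1, "amount": 100.0, "known_pattern": False},
    {"id": 2, "amount": 110.0, "known_pattern": False},
    {"id": 3, "amount": 95.0, "known_pattern": True},
    {"id": 4, "amount": 105.0, "known_pattern": False},
    {"id": 5, "amount": 120.0, "known_pattern": False},
    {"id": 6, "amount": 98.0, "known_pattern": False},
    {"id": 7, "amount": 102.0, "known_pattern": False},
    {"id": 8, "amount": 900.0, "known_pattern": False},
    {"id": 9, "amount": 108.0, "known_pattern": False},
    {"id": 10, "amount": 101.0, "known_pattern": False},
]


def rule_flags(cases):
    """Supervised-style step: cases that match already known fraud patterns."""
    return {c["id"] for c in cases if c["known_pattern"]}


def outlier_flags(cases, threshold=2.0):
    """Unsupervised-style step: cases whose amount is far from the mean."""
    amounts = [c["amount"] for c in cases]
    mu, sigma = mean(amounts), stdev(amounts)
    if sigma == 0:
        return set()  # all amounts identical, nothing stands out
    return {c["id"] for c in cases if abs(c["amount"] - mu) / sigma > threshold}


def combined_flags(cases, threshold=2.0):
    """Union of both steps: known patterns plus previously unseen anomalies."""
    return rule_flags(cases) | outlier_flags(cases, threshold)


print(sorted(combined_flags(cases)))
```

In this toy data the known-pattern step alone would miss case 8, and the outlier step alone would miss case 3; the union covers both, which is the intuition behind combining the two approaches.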

MACHINE LEARNING/ARTIFICIAL INTELLIGENCE VS. STATISTICAL TECHNIQUES

According to their origin and the way they analyze data patterns, data mining
models can be grouped into two classes:

• Machine learning/artificial intelligence models
• Statistical models.

Statistical models include algorithms such as ordinary least squares regression
(OLSR), logistic regression, and factor analysis/principal components analysis
(PCA), among others. Techniques such as decision trees, neural networks,
association rules, and self-organizing maps are machine learning models.

With the rapid developments in IT in recent years, machine learning algorithms
have proliferated, expanding analytical capabilities in terms of both efficiency
and scalability. Nevertheless, one should never underestimate the predictive
power of "traditional" statistical techniques, whose robustness and reliability
have been established and proven over the years.

Faced with the growing volume of stored data, analysts started to look for
faster algorithms that could overcome potential time and size limitations. Machine
learning models were developed as an answer to the need to analyze large amounts
of data in a reasonable time. New algorithms were also developed to overcome
certain assumptions and restrictions of statistical models and to provide solutions
to newly arisen business problems, such as the need to analyze affinities through
association modeling and sequences through sequence models.
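
To make the idea of affinity analysis more concrete, here is a minimal sketch of association modeling on pairs of items, using the two standard rule measures: support (the share of baskets containing both items) and confidence (the share of baskets with the left-hand item that also contain the right-hand item). The baskets, the pair_rules function, and the 0.4 minimum support are hypothetical; production algorithms such as Apriori handle itemsets of any size and prune far more efficiently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (lhs, rhs, support, confidence) for every frequent item pair."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to be of interest
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            rules.append((lhs, rhs, support, confidence))
    return rules


for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket that contains butter also contains bread, so the rule butter -> bread has a confidence of 1.0, while the reverse rule bread -> butter is weaker at 0.75.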

Trying many different modeling solutions is the essence of data mining. There
is no particular technique or class of techniques that yields superior results
in all situations and for all types of problems. However, in general, machine
learning algorithms perform better than traditional statistical techniques in terms
of speed and the capacity to analyze large volumes of data. Some traditional
statistical techniques may fail to efficiently handle wide (high-dimensional) or
long (many-record) datasets. For instance, in the case of a classification project,
a logistic regression model would demand more resources and processing time than a
