An Overview of Data Mining Techniques - Data Mining Techniques in CRM: Inside Customer Segmentation

decision tree model. Similarly, a hierarchical or agglomerative clustering algorithm
will fail to analyze more than a few thousand records, whereas some of the most recently
developed clustering algorithms, like the IBM SPSS Modeler TwoStep model, can
handle millions without sampling. Within the machine learning algorithms we
can also note substantial differences in terms of speed and required resources,
with neural networks, including SOMs for clustering, among the most demanding
techniques.
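
One common reason for this limit, offered here as an illustration rather than as the
book's own argument, is that a naive agglomerative algorithm starts from the distance
between every pair of records. The sketch below counts those pairs and estimates the
memory needed to hold them. The function names and the 8-byte-per-distance figure are
assumptions made for the example.

```python
def pairwise_distance_count(n_records: int) -> int:
    """Number of distinct record pairs: n * (n - 1) / 2."""
    return n_records * (n_records - 1) // 2


def distance_matrix_gib(n_records: int, bytes_per_distance: int = 8) -> float:
    """Approximate memory (GiB) to store one distance per pair.

    Assumes a condensed (upper-triangle) matrix of 8-byte floats.
    """
    return pairwise_distance_count(n_records) * bytes_per_distance / 2**30


for n in (5_000, 100_000, 1_000_000):
    pairs = pairwise_distance_count(n)
    print(f"{n:>9,} records -> {pairs:>15,} pairs, "
          f"about {distance_matrix_gib(n):,.1f} GiB")
```

Under these assumptions a few thousand records fit comfortably in memory, while a
million records would need several thousand GiB, which is why algorithms designed to
avoid the full pairwise comparison scale so much further.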

Another advantage of machine learning algorithms is that they have less
stringent data assumptions. They are therefore friendlier and simpler to use for
those with little experience in the technical aspects of model building. Statistical
algorithms, by contrast, usually require considerable effort to build. Analysts must
spend time checking the data against the assumptions of each algorithm, since merely
feeding raw data into these algorithms will probably yield poor results. Building them
may require special data processing and transformations before they produce results
comparable or even superior to those of machine learning algorithms.

Another aspect that data miners should take into account when choosing a
modeling technique is the insight provided by each algorithm. In general, statistical
models yield transparent solutions. In contrast, some machine learning models,
like neural networks, are opaque, conveying little information and knowledge about
the underlying data patterns and customer behaviors. They may provide reliable
customer scores and achieve satisfactory predictive performance, but they provide
little or no reasoning for their predictions. However, among machine learning
algorithms there are models, like decision trees, that do explain the derived results.
Their results are presented in an intuitive and self-explanatory format, allowing an
understanding of the findings. Since most data mining software packages allow for
fast and easy model development, it is not unusual to develop one model for insight
and a different model for scoring and deployment.
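
To make the idea of a self-explanatory model concrete, here is a minimal sketch of a
one-level decision tree (a "decision stump") that prints its result as plain IF/THEN
rules. It is not the algorithm of any particular software package: the choice of Gini
impurity as the split criterion, the function names, the field names and the eight toy
customer records are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature_names):
    """Return (feature, threshold, weighted_gini) for the best single split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best


def stump_rules(rows, labels, feature_names):
    """Describe the best split as two human-readable rules."""
    name, threshold, _ = best_split(rows, labels, feature_names)
    j = feature_names.index(name)
    left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
    right = [y for row, y in zip(rows, labels) if row[j] > threshold]
    majority = lambda ys: Counter(ys).most_common(1)[0][0]
    return [
        f"IF {name} <= {threshold:g} THEN predict {majority(left)}",
        f"IF {name} > {threshold:g} THEN predict {majority(right)}",
    ]


# Hypothetical toy data: (tenure_months, monthly_complaints) per customer.
feature_names = ["tenure_months", "monthly_complaints"]
rows = [(2, 3), (4, 1), (5, 4), (7, 2), (18, 0), (24, 1), (30, 3), (36, 0)]
labels = ["churn", "churn", "churn", "churn", "stay", "stay", "stay", "stay"]

rules = stump_rules(rows, labels, feature_names)
for rule in rules:
    print(rule)
```

The printed rules can be read directly by a business user, which is the kind of
transparency the paragraph above attributes to decision trees. An opaque model trained
on the same data would return scores without any comparable description.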

SUMMARY

In the previous sections we presented a brief introduction to the main concepts of
data mining modeling techniques. Models can be grouped into two main classes:
supervised and unsupervised.

Supervised modeling techniques are also referred to as directed or predictive
because their goal is prediction. Models automatically detect or "learn" the input
data patterns associated with specific output values. Supervised models are further
grouped into classification and estimation models, according to the measurement
level of the target field. Classification models deal with the prediction of categorical
outcomes. Their goal is to classify new cases into predefined classes. Classification
