CHAPTER ONE:
INTRODUCTION TO DATA MINING AND CRISP-DM
INTRODUCTION
Data mining as a discipline is largely invisible to the world. Most of the time, we never even
notice that it's happening. But whenever we sign up for a grocery store shopping card, make a
purchase using a credit card, or surf the Web, we are creating data. These data are stored in large
sets on powerful computers owned by the companies we deal with every day. Lying within those
data sets are patterns: indicators of our interests, our habits, and our behaviors. Data mining
allows people to locate and interpret those patterns, helping them make better-informed decisions
and better serve their customers. That being said, there are also concerns about the practice of
data mining. Privacy watchdog groups in particular are vocal about organizations that amass vast
quantities of data, some of which can be very personal in nature.
The intent of this book is to introduce you to concepts and practices common in data mining. It is
intended primarily for undergraduate college students and for business professionals who may be
interested in using information systems and technologies to solve business problems by mining
data, but who likely do not have a formal background or education in computer science. Although
data mining is the fusion of applied statistics, logic, artificial intelligence, machine learning, and data
management systems, you are not required to have a strong background in these fields to use this
book. While having taken introductory college-level courses in statistics and databases will be
helpful, care has been taken to explain the necessary concepts and techniques required to
learn how to mine data.
Each chapter in this book will explain a data mining concept or technique. You should understand
that the book is not designed to be an instruction manual or tutorial for the tools we will use
(RapidMiner and OpenOffice Base and Calc). These software packages are capable of many types
of data analysis, and this text is not intended to cover all of their capabilities, but rather to
illustrate how these software tools can be used to perform certain kinds of data mining. The book