Database Reference
Both RapidMiner and OpenOffice provide intuitive graphical user interface environments,
which make it easier for general computer-using audiences to experience the power
of data mining.
All examples using OpenOffice or RapidMiner in this topic are illustrated in a Microsoft
Windows environment, although both software packages work on a variety of computing
platforms. It is recommended that you download and install both packages on your computer
now, so that you can work along with the examples in the topic.
OpenOffice can be downloaded from: http://www.openoffice.org/
RapidMiner Community Edition can be downloaded from:
http://rapid-i.com/content/view/26/84/
THE DATA MINING PROCESS
Although data mining's roots can be traced back to the late 1980s, for most of the 1990s the field
was still in its infancy. Data mining was still being defined and refined. It was largely a loose
conglomeration of data models, analysis algorithms, and ad hoc outputs. In 1999, several sizeable
companies including auto maker Daimler-Benz, insurance provider OHRA, hardware and software
manufacturer NCR Corp. and statistical software maker SPSS, Inc. began working together to
formalize and standardize an approach to data mining. The result of their work was CRISP-DM,
the CRoss-Industry Standard Process for Data Mining. Although the participants in the creation
of CRISP-DM certainly had vested interests in particular software and hardware tools, the
process was designed independently of any specific tool. It was written to be conceptual in
nature: something that could be applied regardless of the tool or kind of data involved. The
process consists of six steps or phases, as illustrated in Figure 1-1.