Gold recovered from the ore through the milling process is poured into bricks
that are shipped to be assayed and sent to the mint in Ottawa, where coins are
struck (made).
Data mining models can be shipped from the lab to the field, or their
components can be packaged up in reports or dashboards for management
and operations staff. From there, some models can be used to score new
data (i.e., make predictions or classifications). These scores can be
used in applications such as predicting customer response.
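To make "scoring new data" concrete, here is a minimal sketch in Java. It assumes a simple logistic model for customer response. The class name, the two input attributes, and all coefficient values are hypothetical and chosen only for illustration; a real model would learn its coefficients from historical data.

```java
final class ResponseScorer {
    // Hypothetical coefficients (assumptions for illustration, not from the text).
    static final double INTERCEPT = -2.0;
    static final double WEIGHT_PRIOR_PURCHASES = 0.5;
    static final double WEIGHT_OPENED_LAST_EMAIL = 1.0;

    /** Returns a score in (0, 1): the modeled probability that the customer responds. */
    static double score(int priorPurchases, boolean openedLastEmail) {
        double z = INTERCEPT
                + WEIGHT_PRIOR_PURCHASES * priorPurchases
                + WEIGHT_OPENED_LAST_EMAIL * (openedLastEmail ? 1.0 : 0.0);
        return 1.0 / (1.0 + Math.exp(-z)); // logistic function
    }

    /** Turns a score into a classification by comparing it with a cutoff. */
    static boolean predictsResponse(double score, double threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        // Two new customers the model has not seen before.
        double active = score(4, true);
        double inactive = score(0, false);
        System.out.println("active customer:   score=" + active
                + " respond=" + predictsResponse(active, 0.5));
        System.out.println("inactive customer: score=" + inactive
                + " respond=" + predictsResponse(inactive, 0.5));
    }
}
```

The score itself is the prediction; applying a threshold (0.5 here, also an assumption) turns it into a classification that an application can act on.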
After reviewing this metaphor, you should have a physical
grounding in the data mining process and perhaps have learned
something about gold mining you did not know before.
The Value of Data Mining
The true value of data mining does not reside in a set of complex
algorithms, but in the practical problems that it can help solve. Too
often, data mining solutions are presented through the eyes of the
data analyst (the person who massages and prepares the data and
builds the models), where the emphasis is on the algorithms and
techniques used to solve the problem. However, in the business world,
true value is realized with return on investment, when we see that
$2 million was saved for a $300,000 investment to predict which
customers will default on a loan, or when we see that consumer
fraud was reduced by 50 percent, resulting in savings of $22 million.
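The arithmetic behind these figures can be sketched in a few lines of Java. The text does not say how return on investment is calculated, so this sketch assumes one common definition: net gain divided by cost. The class and method names are hypothetical, and the implied fraud baseline assumes the entire $22 million comes from the 50 percent reduction.

```java
final class RoiExample {
    /** ROI as (savings - investment) / investment; an assumed, commonly used definition. */
    static double roi(double savings, double investment) {
        if (investment <= 0) {
            throw new IllegalArgumentException("investment must be positive");
        }
        return (savings - investment) / investment;
    }

    /** Loss before the reduction, implied by the savings and the fractional reduction. */
    static double impliedBaseline(double savings, double reductionFraction) {
        if (reductionFraction <= 0 || reductionFraction > 1) {
            throw new IllegalArgumentException("reductionFraction must be in (0, 1]");
        }
        return savings / reductionFraction;
    }

    public static void main(String[] args) {
        // Loan default example: $2 million saved for a $300,000 investment.
        double loanRoi = roi(2_000_000, 300_000);
        System.out.println("Loan default model ROI: " + Math.round(loanRoi * 100) + "%");

        // Fraud example: a 50 percent reduction that saves $22 million.
        double baseline = impliedBaseline(22_000_000, 0.50);
        System.out.println("Implied fraud loss before the model: $" + Math.round(baseline));
    }
}
```

Under that definition, the loan example returns roughly 567 percent, or about $5.67 of net savings per dollar invested, and the fraud example implies losses of about $44 million before the reduction.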
How Reliable Is Data Mining?
For a technology to be truly valuable, it needs to be reliable. Few
technologies, data mining included, are foolproof in practice. However,
data mining is based on a firm foundation in mathematics and
statistics. Data mining algorithms withstand tests on both real and
synthetic datasets, where results are rigorously analyzed for accuracy
and correctness. The reliability of results more often depends on
the availability of sufficient data, data quality, and the technique
chosen, as well as the skills of those preparing the data, selecting
algorithm parameters, and analyzing the results. If the data provided
contains erroneous values (e.g., false data entered on warranty cards)
or many missing values, data mining algorithms may have difficulty
discovering any meaningful patterns in the data. However,
over the past several decades, data mining techniques have been