structure and feel of the API. This framework enables vendors to use
the core of the standard while reflecting a product's unique
capabilities. It also enables application developers to learn a
standard set of interfaces yet easily leverage vendor extensions,
since those extensions adhere to a common design. The framework is
discussed further in Chapter 8.
Strategic Objective 5: Start small and grow in functionality
The field of data mining includes a wide variety of techniques. Some
are mature and well established in both tools and practice. For
example, classification and regression have long been implemented
using decision tree algorithms such as C4.5 [Quinlan 1993] and CART
[Breiman+ 1984], and other algorithms including naïve Bayes and
neural networks [Mitchell 1997]. Newer algorithms such as support
vector machines (SVM) [Christianini/Shawe-Taylor 2000] have also
gained significant acceptance in both tools and applications. Other
techniques are more experimental and evolving, or still proprietary.
Because data mining is so broad, it is important to focus initially
on a set of commonly available techniques with well-known
applications.
To enable a data mining standard to come to fruition, it is
important to constrain its scope and focus on a core set of
capabilities, while defining a framework within which new
capabilities can be readily added. Tactically, we provide an initial
set of mining functions and algorithms that can solve a wide range of
problems, and we expand that set over time as demand dictates.
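The "start small and grow" idea can be illustrated with a small sketch. Everything here is hypothetical: the class name CapabilityRegistrySketch, its methods, and the function and algorithm names are invented for illustration and are not taken from the standard's API. The sketch assumes a core set of function and algorithm pairs is registered up front, and that further pairs can be registered later without changing the code that queries them.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not part of any standard API.
public class CapabilityRegistrySketch {

    // Mining function name -> names of the algorithms supported for it.
    private final Map<String, Set<String>> capabilities = new LinkedHashMap<>();

    // Add a function/algorithm pair; new functions appear on first use.
    void register(String function, String algorithm) {
        capabilities.computeIfAbsent(function, k -> new LinkedHashSet<>())
                    .add(algorithm);
    }

    // Callers ask what is supported instead of assuming it.
    boolean supports(String function, String algorithm) {
        return capabilities
                .getOrDefault(function, Collections.<String>emptySet())
                .contains(algorithm);
    }

    // An assumed initial core, limited to a few well-established techniques.
    static CapabilityRegistrySketch core() {
        CapabilityRegistrySketch registry = new CapabilityRegistrySketch();
        registry.register("classification", "decisionTree");
        registry.register("classification", "naiveBayes");
        registry.register("regression", "decisionTree");
        return registry;
    }

    public static void main(String[] args) {
        CapabilityRegistrySketch registry = core();
        System.out.println(registry.supports("clustering", "kMeans"));

        // Growth over time: a later release registers a new capability.
        registry.register("clustering", "kMeans");
        System.out.println(registry.supports("clustering", "kMeans"));
    }
}
```

The design choice worth noting is that application code queries the registry rather than hard-coding the supported set, so adding a capability does not require changing its callers.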
Strategic Objective 6: Simplify data mining for novices while allowing control for experts
As we have noted, data mining has traditionally been the domain of
experts. Although many aspects of the data mining problem space still
require an in-depth understanding of both the problem and the
solution, such as data preparation and domain-dependent knowledge,
much can be done to simplify the data mining process, including
automating algorithm selection, data preparation, and settings
tuning. This automation gives vendors the opportunity to add value to
their products beyond algorithms. Yet at the same time, experts want
to be able to exert control over all aspects of the modeling process.
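One way to picture this balance is a settings object in which every choice is optional. The sketch below is hypothetical: BuildSettingsSketch, ClassificationSettings, resolveAlgorithm, and the "churn" target are invented for illustration and are not taken from the standard's API. It assumes that an unset algorithm means "let the system choose", and it uses a fixed default as a stand-in for whatever automatic selection a vendor might implement.

```java
import java.util.Optional;

// Hypothetical sketch: none of these names come from the standard.
public class BuildSettingsSketch {

    enum Algorithm { DECISION_TREE, NAIVE_BAYES, SVM }

    static final class ClassificationSettings {
        private final String targetAttribute;
        private Algorithm algorithm; // null means "let the system choose"

        ClassificationSettings(String targetAttribute) {
            this.targetAttribute = targetAttribute;
        }

        // Expert path: pin the algorithm explicitly.
        ClassificationSettings useAlgorithm(Algorithm chosen) {
            this.algorithm = chosen;
            return this;
        }

        String targetAttribute() {
            return targetAttribute;
        }

        Optional<Algorithm> requestedAlgorithm() {
            return Optional.ofNullable(algorithm);
        }
    }

    // Stand-in for vendor automation. A real product would inspect the data;
    // this sketch simply falls back to a fixed default.
    static Algorithm resolveAlgorithm(ClassificationSettings settings) {
        return settings.requestedAlgorithm().orElse(Algorithm.DECISION_TREE);
    }

    public static void main(String[] args) {
        // Novice: names only the target and accepts every default.
        ClassificationSettings novice = new ClassificationSettings("churn");

        // Expert: overrides the automatic choice.
        ClassificationSettings expert =
                new ClassificationSettings("churn").useAlgorithm(Algorithm.SVM);

        System.out.println("novice -> " + resolveAlgorithm(novice));
        System.out.println("expert -> " + resolveAlgorithm(expert));
    }
}
```

Both users go through the same settings type, so a novice's code keeps working unchanged if that user later starts overriding individual choices.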
Tactically, we provide an API that allows vendors to automate
much of the data mining process such that both novice and expert