Table 17-1  Data mining standards terminology comparison

JDM               PMML            CWM                             SQL/MM
----------------  --------------  ------------------------------  ------------------
Attribute         MiningField     MiningAttribute                 Column
{Y}Model,         {X}Model        MiningModel                     DM_{Y}Model
{X}ModelDetail
ModelSignature    MiningSchema    ApplicationInputSpecification   DM_getFields()
BuildSettings,    N/A             MiningSettings,                 DM_{Y}Settings
{Y}Settings                       {Y}Settings
PhysicalDataSet   DataDictionary  MiningDataSpecification         DM_MiningData
TestMetrics       N/A             MiningModelResult               DM_{Y}Result
Target            Predicted       Target                          Target
ApplySettings     N/A             N/S                             N/S
LogicalData       N/A             AttributeUsageSpecification,    DM_LogicalDataSpec
                                  MiningDataSpecification

X - {Tree, NeuralNetwork, etc.}
Y - {Classification, Regression, Clustering, etc.}
N/A - Not applicable
N/S - Not specified
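When moving between standards, the correspondences in Table 17-1 can be treated as a simple lookup. The following is a minimal sketch only: the class TerminologyTable and its translate method are hypothetical helpers, not part of JDM or any other standard, and only the rows with a single unambiguous term in every column are included.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: a few rows of Table 17-1 encoded as a lookup. */
final class TerminologyTable {
    // Column order follows the table header.
    static final List<String> STANDARDS = Arrays.asList("JDM", "PMML", "CWM", "SQL/MM");

    // Only rows where every standard has exactly one concrete term.
    static final String[][] ROWS = {
        {"Attribute",       "MiningField",    "MiningAttribute",               "Column"},
        {"ModelSignature",  "MiningSchema",   "ApplicationInputSpecification", "DM_getFields()"},
        {"PhysicalDataSet", "DataDictionary", "MiningDataSpecification",       "DM_MiningData"},
        {"Target",          "Predicted",      "Target",                        "Target"},
    };

    /** Returns the term used by standard 'to' for 'term' as named in standard 'from'. */
    static Optional<String> translate(String from, String term, String to) {
        int fromCol = STANDARDS.indexOf(from);
        int toCol = STANDARDS.indexOf(to);
        if (fromCol < 0 || toCol < 0) {
            throw new IllegalArgumentException("Unknown standard: " + from + " or " + to);
        }
        for (String[] row : ROWS) {
            if (row[fromCol].equals(term)) {
                return Optional.of(row[toCol]);
            }
        }
        return Optional.empty(); // term not covered by this partial table
    }
}
```

For example, TerminologyTable.translate("JDM", "Attribute", "PMML") yields an Optional containing "MiningField", while a term outside the encoded rows yields an empty Optional.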
PMML takes an algorithm-centered approach since each algorithm
typically uses different data structures to maintain model state.
Here the algorithm implementation drives the model representation,
transformations, and algorithm-specific settings. Adding a new
algorithm often requires creating a new set of XML Schema
representations that fit into the PMML framework.
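To make the algorithm-centered structure concrete, the sketch below parses a small PMML-like document with the standard JDK XML parser and reports which algorithm-specific model element it contains and which MiningField entries its MiningSchema lists. The XML fragment is hand-written and heavily abbreviated for illustration (it omits attributes a real PMML document requires), and the class PmmlSketch with its field names age, income, and churn is an assumption, not taken from any actual model.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: inspect an abbreviated PMML-style document. */
final class PmmlSketch {
    // Abbreviated fragment for illustration only; not schema-valid PMML.
    static final String PMML =
          "<PMML>"
        + "<DataDictionary>"
        + "<DataField name=\"age\"/><DataField name=\"income\"/><DataField name=\"churn\"/>"
        + "</DataDictionary>"
        + "<TreeModel functionName=\"classification\">"
        + "<MiningSchema>"
        + "<MiningField name=\"age\"/>"
        + "<MiningField name=\"income\"/>"
        + "<MiningField name=\"churn\" usageType=\"predicted\"/>"
        + "</MiningSchema>"
        + "</TreeModel>"
        + "</PMML>";

    /** Parses the XML text, wrapping checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    /** Returns the first top-level element whose name ends in "Model", or null. */
    static String modelElement(Document doc) {
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                String name = ((Element) children.item(i)).getTagName();
                if (name.endsWith("Model")) {
                    return name;
                }
            }
        }
        return null;
    }

    /** Collects the name attribute of every MiningField element. */
    static List<String> miningFields(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList fields = doc.getElementsByTagName("MiningField");
        for (int i = 0; i < fields.getLength(); i++) {
            names.add(((Element) fields.item(i)).getAttribute("name"));
        }
        return names;
    }
}
```

Here the model element itself (TreeModel) names the algorithm, which is the sense in which the representation is algorithm-centered: a different algorithm would appear as a different top-level model element with its own internal structure.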
JDM takes a functionality-centered approach, focusing on
higher-level mining functions such as classification, regression,
and so on, but also provides for algorithm-specific representations
for settings and model details.
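The functionality-centered idea can be sketched with a few stand-in types: settings are organized first by mining function, and algorithm-specific settings are an optional refinement attached to them. The types below are simplified, hypothetical stand-ins written for this illustration; they borrow names from Table 17-1 but are not the actual javax.datamining interfaces, and the "engine default" behavior when no algorithm is given is an assumption of the sketch.

```java
import java.util.Optional;

/** Hypothetical sketch of function-level settings with optional algorithm detail. */
final class FunctionCenteredSketch {

    /** Stand-in for algorithm-specific settings (the X dimension in Table 17-1). */
    interface AlgorithmSettings {
        String algorithmName();
    }

    /** Stand-in for tree-specific settings; maxDepth is an illustrative parameter. */
    static final class TreeSettings implements AlgorithmSettings {
        final int maxDepth;

        TreeSettings(int maxDepth) {
            this.maxDepth = maxDepth;
        }

        @Override
        public String algorithmName() {
            return "Tree";
        }
    }

    /** Stand-in for function-level build settings (the Y dimension). */
    static class BuildSettings {
        private AlgorithmSettings algorithm; // null: no algorithm chosen by the user

        void setAlgorithmSettings(AlgorithmSettings algorithm) {
            this.algorithm = algorithm;
        }

        Optional<AlgorithmSettings> algorithmSettings() {
            return Optional.ofNullable(algorithm);
        }
    }

    /** Classification is specified by its target, independent of any algorithm. */
    static final class ClassificationSettings extends BuildSettings {
        final String targetAttribute;

        ClassificationSettings(String targetAttribute) {
            this.targetAttribute = targetAttribute;
        }
    }

    /** Summarizes the settings, noting whether an algorithm was specified. */
    static String describe(ClassificationSettings settings) {
        String algorithm = settings.algorithmSettings()
            .map(AlgorithmSettings::algorithmName)
            .orElse("engine default");
        return "classification of '" + settings.targetAttribute + "' using " + algorithm;
    }
}
```

A caller can build a ClassificationSettings for a target such as "churn" without naming any algorithm, and later attach a TreeSettings instance only if algorithm-level control is needed.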
Although terminology does not always coincide, the supported data
mining technologies of the various data mining standards have a
significant degree of commonality. Table 17-2 compares some of the more
prominent features of JDM, PMML, CWM, and SQL/MM in terms of
their support of various data mining and related technologies.
The other data mining standards—PMML, SQL/MM DM, and
CWM/DM—served as the starting point for the JDM effort. With
JDM, we saw the need to attempt to unify the various standards in
concept, terminology, and structure. Due to the evolutionary nature