enable extending the existing JDM functionality and adding new
functionality. For example, a data mining vendor who provides
non-JDM functionality, such as variations of the decision tree
algorithm with additional settings, a time series function, or a new
genetic algorithm for regression, can extend the existing JDM interfaces by
adding new interfaces/classes to support that functionality.
As discussed in Chapter 8, JDM packages are organized by mining
function and mining algorithm. Settings interfaces are defined within
each function and algorithm package. JDM implementations can add
new functions and algorithms by defining new packages and includ-
ing interfaces that inherit from the appropriate base interfaces. For
example, a new time series mining function can be defined in a new
package com.xyzMiner.jdm.timeseries with the time series build settings
inheriting from the javax.datamining.base.BuildSettings interface.
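The extension pattern described above can be sketched as follows. This is a minimal illustration, not vendor code: the BuildSettings interface declared here is only a stand-in for javax.datamining.base.BuildSettings so that the example compiles without a JDM jar, and TimeSeriesSettings, TimeSeriesSettingsImpl, and their methods are hypothetical names chosen for the example.

```java
// Stand-in for javax.datamining.base.BuildSettings so that this sketch
// compiles without a JDM jar on the classpath. A real extension would
// extend the JDM interface directly; getName() is illustrative only.
interface BuildSettings {
    String getName();
}

// Hypothetical vendor interface. In a real extension it would be declared
// in package com.xyzMiner.jdm.timeseries.
interface TimeSeriesSettings extends BuildSettings {
    void setTimeAttributeName(String attributeName);
    String getTimeAttributeName();
    void setForecastHorizon(int steps);
    int getForecastHorizon();
}

// Minimal hypothetical implementation backing the vendor interface.
class TimeSeriesSettingsImpl implements TimeSeriesSettings {
    private final String name;
    private String timeAttributeName;
    private int forecastHorizon = 1;

    TimeSeriesSettingsImpl(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setTimeAttributeName(String attributeName) {
        this.timeAttributeName = attributeName;
    }

    public String getTimeAttributeName() {
        return timeAttributeName;
    }

    public void setForecastHorizon(int steps) {
        // Reject horizons that cannot describe a forecast.
        if (steps < 1) {
            throw new IllegalArgumentException("steps must be >= 1");
        }
        this.forecastHorizon = steps;
    }

    public int getForecastHorizon() {
        return forecastHorizon;
    }
}
```

Because the vendor interface inherits from the base settings interface, code written against the base type can accept the new settings object without knowing about the time series function.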
When defining a new mining function or other feature, the
appropriate enumeration must be updated. For example, the
MiningFunction enumeration, which lists the functions supported in the JDM
standard, must be updated to include an entry for time series. To
add vendor-specific values to the standard enumerations,
JDM provides a static method called addExtension(String enum)
in all enumerations.
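One way an extensible enumeration of this kind could work is sketched below. This is an assumption-laden illustration, not the JDM source: MiningFunctionSketch is a hypothetical stand-in for the MiningFunction enumeration, the registry design is a guess at one reasonable implementation, and the parameter is called name because enum has been a reserved word since Java 5.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an extensible, class-based enumeration.
final class MiningFunctionSketch {
    // Registry of all known values; declared first so that it is
    // initialized before the constants below register themselves.
    private static final Map<String, MiningFunctionSketch> VALUES =
            new LinkedHashMap<>();

    // Two illustrative standard entries.
    static final MiningFunctionSketch classification = register("classification");
    static final MiningFunctionSketch regression = register("regression");

    private final String name;

    private MiningFunctionSketch(String name) {
        this.name = name;
    }

    private static MiningFunctionSketch register(String name) {
        MiningFunctionSketch value = new MiningFunctionSketch(name);
        VALUES.put(name, value);
        return value;
    }

    // Adds a vendor-specific value at run time; duplicates are rejected.
    static synchronized void addExtension(String name) {
        if (VALUES.containsKey(name)) {
            throw new IllegalArgumentException("Already defined: " + name);
        }
        register(name);
    }

    // Looks up a standard or vendor-added value by name.
    static synchronized MiningFunctionSketch valueOf(String name) {
        MiningFunctionSketch value = VALUES.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Unknown value: " + name);
        }
        return value;
    }

    String name() {
        return name;
    }
}
```

Under this sketch, a vendor would call MiningFunctionSketch.addExtension("timeSeries") once at startup and then retrieve the value with MiningFunctionSketch.valueOf("timeSeries") wherever a mining function is expected.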
Using the TCK
To validate that a given JDM implementation conforms to the stan-
dard, the implementer executes the Technology Compatibility Kit
(TCK) on the new implementation. The TCK is available at the JDM
Web site (http://www.jcp.org/en/jsr/detail?id=73). (From the latest
Maintenance Release, follow the download instructions for “JSR073
for Implementation.” The download will contain the latest specifica-
tion document and the latest versions of the TCK and RI.)
The TCK includes a configuration file that the implementer modifies
to match the implementation-specific URI formats for datasets.
The file also specifies the Java package names where the TCK will find
the proper connection factory implementations. The TCK runs a series of
tests depending on the capabilities declared by the DME implemen-
tation and generates an HTML file containing a compatibility report.
Users of the TCK should realize that it is not a complete unit test
suite. For example, the TCK does not validate that individual compu-
tations are correct, such as the standard deviation of a specific
attribute returned by a ComputeStatistics task. It does not test the results