required knowledge about any algorithm implementation
• KJDM functions work directly on unprepared data, which is
one of the main reasons for the productivity gains obtained
with KXEN. No specific processing is needed for missing
values, outliers, or numeric, ordinal, or even
high-cardinality categorical variables such as ZIP code.
• KJDM classification, regression, and clustering functions
work in very high-dimensional spaces without any need for
a priori attribute selection: it is not uncommon to build
classification models on 2,000 attributes for CRM
applications with very good performance.
• KJDM can export classification, regression, and clustering
models into a wide range of languages for optimized use of
external scoring engines.
More information can be found at [KJDM 2006].
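
To make the point about unprepared data more concrete, the following
is a minimal Java sketch of one generic way a tool could consume a
raw, high-cardinality categorical attribute such as ZIP code without
manual preparation: each category is replaced by the mean of a binary
target, and missing values are kept as a category of their own. This
is an illustrative assumption only; it is not KXEN's proprietary
algorithm and not part of the KJDM API. The class name
TargetEncodingSketch, its methods, and the sample data are
hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not KXEN's algorithm and not a KJDM class.
final class TargetEncodingSketch {
    // Missing values (null) are treated as one more category.
    static final String MISSING = "<missing>";

    // Learns, for each category, the mean of a 0/1 target.
    static Map<String, Double> fit(List<String> values, List<Integer> targets) {
        if (values.size() != targets.size()) {
            throw new IllegalArgumentException("values and targets differ in length");
        }
        // counts[0] = sum of targets, counts[1] = number of rows
        Map<String, int[]> counts = new HashMap<>();
        for (int i = 0; i < values.size(); i++) {
            String key = values.get(i) == null ? MISSING : values.get(i);
            int[] c = counts.computeIfAbsent(key, k -> new int[2]);
            c[0] += targets.get(i);
            c[1]++;
        }
        Map<String, Double> encoding = new HashMap<>();
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            encoding.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return encoding;
    }

    // Encodes one raw value; categories unseen at fit time get the fallback.
    static double encode(Map<String, Double> encoding, String value, double fallback) {
        String key = value == null ? MISSING : value;
        return encoding.getOrDefault(key, fallback);
    }

    public static void main(String[] args) {
        // Made-up sample data: raw ZIP codes with missing entries.
        List<String> zips = Arrays.asList("75001", "75001", null, "10115", null);
        List<Integer> responded = Arrays.asList(1, 0, 1, 0, 1);
        Map<String, Double> encoding = fit(zips, responded);
        System.out.println(encode(encoding, "75001", 0.6));
        System.out.println(encode(encoding, null, 0.6));
        System.out.println(encode(encoding, "99999", 0.6));
    }
}
```

The fallback passed to encode would typically be the overall target
rate of the build data; a production encoder would also smooth the
means of rare categories, which this sketch omits for brevity.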
Table 16-6 shows the KJDM-supported functions and tasks. KJDM
also supports export and import tasks for mining models in the
KXEN proprietary format.
In KJDM, all named objects except taxonomy can be persisted in
the MOR. Because managing persisted objects can be burdensome,
it can be useful to keep some classes of objects transient.
KJDM enables this, and the process can be fine-tuned through a
The application designer can use specific parameters in the URI
specification to indicate the location of the MOR, as shown in the
Logical descriptions can be used to shield the models from the
physical data descriptions in KJDM: all data mining functions
can use logical data and logical attributes, and can operate
on discrete, bounded, ordinal, unprepared, numerical, and
categorical attributes. Even if attributes are declared to be prepared, KXEN will
In KJDM, there is no option for “outlier treatment” because
KXEN proprietary algorithms are designed to resist perturbations
caused by outliers in the build datasets. There is also no
option for “missing value treatment”. Missing values are always