the standard. Multiple data mining vendors have implemented
PMML in their released products, both those that are active in defin-
ing the standard through the DMG and those interested only in using
PMML.
PMML, from its beginning, supported vendor-specific extensions.
These were crucial to the initial success of the standard, since the core
standard was not designed to cater to the full requirements of any
given vendor but to represent a common set of required capabilities.
However, allowing vendor-specific extensions in XML hurt
interoperability among vendors; that is, vendors could not
exchange most models. One objective for model exchange is
to be able to produce PMML in one vendor system (export) and con-
sume it in another vendor's system (import and use). Yet, the seed
was planted, the standard grew, gained acceptance, and continues to
address the problems that limit interoperability. A key factor in
ensuring interoperability will be the development of a conformance
test suite. Vendors that claim compliance with the standard can then
validate their implementations more fully.
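
The effect of vendor-specific extensions on model exchange can be
made concrete with a short check. The following Java sketch lists the
vendors whose extensions appear in a PMML document, which an importer
might inspect before deciding whether it can use the model. It assumes
that extensions are carried in unprefixed Extension elements with an
extender attribute. The class and method names (PmmlExtensionScanner,
findExtenders) and the sample document are hypothetical; they are not
part of PMML or JDM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reports which vendors have added extensions to a PMML document.
final class PmmlExtensionScanner {

    // Returns the "extender" attribute of every Extension element, in document order.
    // Assumption: Extension elements are unprefixed, as in documents that use
    // a default namespace or no namespace at all.
    public static List<String> findExtenders(String pmmlXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pmmlXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("Extension");
            List<String> extenders = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element extension = (Element) nodes.item(i);
                // getAttribute returns an empty string when the attribute is absent.
                extenders.add(extension.getAttribute("extender"));
            }
            return extenders;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse PMML document", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document with one vendor-specific extension.
        String sample =
                "<PMML version=\"3.0\">"
              + "<Header><Extension extender=\"VendorA\" name=\"tuning\" value=\"1\"/></Header>"
              + "<DataDictionary/>"
              + "</PMML>";
        System.out.println("Extensions found from: " + findExtenders(sample));
    }
}
```

A document for which this list is empty relies only on the core
standard, so it is the kind of model most likely to move between
vendor systems unchanged.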
Some vendors produce PMML but do not consume their own or
other vendors' PMML documents. Producer-only vendors allow
third-party vendors to import that PMML for either scoring or
visualization. Most vendors that support PMML, however, both
produce and consume their own PMML models.
One of the limitations of an XML representation for data mining
models is the impact that loading the model can have on real-time
performance. Some models, such as association rules or clustering
models, can be quite large, involving megabytes or gigabytes of data.
It can take a significant amount of time to load these models into
memory for inspection or scoring. Such overheads may be acceptable
in some situations; in other situations, where real-time response is
required, the load-time requirement is unacceptable, even for rela-
tively small models. Preloading such models and pinning them in
memory help to overcome this concern. Other standards that provide
support for applying models, such as SQL/MM DM and JDM, pro-
vide operations to load and pin a model in memory to facilitate real-
time scoring, where multiple invocations will be made.
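
The preload-and-pin idea can be sketched in a few lines of Java. The
example below parses a PMML document once, keeps the parsed form in
memory under a model name, and returns the same in-memory object on
every later request, so repeated scoring calls do not pay the XML load
cost again. This is an illustration only, not the SQL/MM DM or JDM
interface: the class PinnedModelCache and its methods are hypothetical,
and a DOM tree stands in for whatever internal model structure a real
scoring engine would build.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical cache that loads each model once and pins it in memory.
final class PinnedModelCache {

    private final ConcurrentMap<String, Document> pinned = new ConcurrentHashMap<>();
    private final AtomicInteger parseCount = new AtomicInteger();

    // Parses the PMML text on the first call for a given name; later calls
    // with the same name return the already-parsed document without re-parsing.
    public Document preload(String modelName, String pmmlXml) {
        return pinned.computeIfAbsent(modelName, name -> parse(pmmlXml));
    }

    // Returns a pinned model, or fails fast if it was never preloaded.
    public Document get(String modelName) {
        Document doc = pinned.get(modelName);
        if (doc == null) {
            throw new IllegalStateException("Model not preloaded: " + modelName);
        }
        return doc;
    }

    // Releases the memory held by a model that is no longer needed.
    public void unpin(String modelName) {
        pinned.remove(modelName);
    }

    // Number of times XML was actually parsed; useful for checking the cache works.
    public int parseCount() {
        return parseCount.get();
    }

    private Document parse(String pmmlXml) {
        try {
            parseCount.incrementAndGet();
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(pmmlXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse PMML for pinning", e);
        }
    }

    public static void main(String[] args) {
        PinnedModelCache cache = new PinnedModelCache();
        String pmml = "<PMML version=\"3.0\"><DataDictionary/></PMML>"; // made-up minimal document
        cache.preload("churn", pmml);          // pay the load cost once, up front
        for (int i = 0; i < 3; i++) {
            cache.get("churn");                // each scoring call reuses the pinned model
        }
        System.out.println("Parses performed: " + cache.parseCount());
    }
}
```

The design choice is to move the load cost out of the request path:
preload runs at startup or deployment time, and each scoring call then
performs only a map lookup. A production scorer would normally convert
the parsed document into an immutable structure before sharing it
across threads.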
Over multiple releases, PMML has broadened the set of model types
it supports, to the point where, today, it covers quite a range of
data mining models. PMML continues to expand into new
model representations while attempting to address the issue of
transformations uniformly across models. PMML also strives to