methods to tools, as well as improving integration of the entire DMKD process,
may help a lot to address the existing problems. XML and XML-based technology
provide tools for transforming DM methods into DM tools, combining them into
DM toolboxes, and, most importantly, semiautomating the DMKD process. The
DMKD research community recognizes the importance of XML technology for data
preparation and as a medium to store, retrieve, and use the domain knowledge via
the use of PMML [15].
XML provides a universal format for storing structured data. Because it is
supported by current DBMSs, it is becoming a standard not only for data transport
but also for data storage. The PMML language can be used to transmit and store
metadata. It is one of the technologies that can substantially simplify the design of
complete DMKD systems and increase their flexibility [36]. Hence we predict the
creation of metadata repositories (knowledge repositories) that would use the
PMML format to store their content. SOAP and XML-RPC are two
communication protocols that are not only platform-independent but that also
eliminate the need for direct API calls, make the communication easy, and support
compatibility between applications that exchange data. Because these protocols
are loosely coupled, communication can be set up in a developer- and user-friendly
manner; for example, between an application written in C++ on the Linux operating
system and another written in COBOL on the Windows system. Traditional
communication protocols based on COM, DCOM, and CORBA models are tightly
coupled, which makes development of the integration procedures not only very
difficult, but also inefficient and costly [5]. By contrast, the SOAP
communication protocol is straightforward to implement because most
software development packages already offer libraries that support this
technology. As a result, it is easy to communicate between DM tools and the
DM toolbox using these protocols. UDDI is another technology that enables
building flexible DM toolboxes. By using it we can build online toolboxes that can
dynamically search, access, and use DM tools that are published as Web services.
OLE DB-DM is a technology that allows the use of DM algorithms within the
existing DBMS products while avoiding problems of interfacing between DM
tools and the DBMSs.
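
The loose coupling described above can be illustrated with a minimal sketch that uses Python's standard xmlrpc.client module. It shows only the XML-RPC wire format, with no network transport: the toolbox encodes a call as XML text, and the tool decodes it, runs, and encodes the result. The tool name dm.column_mean, the TOOLS registry, and the handle_request helper are hypothetical names chosen for illustration; they are not part of any DM toolbox mentioned in the text.

```python
import xmlrpc.client

# Hypothetical DM tool: any function that takes and returns plain data would do.
def column_mean(values):
    return sum(values) / len(values)

# Hypothetical registry mapping published method names to tools.
TOOLS = {"dm.column_mean": column_mean}

def handle_request(request_xml):
    """Tool side: decode the XML request, run the tool, encode the XML response."""
    params, method_name = xmlrpc.client.loads(request_xml)
    result = TOOLS[method_name](*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)

# Toolbox side: the call is plain XML text, so the caller needs no knowledge
# of the language or platform the tool is implemented on.
request_xml = xmlrpc.client.dumps(([2.0, 4.0, 9.0],), methodname="dm.column_mean")
response_xml = handle_request(request_xml)
(result,), _ = xmlrpc.client.loads(response_xml)
print(result)
```

Because both sides exchange only XML text, either one could be replaced by an implementation in another language without changing the other. A real deployment would send the request over HTTP rather than through a direct function call.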
These technologies can, and we think will, be used to support all stages of the
DMKD process. Figure 1.5 shows the DMKD model based on these technologies,
which supports semiautomation of the DMKD process.
The database and knowledge database can be stored using a single DBMS
that supports XML, because the PMML used to store the knowledge complies
with the XML format. We separate the two to underscore the difference in format
and functionality of the information they store. The database is used to store and
query the data. All of the DMKD steps, however, can store information and
communicate using the knowledge database. The advantages of implementing the
knowledge database are: automation of knowledge storage and retrieval, sharing
of the discovered knowledge between different domains, and support for
semiautomation of two DMKD steps: understanding the data and preparation of
the data. The architecture shown in Figure 1.5 has the advantage of supporting the
iterative and interactive aspects of the DMKD process. It simply makes sense to
support the entire DMKD process rather than only a single DM step.
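
As a rough sketch of the idea that one DBMS can hold both the data and the knowledge, the example below keeps a data table and a knowledge table in the same in-memory SQLite database. Several assumptions apply: SQLite has no native XML type, so the knowledge is stored as plain text and stands in for an XML-capable DBMS; the PMML fragment is simplified (no namespace, no header, not validated against the PMML schema); and the table names, field names, and the build_data_dictionary helper are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def build_data_dictionary(fields):
    """Build a simplified PMML-style document describing the data fields."""
    root = ET.Element("PMML", version="4.4")
    dictionary = ET.SubElement(root, "DataDictionary", numberOfFields=str(len(fields)))
    for name, optype, data_type in fields:
        ET.SubElement(dictionary, "DataField", name=name, optype=optype, dataType=data_type)
    return ET.tostring(root, encoding="unicode")

conn = sqlite3.connect(":memory:")
# The database: stores and queries the data itself.
conn.execute("CREATE TABLE patients (age REAL, outcome TEXT)")
# The knowledge database: stores XML documents describing what is known about the data.
conn.execute("CREATE TABLE knowledge (name TEXT PRIMARY KEY, pmml TEXT)")

conn.executemany("INSERT INTO patients VALUES (?, ?)", [(54.0, "yes"), (37.0, "no")])
pmml_text = build_data_dictionary(
    [("age", "continuous", "double"), ("outcome", "categorical", "string")]
)
conn.execute("INSERT INTO knowledge VALUES (?, ?)", ("patients_dictionary", pmml_text))

# A later DMKD step (for example, data preparation) retrieves the stored knowledge.
(stored,) = conn.execute(
    "SELECT pmml FROM knowledge WHERE name = ?", ("patients_dictionary",)
).fetchone()
field_types = {
    field.get("name"): field.get("dataType")
    for field in ET.fromstring(stored).iter("DataField")
}
print(field_types)
```

Keeping the two tables separate mirrors the distinction drawn in the text: the data table answers queries about records, while the knowledge table holds metadata that any step of the process can read or update.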