Database Reference
semiautomate the DMKD process, several technologies are necessary:
• a data repository that stores the data, background knowledge, and models;
• protocols for data and information exchange between data repositories and
DM tools and between different DM tools; and
• standards for describing the data and models.
XML technology has been studied and used extensively in recent years,
along with other technologies built on top of XML, such as PMML, XML-RPC,
SOAP, and UDDI. Together they can provide solutions to the problem of
semiautomating the DMKD process. In what follows, these technologies are
introduced and their applications within the DMKD process are described. In
addition, technologies such as OLAP and OLE-DB DM and their impact on the
DMKD process are discussed.
1.3.1. XML
XML is a markup language for documents that contain structured information.
Structured information consists of content (numbers, character strings, images, etc.)
and information about the role that content plays, i.e., the context of the information
(e.g., a rule is built out of selectors, and a selector is a pair of attributes (name and
value)). XML defines a standard way to add markup to documents and thereby
identify the structures in them.
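The rule-and-selector example above can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. The element names (rule, selector) and the attribute values are hypothetical, chosen only to illustrate how markup attaches a role to each piece of content; they are not a format defined by the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag and attribute names, used for illustration only.
rule = ET.Element("rule")
for name, value in [("age", "over 40"), ("smoker", "yes")]:
    # Each selector is a (name, value) attribute pair, as described above.
    ET.SubElement(rule, "selector", {"name": name, "value": value})

# Serialize the in-memory tree to an XML string.
rule_xml = ET.tostring(rule, encoding="unicode")
print(rule_xml)
```

The content ("age", "over 40") is carried in the attributes, while the surrounding selector and rule elements state what role that content plays.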
XML is primarily used to create, share, and process information. XML
enables users to define tags (element names) that are specific to a particular
purpose. XML tags are used to describe the meaning or context of the data in a
precisely defined manner. These information-modeling features are what made
XML popular: thanks to them, XML documents can be processed
automatically.
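As a minimal sketch of such automatic processing, the following Python fragment reads a small document with purpose-specific tags and recovers its structure without any hand-written parsing logic. The tag names (ruleset, rule, selector), the class attribute, and the helper read_rules are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A small document using user-defined (hypothetical) tags.
DOCUMENT = """
<ruleset>
  <rule class="high_risk">
    <selector name="age" value="over 40"/>
    <selector name="smoker" value="yes"/>
  </rule>
  <rule class="low_risk">
    <selector name="smoker" value="no"/>
  </rule>
</ruleset>
"""

def read_rules(xml_text):
    """Return a list of (class label, {attribute name: value}) pairs."""
    root = ET.fromstring(xml_text)
    rules = []
    for rule in root.findall("rule"):
        # Collect every selector of this rule into a name -> value mapping.
        selectors = {s.get("name"): s.get("value") for s in rule.findall("selector")}
        rules.append((rule.get("class"), selectors))
    return rules

rules = read_rules(DOCUMENT)
for label, selectors in rules:
    print(label, selectors)
```

Because the tags name the role of each value, the program can locate rules and selectors by name rather than by position in the text.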
XML technology is widely used in industry to transfer and share information.
An important practical advantage of XML is that current database
management systems (DBMS) support the XML standard. From the DMKD point
of view, this means that XML can be used as a transport medium between DM
tools and XML-based knowledge repositories, which store discovered
knowledge and information about the data and the DBMS that hold the data.
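The idea of XML as a transport medium between a DM tool and a knowledge repository can be illustrated with a short Python sketch. Here an in-memory SQLite table merely stands in for the repository; the table name, its columns, and the tag names are all hypothetical, and a real XML-native or XML-enabled DBMS would expose its own interface.

```python
import sqlite3
import xml.etree.ElementTree as ET

# DM-tool side: serialize a discovered rule to XML (hypothetical tag names).
rule = ET.Element("rule", {"id": "r1"})
ET.SubElement(rule, "selector", {"name": "smoker", "value": "yes"})
payload = ET.tostring(rule, encoding="unicode")

# Repository side: an in-memory SQLite table stands in for the knowledge repository.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE knowledge (rule_id TEXT PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO knowledge VALUES (?, ?)", (rule.get("id"), payload))
conn.commit()

# Another DM tool later fetches the stored document and parses it again.
(stored,) = conn.execute(
    "SELECT body FROM knowledge WHERE rule_id = ?", ("r1",)
).fetchone()
restored = ET.fromstring(stored)
print(restored.find("selector").attrib)
conn.close()
```

The XML string is the only thing exchanged between the two sides, so the producing and consuming tools need to agree only on the document structure, not on each other's internal data types.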
There are two major kinds of DBMS that can handle XML documents:
XML-native DBMS and XML-enabled DBMS:
• The majority of XML-native DBMS are based on a standard DB physical
storage model, such as relational, object-relational, or object-oriented, but they
use XML documents as the fundamental storage unit, just as a relational DBMS
uses tuples as its fundamental storage unit. Their main advantage lies in the
possibility of storing an XML document and then retrieving the same
document without losing any information, at both the structural and data levels
(not yet possible using XML-enabled DBMS). Two well-known
XML-native DBMS are Lore [49] and Tamino [62]. XML-native DBMS
can be divided into two groups: those created over the relational model (examples
include DBDOM, eXist, Xfinity, and XML Agent) and those created over the