Web Content Management - High-Performance Web Databases: Design, Development, and Deployment - page 92

in importance as the tools for creating content, especially those that will store the content in XML format, become more sophisticated.

The third category of functionality is publishing. This could cover publishing to any medium, including paper, personal devices, or the Web. Simple content is rarely published by itself; usually, a document is published. Therefore, one of the first steps in publishing is assembling the constituent components of the document. Assembly differs from the management of the aggregated components: management involves maintaining a set of pointers, whereas publishing involves assembling copies of the content into a unified whole. The next step after assembly is applying presentation formatting, which is done through a template. To apply the formatting and reproduce the formatted content in a meaningful way, the publishing system must understand, to some degree, the unstructured data within each type of content component. As discussed previously, understanding unstructured data can be complicated.
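The distinction between managing pointers and assembling copies can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `REPOSITORY` dictionary, the component ids, the `TEMPLATES` mapping, and the `assemble` and `publish` functions are hypothetical and do not come from any particular content management product.

```python
import copy
from string import Template

# Hypothetical content repository: component id -> content component.
REPOSITORY = {
    "title": {"type": "text", "body": "Quarterly Report"},
    "summary": {"type": "text", "body": "Revenue grew in all regions."},
}

# Management view: the document is only an ordered set of pointers (ids).
DOCUMENT = ["title", "summary"]

# Hypothetical presentation templates, one per component type.
TEMPLATES = {"text": Template("<p>$body</p>")}


def assemble(pointers, repository):
    """Resolve each pointer and copy the content into a unified whole."""
    return [copy.deepcopy(repository[p]) for p in pointers]


def publish(pointers, repository, templates):
    """Assemble copies of the components, then apply presentation formatting."""
    components = assemble(pointers, repository)
    return "\n".join(
        templates[c["type"]].substitute(body=c["body"]) for c in components
    )


html = publish(DOCUMENT, REPOSITORY, TEMPLATES)
```

Because `assemble` works on copies, changes made to the assembled document during formatting would not alter the managed components, which is one plausible reason to keep the two steps separate.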

Publishing to the Web could almost be a category unto itself. In addition to the functionality for general publishing, publishing to the Web can require special hardware and software for Internet scalability. If more than one Web server is involved, special software may be required to synchronize them. Depending on the objective of providing the content over the Web, various application servers may also be involved. For example, there may be commerce servers for buying and selling, or personalization servers that give individuals a unique personal experience on a particular Web site. These application servers will require some integration with their indexing schemas.

As demonstrated, both content and documents are composed of unstructured data. Unstructured data usually requires special tools, algorithms, and methodologies to manage it effectively. Standard IT tools such as relational databases are not effective by themselves.

All database management systems need to understand the content of the data in order to generate the indices that are used to store and retrieve the data. Because the computer cannot understand the content of unstructured data at the operating system level, it cannot generate the indices. Other tools, in addition to those provided in the standard IT toolkit, are required. One of the most common is metadata, or data describing the data (content). Metadata, however, needs to be generated by some intelligence that has at least a partial understanding of the meaning of the content. Metadata is stored externally as structured data in a relational database management system. The computer then uses this structured data, together with pointers to the content in the form of access paths provided by the operating system file access method, to manage the information stored in the unstructured portion.
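The arrangement described above, structured metadata in a relational database pointing at unstructured content in the file system, might look like the following minimal Python sketch. It assumes SQLite as a stand-in for the relational database management system; the `content_metadata` table, its columns, the sample file, and the `find_content` function are hypothetical names chosen for illustration.

```python
import os
import sqlite3
import tempfile

# Store an unstructured content item as an ordinary file.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "press_release.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Acme announces a new product line.")

# Metadata lives externally as structured data in a relational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content_metadata ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT, author TEXT, keyword TEXT,"
    " file_path TEXT NOT NULL)"  # pointer to the content (access path)
)
# The database can index the metadata even though it cannot index the content.
conn.execute("CREATE INDEX idx_keyword ON content_metadata (keyword)")
conn.execute(
    "INSERT INTO content_metadata (title, author, keyword, file_path)"
    " VALUES (?, ?, ?, ?)",
    ("Product launch", "J. Doe", "announcement", path),
)


def find_content(keyword):
    """Look up metadata by keyword, then follow each pointer to the file."""
    rows = conn.execute(
        "SELECT file_path FROM content_metadata WHERE keyword = ?", (keyword,)
    ).fetchall()
    results = []
    for (file_path,) in rows:
        with open(file_path, encoding="utf-8") as f:
            results.append(f.read())
    return results


found = find_content("announcement")
```

In this sketch the metadata values are typed in by hand, which reflects the point that some intelligence with at least a partial understanding of the content has to supply them.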
