Metadata and the Semantic Web - Web Standards: Mastering HTML5, CSS3, and XML


Along with these benefits, there are several open issues that need further investigation and, in some cases,

the development of new approaches. The largest challenge of Semantic Web applications is to resolve semantic

data quality problems and identify useful and meaningful information [17]. There are more and more promising

approaches; however, they have a common feature: all rely on standard annotations, taxonomies, vocabularies, and

ontologies. We analyze these essential technologies and their features throughout the chapter from a standardization

point of view.

Structured Data

Data should be structured so that it can be processed and searched by data type. Structured data is data

organized so that each item is identifiable. Such data has been used for decades in computing, for example, in the form

of Access and SQL databases, where queries can be performed to retrieve information (for example, a ZIP code). In

contrast to relational databases, most data on the Web is stored in (X)HTML documents that contain unstructured data.
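
As a rough illustration of the contrast, the sketch below stores a ZIP code in a relational table and retrieves it with a query, next to the same fact written as free text. The table name, column names, and sample rows are hypothetical and appear only for this example.

```python
import sqlite3

# Hypothetical table and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, zip_code TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("Alice", "Springfield", "62701"), ("Bob", "Portland", "97201")],
)


def zip_code_of(name):
    """Return the ZIP code stored for the given customer, or None."""
    row = conn.execute(
        "SELECT zip_code FROM customers WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


# Structured: the ZIP code is an identifiable field that a query can address.
print(zip_code_of("Alice"))

# Unstructured: the same fact in running text has no field to query,
# so software would have to guess where the ZIP code is.
unstructured = "Alice lives in Springfield; her ZIP code is 62701."
```

The query works because the ZIP code lives in a named column; nothing comparable identifies it inside the sentence.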

Conventional web documents contain large amounts of unstructured data that can be rendered in web browsers.

This approach works satisfactorily for publishing purposes; however, a large amount of data stored in, or associated

with, web documents cannot be processed this way. According to Berners-Lee, the data used to describe social

connections between people is a good example of that kind of data [18]: “The Web is more a social creation than a

technical one. I designed it for a social effect—to help people work together—and not as a technical toy. The ultimate

goal of the Web is to support and improve our weblike existence in the world. We clump into families, associations,

and companies. We develop trust across the miles and distrust around the corner. What we believe, endorse, agree

with, and depend on is representable and, increasingly, represented on the Web. We all have to ensure that the society

we build with the Web is of the sort we intend.”

On the Semantic Web, there is a variety of structured data, usually expressed in, or based on, the Resource

Description Framework (RDF), which will be described later in detail. Similar to conventional conceptual modeling

approaches, such as class diagrams and entity relationships, the RDF data model is based on statements that

describe and characterize resources, especially web resources, in the form of subject-predicate-object expressions. The

subject corresponds to the resource. The predicate expresses a relationship between the subject and the object. Such

expressions are called triples.

For example, the statement “The grass is green” can be expressed in an RDF triple as follows:

• Subject: “The grass”
• Predicate: “is”
• Object: “green”
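
The subject-predicate-object breakdown above can be sketched as a small data structure. This is only an in-memory illustration of the triple shape, not an RDF implementation; the `Triple` type and the `objects_of` helper are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A statement split into the three parts described in the text."""
    subject: str
    predicate: str
    object: str


# The statement "The grass is green" as a single triple.
statement = Triple(subject="The grass", predicate="is", object="green")


def objects_of(triples, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [
        t.object
        for t in triples
        if t.subject == subject and t.predicate == predicate
    ]


print(objects_of([statement], "The grass", "is"))
```

Because every statement has the same three-part shape, a collection of triples can be filtered by any of its parts without knowing in advance what the statements are about.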

RDF is an abstract model that has several serialization formats. Consequently, the syntax of the triple varies from

format to format (see later in the section “Resource Description Framework”).
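
To make the difference between model and syntax concrete, the sketch below writes the statement “The grass is green” in two RDF serializations, N-Triples and Turtle. The `example.org` IRIs and the `hasColor` property are assumptions made for this example, since the text names no identifiers, and the helper functions are hypothetical. They do no escaping of special characters, so they are suitable only for simple values like the ones shown.

```python
# Hypothetical identifiers under example.org; the text does not name any.
EX = "http://example.org/"
SUBJECT = EX + "grass"
PREDICATE = EX + "hasColor"
OBJECT = "green"  # a plain string literal


def to_ntriples(s, p, o):
    """N-Triples: one statement per line, full IRIs in angle brackets."""
    return f'<{s}> <{p}> "{o}" .'


def to_turtle(s, p, o, prefix="ex", namespace=EX):
    """Turtle: the same statement, with a prefix shortening the IRIs."""
    def short(iri):
        # Replace the namespace part of the IRI with the declared prefix.
        return f"{prefix}:{iri[len(namespace):]}"

    return f'@prefix {prefix}: <{namespace}> .\n{short(s)} {short(p)} "{o}" .'


print(to_ntriples(SUBJECT, PREDICATE, OBJECT))
print(to_turtle(SUBJECT, PREDICATE, OBJECT))
```

Both outputs encode the same triple; only the notation differs.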

Caution: RDF is a data representation model, not a language like XML.

The authors of the “conventional” Web usually publish unstructured data, because they do not know about

the power of structured data, find RDF too complex, or do not know how to create and publish RDF in any of its

serialization formats. The following are solutions to the problem that add structured data to conventional (X)HTML

markup, which can be extracted by appropriate software and converted to RDF:

• Microformats, which reuse markup attributes
• Microdata, which extends HTML5 markup with structured metadata
• RDFa, which expresses RDF in markup attributes that are not part of (X)HTML vocabularies
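
The extraction step mentioned above can be sketched for the Microdata case using only the Python standard library. The HTML snippet, the `itemprop` names, and the `#grass` subject identifier are hypothetical. The collector is a deliberate simplification that ignores nested items and `itemtype`, so it should be read as an outline of the idea rather than a conforming Microdata processor.

```python
from html.parser import HTMLParser

# Hypothetical snippet: the itemprop names are invented for illustration.
HTML = (
    '<div itemscope>'
    '<span itemprop="name">Grass</span> is '
    '<span itemprop="color">green</span>.'
    '</div>'
)


class ItempropCollector(HTMLParser):
    """Collects the text of each element that carries an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop name of the element being read

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        # Only text inside an itemprop element is recorded.
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract_triples(html, subject):
    """Turn each itemprop/value pair into a (subject, predicate, object) tuple."""
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return [(subject, prop, value) for prop, value in collector.properties.items()]


for triple in extract_triples(HTML, "#grass"):
    print(triple)
```

The page still renders as ordinary text in a browser, while the attributes give software enough structure to recover statements that could then be converted to RDF.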
