HTML and CSS Reference
Along with these benefits, there are several open issues that need further investigation and, in some cases,
the development of new approaches. The largest challenge of Semantic Web applications is to resolve semantic
data quality problems and identify useful and meaningful information. There are more and more promising
approaches; however, they have a common feature: all rely on standard annotations, taxonomies, vocabularies, and
ontologies. We analyze these essential technologies and their features throughout the chapter from a standardization
point of view.
Data should be structured to support advanced processability and searchability by data type. Structured data is data
organized so that its individual elements are identifiable. Such data has been used in computing for decades, for
example in Access and SQL databases, where queries can be performed to retrieve a specific piece of information (for
example, a ZIP code). In contrast to relational databases, most data on the Web is stored in (X)HTML documents that
contain unstructured data.
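As a rough illustration of why structured data is easy to query, the following sketch uses Python's built-in sqlite3 module. The table name, column names, and sample record are hypothetical and serve only to show a ZIP code being retrieved by name.

```python
import sqlite3

# In-memory database; the "contacts" table and its record are made up
# for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, city TEXT, zip_code TEXT)")
conn.execute(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    ("Jane Doe", "Springfield", "62704"),
)


def zip_code_of(name):
    """Return the ZIP code stored for a contact, or None if there is no match."""
    row = conn.execute(
        "SELECT zip_code FROM contacts WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


print(zip_code_of("Jane Doe"))  # prints 62704
```

Because every value sits in a named column, the query can ask for exactly one field. The same ZIP code embedded in a paragraph of (X)HTML text would have to be guessed from its surroundings.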
Conventional web documents contain large amounts of unstructured data that can be rendered in web browsers.
This approach works satisfactorily for publishing purposes; however, a large amount of data stored in, or associated
with, web documents cannot be processed this way. According to Berners-Lee, the data used to describe social
connections between people is a good example of this kind of data: “The Web is more a social creation than a
technical one. I designed it for a social effect—to help people work together—and not as a technical toy. The ultimate
goal of the Web is to support and improve our weblike existence in the world. We clump into families, associations,
and companies. We develop trust across the miles and distrust around the corner. What we believe, endorse, agree
with, and depend on is representable and, increasingly, represented on the Web. We all have to ensure that the society
we build with the Web is of the sort we intend.”
On the Semantic Web, there is a variety of structured data, usually expressed in, or based on, the Resource
Description Framework (RDF), which will be described later in detail. Similar to conventional conceptual modeling
approaches, such as class diagrams and entity relationships, the RDF data model is based on statements that
describe and feature resources, especially web resources, in the form of subject-predicate-object expressions. The
subject corresponds to the resource. The predicate expresses a relationship between the subject and the object. Such
expressions are called triples.
For example, the statement “The grass is green” can be expressed in an RDF triple as follows:
Subject: “The grass”
Predicate: “has the color”
Object: “green”
RDF is an abstract model that has several serialization formats. Consequently, the syntax of the triple varies from
format to format (see later in the section “Resource Description Framework”).
RDF is a data representation model, not a language like XML.
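The subject-predicate-object model can be sketched in a few lines of Python. This is a minimal illustration, not a full RDF implementation: the Triple type, the to_ntriples helper, and the example.org URIs are assumptions introduced here. The output follows the shape of N-Triples, one of the line-based RDF serializations, and the object is treated as a plain string literal.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement; all three fields are plain strings here."""
    subject: str    # URI of the resource being described
    predicate: str  # URI naming the relationship
    obj: str        # literal value (this sketch supports only string literals)


def to_ntriples(t: Triple) -> str:
    """Serialize a triple as one N-Triples-style line with a literal object."""
    # Escape backslashes first, then double quotes, inside the literal.
    literal = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{literal}" .'


# "The grass is green", using hypothetical example.org identifiers.
grass = Triple(
    "http://example.org/grass",
    "http://example.org/hasColor",
    "green",
)
line = to_ntriples(grass)
print(line)
# prints: <http://example.org/grass> <http://example.org/hasColor> "green" .
```

The Triple value is independent of any particular syntax. Only to_ntriples knows about angle brackets and the trailing period, which mirrors the distinction between the abstract RDF model and its serialization formats.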
The authors of the “conventional” Web usually publish unstructured data, because they do not know about
the power of structured data, find RDF too complex, or do not know how to create and publish RDF in any of its
serialization formats. The following solutions address this problem by adding structured data to conventional (X)HTML
markup, which appropriate software can then extract and convert to RDF:
Microformats, which reuse markup attributes
Microdata, which extends HTML5 markup with structured metadata
RDFa, which expresses RDF in markup attributes that are not part of (X)HTML vocabularies
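To give a feel for how software might extract such annotations, the sketch below reads Microdata-style itemprop attributes from a markup fragment with Python's standard html.parser module and collects subject-predicate-object tuples. It is a deliberately simplified stand-in for a real Microdata processor: the ItempropExtractor class, the sample markup, and the example.org subject URI are hypothetical, and nested items, itemtype, and URL-valued properties are ignored.

```python
from html.parser import HTMLParser


class ItempropExtractor(HTMLParser):
    """Collect (subject, property, text) tuples from itemprop-annotated elements."""

    def __init__(self, subject):
        super().__init__()
        self.subject = subject  # assumed URI for the item being described
        self.triples = []
        self._prop = None       # itemprop of the element currently open, if any

    def handle_starttag(self, tag, attrs):
        # Remember the property name if this element carries itemprop.
        self._prop = dict(attrs).get("itemprop")

    def handle_data(self, data):
        text = data.strip()
        if self._prop and text:
            self.triples.append((self.subject, self._prop, text))
            self._prop = None

    def handle_endtag(self, tag):
        # Text after a closing tag no longer belongs to the property.
        self._prop = None


markup = (
    '<div itemscope>'
    '<span itemprop="name">Grass</span> is '
    '<span itemprop="color">green</span>.'
    '</div>'
)

extractor = ItempropExtractor("http://example.org/grass")
extractor.feed(markup)
extractor.close()

for triple in extractor.triples:
    print(triple)
```

The markup still renders as ordinary text in a browser. The attributes only add machine-readable hints, which is the idea that Microformats, Microdata, and RDFa share.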