Validation - Web Standards: Mastering HTML5, CSS3, and XML

Extracting Semantic Content

Semantic content of web sites can be checked with the W3C Semantic Data Extractor [31]. It can extract semantic data such as the following:

• Generic metadata
    • Title, author, and description provided in the document head
    • RDFa metadata embedded in the document body (also generated in RDF/XML)
• Related resources
    • Linked files, for example, RSS or Atom news feeds
    • Glossary, copyright, and bookmarkable points provided in the document head
• Outline of the document
• Quotes and citations

Menu points and URIs are provided with hyperlinks.
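
To make the kinds of data listed above more concrete, the following is a minimal sketch of how head metadata, linked resources, and a heading outline might be pulled out of a page with Python's standard `html.parser` module. It is not how the W3C Semantic Data Extractor is implemented; the class name `HeadMetadataParser`, the helper `extract`, and the sample document (including the author name and URLs) are all hypothetical and exist only for illustration. RDFa processing and quote extraction are deliberately left out.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Hypothetical helper: collects title, named <meta> values, <link>s, headings."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}        # e.g. {"author": "...", "description": "..."}
        self.links = []       # (rel, type, href) tuples
        self.outline = []     # (heading level, heading text) tuples
        self._capture = None  # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name, content = attrs.get("name"), attrs.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content
        elif tag == "link":
            rel, href = attrs.get("rel"), attrs.get("href")
            if rel and href:
                self.links.append((rel.lower(), attrs.get("type"), href))
        elif tag == "title" or tag in self.HEADINGS:
            self._capture = tag
            self._buffer = []

    def handle_data(self, data):
        if self._capture is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return
        # Collapse runs of whitespace inside the collected text.
        text = " ".join("".join(self._buffer).split())
        if tag == "title":
            self.title = text
        else:
            self.outline.append((int(tag[1]), text))
        self._capture = None


# Made-up sample document; every value here is an assumption for the demo.
SAMPLE = """<!DOCTYPE html>
<html>
<head>
  <title>Sample page</title>
  <meta name="author" content="Jane Doe">
  <meta name="description" content="A short demo document">
  <link rel="alternate" type="application/rss+xml" href="/feed.rss">
  <link rel="copyright" href="/copyright.html">
</head>
<body>
  <h1>Main heading</h1>
  <h2>First   section</h2>
</body>
</html>"""


def extract(html_text):
    """Parse html_text and return the populated parser object."""
    parser = HeadMetadataParser()
    parser.feed(html_text)
    parser.close()
    return parser


result = extract(SAMPLE)
print(result.title)
print(result.meta)
print(result.links)
print(result.outline)
```

The event-based parser is used here because it ships with Python and tolerates imperfect markup; a real extractor would also need to handle RDFa attributes, nested headings inside sectioning elements, and character encodings.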

Another comprehensive semantic data extraction tool is the Sindice Web Data Inspector at http://inspector.sindice.com [32]. The tool can extract RDF triples from markup, RDF/XML, Turtle, or N3 documents provided either by URI or by direct input. Sindice Web Data Inspector can be used for retrieving semantic data (Inspect button), for combined semantic data extraction and validation (Inspect + Validate button), or for ontology analysis and reasoning (Figure 14-14).

Figure 14-14. Comprehensive options on the start screen of Sindice Web Data Inspector
