Databases Reference
Figure 4.4: Freebase - ontology-based data (www.freebase.com).
Because this information is structured, it is available for machine processing. The DBpedia
project has exploited precisely this, creating an RDF repository from the Wikipedia infoboxes. This
repository can then be mixed with other online data or queried on its own using the RDF query
language SPARQL (see Figure 4.6).
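
To make the idea of querying structured triples concrete, the following is a minimal sketch in Python of the kind of pattern matching that an RDF query performs. It is not SPARQL and does not use DBpedia; the `ex:` identifiers, the sample triples, and the helper names `match` and `subjects_in_country` are all assumptions made up for illustration.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative triples only; these identifiers are hypothetical and are
# not taken from DBpedia or any real vocabulary.
TRIPLES: List[Triple] = [
    ("ex:Lancaster", "ex:country", "ex:England"),
    ("ex:Lancaster", "ex:type", "ex:City"),
    ("ex:Tiree", "ex:country", "ex:Scotland"),
    ("ex:Tiree", "ex:type", "ex:Island"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern.

    A value of None acts as a wildcard, playing roughly the role that
    a variable plays in a query pattern.
    """
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


def subjects_in_country(triples: List[Triple], country: str) -> List[str]:
    """Return the subjects whose ex:country property equals `country`."""
    return sorted(s for s, _, _ in match(triples, p="ex:country", o=country))


if __name__ == "__main__":
    print(subjects_in_country(TRIPLES, "ex:Scotland"))
```

Because every fact has the same subject-predicate-object shape, one generic matching function serves any question that can be phrased as a pattern; a real RDF store adds indexing, joins across several patterns, and a standard query syntax on top of this idea.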
Not all data are textual or numerical - maps and geographic information have become
increasingly important on the web. Whereas GIS (geographic information systems) used to be a very
specialised area of database systems, the combination of mapping mash-ups such as Google Maps
and GPS availability on mobile phones and digital cameras has led to map-focused interactions
becoming commonplace. Some of this information comes from traditional mapping and GIS data
sources; however, some comes from more 'web-like' sources: Google finds addresses in web pages
and OpenStreetMap uses a Wikipedia-like model, inviting anyone to add features to its maps (see
Figure 4.7), which are then available to all through a Creative Commons licence.
4.1.4 WEB TECHNOLOGY AND THE WEB OF DATA
The web has spawned many technologies, but most significant for data management are XML, RDF,
and (currently) to a lesser extent OWL.
Both XML and RDF were technologies originally designed for one purpose and then
appropriated for another. XML began as a text markup notation, a restricted variant of SGML, which
was itself originally designed for the publishing industry. Whereas SGML documents required a data
description (a DTD) to be parsed, XML was designed to be syntactically self-describing, making it
more robust in the open web environment. One example of this is the hierarchical structure of tags.
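
As a small illustration of what syntactic self-description means in practice, the sketch below parses an XML fragment with Python's standard `xml.etree.ElementTree` module, supplying no DTD at all, and recovers the tag hierarchy from the document alone. The sample document and the helper names `outline` and `is_well_formed` are assumptions invented for this example.

```python
import xml.etree.ElementTree as ET
from typing import List

# A hypothetical document; no DTD or schema accompanies it.
DOC = """<book>
  <title>Example Title</title>
  <chapter number="4">
    <section>Web technology</section>
    <section>The web of data</section>
  </chapter>
</book>"""


def outline(element: ET.Element, depth: int = 0) -> List[str]:
    """Return the tag names as an indented outline, one line per element."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


def is_well_formed(text: str) -> bool:
    """Report whether the text parses as XML without any external description."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOC))))
    # Tags closed in the wrong order are rejected by the parser.
    print(is_well_formed("<a><b></a></b>"))
```

The parser needs no prior knowledge of which tags exist or how they may nest: explicit, properly nested open and close tags are enough to rebuild the tree, and a document whose tags do not nest is rejected outright.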