2 Linked Data and the Semantic Web
2.1 INTRODUCTION
This chapter gives an overview of the Semantic Web mission to add meaning to the
World Wide Web, and it introduces the main concepts involved. To set the topic in
context, we cover the early history of the Semantic Web as it developed from the
World Wide Web. Its emergence was rooted in the need to provide more meaningful
search results based on a real understanding of what the Web page was about: its
semantics. In the early days, there was a tendency to focus on modeling high-level
abstract concepts in an attempt to create generalized models of the world, often
using first- or higher-order expressive logics. These attempts to “model the whole
world” ran into problems of scope creep and complexity, so later work focused on
the development and use of tractable subsets of first-order logic and ontologies that
were designed with a specific purpose in mind.
This chapter explains the main benefits of the Semantic Web, including its use
in data integration and repurposing, classification, and control. We also explain the
relationship between the Semantic Web and Linked Data, preparing the reader for
further chapters that cover the process of publishing information as Linked Data and
authoring Semantic Web domain descriptions known as ontologies.
2.2 FROM A WEB OF DOCUMENTS TO A WEB OF KNOWLEDGE
The fundamental unit of the World Wide Web is the document. Each Web page is
basically a document that is connected to other documents via hyperlinks. A user
searches for information or finds the answer to a question by reading a Web page that
they hope contains the information sought. This Web page will have been retrieved
by a search engine, which ranks the relevance of the pages by analyzing the links
between them. One example of this is Google's PageRank algorithm (Brin and Page,
1998), which measures the relative importance of a page within the set of all Web
pages. This importance measure depends first on the number of links to the page
and second on the PageRanks of the pages that display those incoming hyperlinks.
Therefore, a page that is pointed to by many pages that themselves have a high
PageRank will also earn a high rank. A document's PageRank is the probability
that a Web user clicking on links at random will arrive at that document. What this
means, then, is that there is no understanding within the search engine about what
knowledge that Web page contains, and algorithms based on link analysis cannot
distinguish between homonyms, that is, different senses of the same word. For
example, a search for the word bank will return information about either a
financial bank or a river bank, whichever sense is more popular and more heavily
linked to.
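To make the link-analysis idea concrete, the following is a minimal sketch of a PageRank-style calculation using simple repeated iteration. It is an illustration under stated assumptions, not the algorithm as deployed by any search engine. The function name pagerank, the four-page toy_web graph, the damping value of 0.85 (a commonly used value, not given in this chapter), and the even redistribution of rank from pages with no outgoing links are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank-style scores by repeated iteration.

    links maps each page name to the list of pages it links to.
    damping is the assumed probability that the random surfer follows
    a link rather than jumping to a page chosen uniformly at random.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "home" is linked to by every other page,
# while nothing links to "blog".
toy_web = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "blog": ["home"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph, the page with the most incoming links from well-ranked pages ends up with the highest score, and the page with no incoming links ends up with only the baseline share. Nothing in the calculation looks at the words on any page, which is why a ranking of this kind cannot tell a financial bank from a river bank.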