Architecture of the World Wide Web - Social Semantics: The Search for Meaning on the Web

Particular encodings and content are then accepted by or considered valid by the

syntax and semantics of a language respectively (and thus the normative importance

of standardization on the Web in determining these criteria). Also, we do not restrict

our use of the word 'language' to primarily linguistic forms, but use the term

'language' for anything where there is a systematic relationship between syntax

and (even an informal) semantics. For example, HTML is a language for mapping

a set of textual tags to renderings of bits on a screen in a web browser. One

principle used in the study of languages, attributed to Frege, is the principle of

compositionality, where the content of a sentence is related systematically to the terms

of which it is composed. Indeed, while it is still debated whether human languages

are truly compositional (Dowty 2007), computer languages almost always are

compositional. In English, the content of a sentence such as 'Tim has a plane

ticket to Paris so he should go to the airport!' can then be composed from the

more elementary content of the sub-statements, such as 'Tim has a plane ticket'

which in turn has its content impacted by words such as 'Tim' and 'ticket.' The

argument about whether sentences, words, or clauses are the minimal building

block of content is beyond our scope. Note that one result of the distinction between

encoding and content is that sentences accepted by the syntax (encoding) of

a language, such as Chomsky's famous “Colorless green ideas sleep furiously,” may

have no obvious interpretation (to content) outside of the pragmatics of Chomsky's

particular exposition (1957).

2.2.3

Uniform Resource Identifiers

The World Wide Web is defined by the AWWW as “an information space in

which the items of interest, referred to as resources, are identified by global

identifiers called Uniform Resource Identifiers (URI)” (Jacobs and Walsh 2004).

This naming scheme, not any particular language like HTML, is the primary

identifying characteristic of the Web. URIs arose from a need to organize the

“many protocols and systems for document search and retrieval” that were in use

on the Internet, especially considering that “many more protocols or refinements

of existing protocols are to be expected in a field whose expansion is explosive”

(Berners-Lee 1994a). Despite the “plethora of protocols and data formats,”

if any system was “to achieve global search and readership of documents across

differing computing platforms,” gateways that can “allow global access” should

“remain possible” (Berners-Lee 1994a). The obvious answer was to consider all

data on the Internet to be a single space of names with global scope.

URIs accomplish their universality over protocols by moving all the information

used by the protocol into the name itself. Everything needed to identify

protocol-specific information is specified in the name: the name of the

protocol, the port used by the protocol, any queries the protocol is responding to, and

the hierarchical structure used by the protocol. The Web is then first and foremost

a naming initiative “to encode the names and addresses of objects on the Internet”
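
The components listed above (protocol name, port, query, and hierarchical structure) can be read directly out of a URI. The sketch below uses Python's standard urllib.parse module. The example URI, the function name describe_uri, and the dictionary keys are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def describe_uri(uri):
    """Split a URI into the protocol-specific pieces carried in the name."""
    parts = urlsplit(uri)
    return {
        "protocol": parts.scheme,    # the name of the protocol
        "host": parts.hostname,      # the authority the name points at
        "port": parts.port,          # None when the URI gives no explicit port
        # the hierarchical structure, one entry per path segment
        "hierarchy": [segment for segment in parts.path.split("/") if segment],
        "query": parse_qs(parts.query),  # any query carried in the name
    }

# Hypothetical URI used only for this example.
info = describe_uri("http://www.example.org:8080/travel/tickets?to=Paris")
for key, value in info.items():
    print(key, value)
```

No protocol-specific client is needed to recover these pieces. They are all present in the name itself.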
