Information Technology Reference
In-Depth Information
the proximity of the search terms to one another' (Blachman and Peek,
2007). Google's use of external links to assess relevance is particularly
noteworthy, and while it is far from a foolproof system, it is much more
effective than traditional relevance ranking, which is usually based on
the prevalence of search terms. To illustrate the difference: when
processing a search for the terms 'westminster', 'abbey' and 'london', a
search engine using traditional relevance ranking will identify as many
websites as possible containing those terms, and will place at the top of
the result set those websites that contain the most repetitions of those
three words. Google will also gather as many websites as possible, but
will place at the top of the result set those sites to which many other
sites have linked. This method is far better than prevalence ranking at
identifying sites that other websites treat as authoritative. So, while a
search for 'westminster', 'abbey' and 'london' on a traditional search
engine will identify the site that repeats those terms most often, Google
will show you the site most widely regarded as authoritative for those
terms - which almost guarantees that the top search result will be the
official website of Westminster Abbey in London.
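
The contrast between the two ranking methods can be sketched in a few lines of Python. This is a toy illustration only: the four pages, their text and their links are invented, and a raw count of inbound links stands in for Google's far more elaborate link analysis, which this sketch does not attempt to reproduce.

```python
# Toy corpus: page name -> text and outbound links. All data is invented
# for illustration; none of it describes real websites.
PAGES = {
    "official-abbey": {
        "text": "westminster abbey london official site visiting hours",
        "links": [],
    },
    "keyword-stuffer": {
        "text": "westminster abbey london westminster abbey london "
                "westminster abbey london",
        "links": ["official-abbey"],
    },
    "travel-blog": {
        "text": "a day in london including westminster abbey",
        "links": ["official-abbey"],
    },
    "history-forum": {
        "text": "westminster abbey history thread london",
        "links": ["official-abbey", "travel-blog"],
    },
}

QUERY = ["westminster", "abbey", "london"]


def matching_pages(pages, query):
    """Return the names of pages whose text contains every query term."""
    return [
        name for name, page in pages.items()
        if all(term in page["text"].split() for term in query)
    ]


def rank_by_prevalence(pages, query):
    """Traditional ranking: most repetitions of the query terms first."""
    def score(name):
        words = pages[name]["text"].split()
        return sum(words.count(term) for term in query)
    return sorted(matching_pages(pages, query), key=score, reverse=True)


def rank_by_inbound_links(pages, query):
    """Link-based ranking (simplified): most inbound links first."""
    inbound = {name: 0 for name in pages}
    for name, page in pages.items():
        for target in page["links"]:
            if target in inbound and target != name:
                inbound[target] += 1
    return sorted(matching_pages(pages, query),
                  key=lambda name: inbound[name], reverse=True)


print(rank_by_prevalence(PAGES, QUERY)[0])     # keyword-stuffer
print(rank_by_inbound_links(PAGES, QUERY)[0])  # official-abbey
```

In this invented corpus, prevalence ranking rewards the page that merely repeats the three words, while the link count surfaces the page that the other pages point to.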
The reason for explaining, however simplistically and partially, the
logic behind Google's ranking strategies is to compare them with the
organizational strategies used by libraries. Libraries traditionally take a
completely different approach: whereas the researcher using Google
begins by interrogating an astronomically huge number of documents,
a subset of which is then organized by Google into an ingeniously
prioritized search result, libraries gather selected documents first, then
allow patrons to search the selection indirectly by means of proxy
documents (catalogue records). In the traditional library, documents not
owned by the library are not available for searching and the documents
owned by the library are not searchable directly; only their proxies can
be searched, and only by means of library-specified criteria such as
author, title, standardized subject heading or (more recently) keyword.
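
The library model described above, in which patrons search proxy records rather than the documents themselves, can be sketched as follows. This is a minimal illustration under stated assumptions: the two holdings, their authors, titles and subject headings are invented, and the names `CatalogueRecord`, `Holding` and `search_catalogue` are hypothetical rather than taken from any real library system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CatalogueRecord:
    """The proxy document: the only thing a patron can search."""
    author: str
    title: str
    subjects: List[str]


@dataclass
class Holding:
    record: CatalogueRecord
    full_text: str  # the document itself; the search never reads it


# Invented holdings, for illustration only.
HOLDINGS = [
    Holding(
        CatalogueRecord("Example, A.", "A History of Westminster Abbey",
                        ["Westminster Abbey", "Church architecture"]),
        "The coronation chair has stood in the abbey for centuries.",
    ),
    Holding(
        CatalogueRecord("Sample, B.", "Walking Tours of London",
                        ["London (England)", "Guidebooks"]),
        "The route passes Westminster Abbey before crossing the river.",
    ),
]

# Library-specified search criteria, each mapped to the record fields it covers.
CRITERIA = {
    "author": lambda r: [r.author],
    "title": lambda r: [r.title],
    "subject": lambda r: list(r.subjects),
    "keyword": lambda r: [r.author, r.title, *r.subjects],
}


def search_catalogue(holdings, criterion, term):
    """Return titles whose catalogue record matches; full text is ignored."""
    if criterion not in CRITERIA:
        raise ValueError(f"unsupported search criterion: {criterion}")
    fields_for = CRITERIA[criterion]
    term = term.lower()
    return [
        h.record.title for h in holdings
        if any(term in value.lower() for value in fields_for(h.record))
    ]


print(search_catalogue(HOLDINGS, "title", "abbey"))
print(search_catalogue(HOLDINGS, "keyword", "coronation"))
```

The second search returns nothing even though one document discusses a coronation chair, because the word appears only in the full text and not in any catalogue record. Likewise, a keyword search for 'westminster' finds only the first holding, although the second mentions the abbey in its text.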
Before the mass migration of information production into the digital
realm and the large-scale digitization of existing analogue documents,
the traditional library approach was the
only feasible one. There was simply no way to search millions of pages of
printed text without actually reading them - and while reading a lot of
books might be a very good way to gain command of a broad corpus of