Fig. 11.20. Aerial view of Google headquarters, the Googleplex, in Mountain
View, California. The roofs on the buildings are covered with solar panels.
Sergey Brin and Larry Page (B.11.10), that formed the basis for the success of
Google (Fig. 11.20). The PageRank algorithm was a step-by-step procedure to
calculate an “importance score” for each web page. Instead of just looking at the
content and structure of web pages, Page and Brin analyzed the hypertext link
structure. By combining their importance score from the link analysis with
the content score from traditional indexes, Brin and Page developed a search
engine that almost magically delivered the most useful websites to users.
Larry Page was intrigued by AltaVista's information about the hypertext
links and decided that analyzing the structure of the link data could be
valuable. To do this, he started downloading as much as possible of the entire World
Wide Web to his computer (Fig. 11.21). Meanwhile, Sergey Brin, with his Stanford
adviser, Rajeev Motwani, had been investigating the search engines and
directories available at the time. Page and Brin then teamed up to pursue
Page's goal of downloading as many web pages as possible and analyzing their
link structure. Coming from an academic family, Page had the idea that the
number of links to a web page was similar to the citation count of a scientific
paper. The number of times other authors cite a paper is a significant indicator
of the importance of the research described in the paper. However, Page realized
that just counting the number of links pointing to a web page does not give the
full measure of the importance of the page. Just as citations to a scientific paper
from Nobel Prize recipients are more significant than citations by ordinary
mortals, so too are links to a web page more significant when they come from an
important or authoritative site. The journalist and author David Vise describes
Page's idea as follows:
Fig. 11.21. Larry Page and Sergey Brin's
first server at Stanford was encased in
Lego blocks.
B.11.10. Eric Schmidt, Sergey Brin,
and Larry Page (left to right) shown
answering questions in 2008. While
PhD students at Stanford, Sergey and
Larry came up with the idea of ranking
the web pages based on their link
structure. They described their ideas
in a much-cited paper “The Anatomy
of a Large-Scale Hypertextual Web
Search Engine.” After unsuccessfully
trying to sell their ideas to AltaVista
and Yahoo!, the two leading web
search companies at that time, they
set up Google, beginning in the traditional
Silicon Valley garage.
All links were not created equal. Some mattered more than others. He would
give greater weight to incoming links from important sites. How would he
decide what sites were important? The sites with the most links pointing to
them, quite simply, were more important than sites with fewer links. 19
In a play on his last name, Page called his new algorithm PageRank.
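The two counting steps in the quotation can be tried on a toy link graph. The following is a minimal sketch, not the PageRank algorithm itself: the seven-page graph, the page names, the function names, and the choice to weight each link as 1 plus the link count of its source page are all assumptions made for illustration.

```python
# Hypothetical toy web: each key is a page, each value lists the pages it links to.
links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["x"],
    "lonely": ["y"],
    "x": [],
    "y": [],
}


def incoming_link_counts(links):
    """Count how many links point to each page (the citation-count analogy)."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def weighted_scores(links):
    """Weight each incoming link by the importance of the page it comes from.

    Illustrative assumption: a link is worth 1 plus the number of links
    pointing to its source, so a link from a much-linked page counts for more.
    """
    counts = incoming_link_counts(links)
    scores = {page: 0 for page in links}
    for source, targets in links.items():
        for target in targets:
            scores[target] += 1 + counts[source]
    return scores


raw = incoming_link_counts(links)
weighted = weighted_scores(links)
print(raw["x"], raw["y"])            # both pages have one incoming link
print(weighted["x"], weighted["y"])  # the link from "hub" counts for more
```

In this sketch, pages "x" and "y" are tied on raw link count, but "x" scores higher once its single link is weighted by the importance of "hub".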
How do you calculate the PageRank score of a given web page? If we
assign an initial “authority” of 1 to each web page, we can calculate the
accumulated authority of a given page by adding up the authorities of all the
web pages that point to it. Unfortunately, the graph of web page links may
contain a “cycle”: by clicking on web links you eventually get back
to the starting point (see Fig. 11.22). Authority then feeds back around the
loop, so the running totals for the sites in the cycle keep growing and never
settle on a final score. To avoid this problem, Page and Brin
introduced a “random surfer.” Imagine a surfer roaming the web following
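The cycle problem, and one common way a random-surfer term resolves it, can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation: the four-page graph, the function names, the damping value of 0.85 (the assumed probability that the surfer follows a link instead of jumping to a random page), and the assumption that every page has at least one outgoing link are not taken from the text above.

```python
# Hypothetical graph: A -> B -> C -> A forms a cycle, and D links into it.
links = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]}


def naive_authority(links, rounds):
    """Repeatedly add up the authorities of the pages pointing to each page.

    Every page keeps its initial authority of 1 and adds whatever its
    in-linking pages held in the previous round.
    """
    authority = {page: 1.0 for page in links}
    for _ in range(rounds):
        updated = {page: 1.0 for page in links}
        for source, targets in links.items():
            for target in targets:
                updated[target] += authority[source]
        authority = updated
    return authority


def random_surfer_rank(links, damping=0.85, rounds=100):
    """Iterate a damped rank: follow a link with probability `damping`,
    otherwise jump to a page chosen uniformly at random.

    Assumes every page has at least one outgoing link.
    """
    n = len(links)
    rank = {page: 1.0 / n for page in links}
    for _ in range(rounds):
        updated = {page: (1.0 - damping) / n for page in links}
        for source, targets in links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                updated[target] += share
        rank = updated
    return rank


print(naive_authority(links, 10)["A"], naive_authority(links, 20)["A"])
print(random_surfer_rank(links))
```

With the naive rule, the score of page A is still climbing after 20 rounds, whereas the damped scores stay bounded (they sum to 1) and stop changing after enough rounds.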
 