1
Introducing Hadoop
This chapter covers
The basics of writing a scalable, distributed data-intensive program
Understanding Hadoop and MapReduce
Writing and running a basic MapReduce program
Today, we're surrounded by data. People upload videos, take pictures on their
cell phones, text friends, update their Facebook status, leave comments around
the web, click on ads, and so forth. Machines, too, are generating and keeping
more and more data. You may even be reading this book as digital data on your
computer screen, and certainly your purchase of this book is recorded as data with
some retailer. 1
The exponential growth of data first presented challenges to cutting-edge
businesses such as Google, Yahoo, Amazon, and Microsoft. They needed to go
through terabytes and petabytes of data to figure out which websites were popular,
what topics were in demand, and what kinds of ads appealed to people. Existing
tools were becoming inadequate to process such large data sets. Google was the first
to publicize MapReduce, a system they had used to scale their data processing needs.
1 Of course, you're reading a legitimate copy of this, right?