8
Putting It Together:
MapReduce Data Pipelines
It's kind of fun to do the impossible.
—Walt Disney
Human brains aren't very good at keeping track of millions of separate data points,
but we know that there is lots of data out there, just waiting to be collected, analyzed,
and visualized. To cope with the complexity, we create metaphors to wrap our heads
around the problem. Do we need to store millions of records until we figure out what
to do with them? Let's file them away in a data warehouse. Do we need to analyze a billion
data points? Let's crunch them down into something more manageable.
No longer should we be satisfied with just storing data and chipping away little bits
of it to study. Now that distributed computational tools are becoming more accessible
and cheaper to use, it's more and more common to think about data as a dynamic
entity, flowing from a source to a destination. In order to really gain value from our
data, it needs to be transformed from one state to another and sometimes literally
moved from one physical location to another. It's often useful to examine the
state of data while it is in transit from one state to another. In order to get data
from here to there, just like transporting water, we need to build pipelines.
What Is a Data Pipeline?
At my local corner store, there is usually only one person working at any given time.
This person runs the register, stocks items, and helps people find things. Overall, the
number of people who come into this clerk's store is fairly manageable, and the clerk
doesn't usually get overwhelmed (unless there is a run on beer at 1:55 a.m.). If I were
to ask the shopkeeper to keep track of how many customers came in that day, I am
pretty sure that not only would this task be manageable, but I could even get an
accurate answer.
The corner store is convenient, but sometimes I want to buy my shampoo and
potato chips in gigantic sizes that will last me half the year. Have you ever shopped in
 
 
 
 