HTTP/1.1" 404 209
10.254.0.55 - - [29/Aug/2008:12:29:16 -0700] "GET /favicon.ico HTTP/1.1" 404 209
10.254.0.56 - - [29/Aug/2008:12:29:21 -0700] "GET /mapreduce HTTP/1.1" 301 236
10.254.0.57 - - [29/Aug/2008:12:29:21 -0700] "GET /develop/ HTTP/1.1" 200 2657
10.254.0.58 - - [29/Aug/2008:12:29:21 -0700] "GET /develop/images/gradient.jpg HTTP/1.1" 200 16624
10.254.0.59 - - [29/Aug/2008:12:29:27 -0700] "GET /manual/ HTTP/1.1" 200 7559
10.254.0.62 - - [29/Aug/2008:12:29:27 -0700] "GET /manual/style/css/manual.css HTTP/1.1" 200 18674
MapReduce
License: Apache License, Version 2.0
Activity: High
Purpose: A programming paradigm for processing big data
Official Page: https://hadoop.apache.org
Hadoop Integration: Fully Integrated
MapReduce was the first programming framework for developing applications in Hadoop, and it
remains the primary one. To use MapReduce in its original, pure form, you'll need to work in Java.
A good place to start is WordCount, the “Hello, world” program of Hadoop. The code comes with
all the standard Hadoop distributions. Here's the problem WordCount solves: you have a dataset
that consists of a large set of documents, and the goal is to produce a list of all the words and
the number of times each appears in the dataset.
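To make the WordCount problem concrete, here is a minimal sketch in plain Java that runs on a single machine and uses no Hadoop APIs. It is not the WordCount code that ships with Hadoop. The class name WordCountSketch and the methods mapDocument, reducePairs, and countWords are hypothetical names chosen for illustration, and the tokenizing rule (lowercase, split on anything that is not a letter, digit, or apostrophe) is an assumption. One step turns each document into (word, 1) pairs, and a second step sums the pairs for each word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-process imitation of the WordCount idea.
// None of these names come from Hadoop.
public class WordCountSketch {

    // First step: lowercase one document, split it into words,
    // and emit a (word, 1) pair for every word occurrence.
    static List<Map.Entry<String, Integer>> mapDocument(String document) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Assumed tokenizing rule: anything other than a-z, 0-9, or ' separates words.
        for (String token : document.toLowerCase(Locale.ROOT).split("[^a-z0-9']+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Second step: add up the values emitted for each distinct word.
    static Map<String, Integer> reducePairs(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word for stable output
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Run both steps over a whole set of documents.
    static Map<String, Integer> countWords(List<String> documents) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String document : documents) {
            allPairs.addAll(mapDocument(document));
        }
        return reducePairs(allPairs);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countWords(List.of("The dog ate the food.", "The food was good."));
        counts.forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In a real Hadoop job the two steps would run as separate tasks spread across the cluster; this sketch only shows the shape of the data flowing between them.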
MapReduce jobs consist of Java programs called mappers and reducers. Orchestrated by the
Hadoop software, each mapper is given chunks of data to analyze. Let's assume a mapper
gets a sentence: “The dog ate the food.” It would emit five name-value pairs, or maps:
“the”:1, “dog”:1, “ate”:1, “the”:1, and “food”:1. The name in the name-value pair is the