Figure 13.4 Amazon EMR job parameters
Table 13.1 Amazon EMR Job Parameters
Parameter/Argument  Description                                                              Value
Input Location      The Amazon S3 bucket containing the input data for your processing job   bluewatersql/input
Output Location     The Amazon S3 bucket to which the results are written                    bluewatersql/output
Mapper              The Amazon S3 bucket location and name of the mapper job                 bluewatersql/wordsplitter.py
Reducer             The Amazon S3 bucket location and name of the reducer job                aggregate
Extra Args          Any additional arguments that are required for the Hadoop Streaming job
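To make the Mapper and Reducer values in Table 13.1 more concrete, here is a minimal sketch of what a word-splitting streaming mapper might look like. The contents of wordsplitter.py are not shown in this section, so the function names map_line and run_mapper and the word pattern are assumptions made for illustration. The sketch relies on the convention of Hadoop Streaming's built-in aggregate reducer, which expects each mapper record to be prefixed with an aggregator name such as LongValueSum.

```python
import io
import re
import sys

# Assumed definition of a "word": a letter followed by letters or digits.
WORD_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")


def map_line(line):
    """Return one record per word, in the form the aggregate reducer expects.

    The "LongValueSum:" prefix asks the aggregate reducer to sum the
    integer values emitted for each key (here, each lowercased word).
    """
    return ["LongValueSum:" + word.lower() + "\t1"
            for word in WORD_PATTERN.findall(line)]


def run_mapper(stream, out):
    """Read lines from `stream` and write mapper records to `out`."""
    for line in stream:
        for record in map_line(line):
            out.write(record + "\n")


if __name__ == "__main__":
    # In a real streaming job this would be run_mapper(sys.stdin, sys.stdout).
    # A small in-memory sample is used here so the sketch runs on its own.
    sample = io.StringIO("Big data, big results\n")
    run_mapper(sample, sys.stdout)
```

Because the reducer is set to aggregate, no reducer script has to be uploaded to the bucket; the summing is done by Hadoop Streaming itself.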
5. Now you are ready to specify the number of EC2 instances that form the
nodes within your Hadoop cluster. You need to configure three instance
types:
Master node: The master (head) node assigns and coordinates work
among the core and task nodes. Only a single master node instance
is created.
 
 