    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
Running a Distributed MapReduce Job
The same program will run, without alteration, on a full dataset. This is the point of
MapReduce: it scales to the size of your data and the size of your hardware. Here's one
data point: on a 10-node EC2 cluster running High-CPU Extra Large instances, the pro-