like the local job runner, these allow testing against the full HDFS, MapReduce, and
YARN machinery. Bear in mind, too, that node managers in a mini-cluster launch separate
JVMs to run tasks in, which can make debugging more difficult.
TIP
You can run a mini-cluster from the command line too, with the following:
% hadoop jar \
$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-*-tests.jar \
minicluster
Mini-clusters are used extensively in Hadoop's own automated test suite, but they can be
used for testing user code, too. Hadoop's ClusterMapReduceTestCase abstract
class provides a useful base for writing such a test, handles the details of starting and
stopping the in-process HDFS and YARN clusters in its setUp() and tearDown() methods,
and generates a suitable Configuration object that is set up to work with them.
Subclasses need only populate data in HDFS (perhaps by copying from a local file), run a
MapReduce job, and confirm the output is as expected. Refer to the
MaxTemperatureDriverMiniTest class in the example code that comes with this topic for the
listing.
Tests like this serve as regression tests, and are a useful repository of input edge cases and
their expected results. As you encounter more test cases, you can simply add them to the
input file and update the file of expected output accordingly.
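
The comparison against a file of expected output can be sketched with nothing but the Java standard library. This is a minimal illustration, not the approach used in the example code: the class name ExpectedOutputCheck and its methods are hypothetical, and it assumes the job's output has already been copied out of HDFS into a single local text file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: compares job output with a file of expected results.
public class ExpectedOutputCheck {

    // Returns the 1-based number of the first line that differs,
    // or -1 if both lists hold exactly the same lines in the same order.
    public static int firstMismatch(List<String> expected, List<String> actual) {
        int common = Math.min(expected.size(), actual.size());
        for (int i = 0; i < common; i++) {
            if (!expected.get(i).equals(actual.get(i))) {
                return i + 1;
            }
        }
        // If one side has extra lines, the mismatch is the first line
        // past the end of the shorter side.
        return expected.size() == actual.size() ? -1 : common + 1;
    }

    // Convenience overload that reads both files as UTF-8 text.
    public static int firstMismatch(Path expectedFile, Path actualFile)
            throws IOException {
        return firstMismatch(
            Files.readAllLines(expectedFile, StandardCharsets.UTF_8),
            Files.readAllLines(actualFile, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: ExpectedOutputCheck <expected> <actual>");
            return;
        }
        int line = firstMismatch(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(line == -1
            ? "output matches expected results"
            : "first difference at line " + line);
    }
}
```

A test could assert that firstMismatch returns -1. Adding an edge case then means appending a record to the input file and the corresponding line to the expected-output file, with no change to the test code. Note that a line-by-line comparison like this assumes the job writes its output in a deterministic order.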