Cloud service providers have very different interfaces for working with their services, so once you've developed some automation around the process of building and tearing down these test clusters, you've effectively locked yourself in to a single service provider. Apache Whirr provides a standard mechanism for working with a handful of different service providers. This allows you to easily change cloud providers or to share configurations with other teams that do not use the same cloud provider.
The most basic building block of Whirr is the instance template. Instance templates define the role an instance will play; for example, there are templates for the Hadoop JobTracker, ZooKeeper, and HBase region servers. Recipes are one step up the stack from templates and define a cluster. For example, a recipe for a simple data-processing cluster might call for deploying a Hadoop NameNode, a Hadoop JobTracker, a couple of ZooKeeper servers, an HBase master, and a handful of HBase region servers.
Tutorial Links
The official Apache Whirr website provides a couple of excellent tutorials. The "Whirr in 5 minutes" tutorial provides the exact commands necessary to spin up and shut down your first cluster. The quick-start guide is a little more involved, walking through what happens during each stage of the process.
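As a rough sketch of what those tutorials cover, the basic lifecycle looks something like the following; the recipe filename here is only a placeholder, and the exact path to the whirr binary depends on how Whirr is installed:
# Generate a passwordless SSH keypair for Whirr to use when logging in to the nodes
$ ssh-keygen -t rsa -P ''
# Launch the cluster described in a recipe file, then tear it down when finished
$ bin/whirr launch-cluster --config recipe.properties
$ bin/whirr destroy-cluster --config recipe.properties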
Example Code
In this case, we're going to deploy the simple data-processing cluster we described earlier to an Amazon EC2 account we've already established.
The first step is to build our recipe file (we'll call this file field_guide.properties):
# field_guide.properties
# The name we'll give this cluster,
# this gets communicated with the cloud service provider
whirr.cluster-name = field_guide
# Because we're just testing
# we'll put all the masters on one single machine
# and build only three worker nodes
whirr.instance-templates = \
    1 zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master,\
    3 hadoop-datanode+hadoop-tasktracker+hbase-regionserver
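The layout above only describes the cluster's shape. To actually reach the EC2 account mentioned earlier, the recipe also needs provider and credential settings; a minimal sketch follows, with placeholder values standing in for your own AWS access key and secret key:
# Deploy to Amazon EC2; other cloud providers use different provider identifiers
whirr.provider = aws-ec2
# Credentials for the cloud account; for EC2 these are the
# AWS access key ID and secret access key (placeholders shown here)
whirr.identity = <your AWS access key ID>
whirr.credential = <your AWS secret access key>
With those properties in place, the cluster can be brought up and torn down with the launch-cluster and destroy-cluster commands shown earlier, pointed at field_guide.properties.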