A full explanation of these administration commands is beyond the scope of this chapter, but with the
dfsadmin command you can manage quotas, control an upgrade, refresh the nodes, and enter safe mode. See the
Hadoop site, hadoop.apache.org, for full information.
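As a brief illustration of the tasks just listed, the sketch below runs through a few common dfsadmin subcommands using the V2 form, hdfs dfsadmin (on V1 the equivalent is hadoop dfsadmin). The directory /user/hadoop/data and the quota of 1000 names are hypothetical values chosen for the example, and the DRY_RUN wrapper is only a convenience added here so that the commands are printed rather than executed by default. Treat it as a sketch under those assumptions, not a recommended procedure.

```shell
#!/bin/sh
# Sketch of common dfsadmin tasks. By default the commands are only printed;
# set DRY_RUN=0 to run them against a live cluster as the HDFS superuser.
DRY_RUN="${DRY_RUN:-1}"
QUOTA_DIR="${QUOTA_DIR:-/user/hadoop/data}"   # hypothetical HDFS directory

# Wrapper (an assumption of this sketch): print the command in dry-run mode,
# otherwise pass the arguments through to hdfs dfsadmin unchanged.
dfsadmin() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "hdfs dfsadmin $*"
    else
        hdfs dfsadmin "$@"
    fi
}

dfsadmin -report                       # summarize capacity and data node status
dfsadmin -safemode get                 # show whether the name node is in safe mode
dfsadmin -safemode enter               # enter safe mode (HDFS becomes read-only)
dfsadmin -safemode leave               # leave safe mode
dfsadmin -setQuota 1000 "$QUOTA_DIR"   # limit the directory to 1000 names
dfsadmin -clrQuota "$QUOTA_DIR"        # remove that name quota again
dfsadmin -refreshNodes                 # re-read the include/exclude host files
dfsadmin -finalizeUpgrade              # make a completed upgrade permanent
```

Running the script as is prints each command, which makes it easy to review before setting DRY_RUN=0. Note that finalizing an upgrade removes the ability to roll back, so that line in particular should be run deliberately.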
Summary
This chapter has introduced both Hadoop V1 and V2 in terms of their installation and use. You should
now see that using the CDH stack release greatly simplifies both installing and using Hadoop.
In the course of this chapter you installed Hadoop V1 manually from a package downloaded from the Hadoop
site. You then installed V2 and YARN via CDH packages and the yum command. In V2, the servers for HDFS and YARN are
started as Linux services rather than by scripts, as they were in V1. Also, in the CDH release, the logs, binaries, and
configuration are separated into their own specific directories.
You have seen the same Map Reduce task run on both versions of Hadoop. Task run times were
comparable between V1 and V2. However, V2 can support a larger production cluster than V1.
(In the following chapters you will look at Map Reduce programming in Java and Pig.)
You have also configured Hadoop V2 across a mini cluster with name nodes and data nodes on different servers.
You have installed and used ZooKeeper, setting up a quorum and using the client. (The next chapter discusses HBase,
the Hadoop database, which relies on ZooKeeper.)
Lastly, you have looked at the Hadoop command set: the file system commands and the user and administration
commands. It was only a brief look, but further information is available at the Hadoop website.
 