Knowledge of a good scripting language such as Python, Ruby, or shell scripting greatly
helps an administrator. Administrators are often asked to set up scheduled file staging
from an external source to HDFS. Scripting skills let them fulfill these requests by
writing scripts and automating their execution.
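As an illustration of such a staging job, here is a minimal Python sketch that copies every file from a local landing directory into a dated HDFS directory by calling the standard `hdfs dfs -mkdir -p` and `hdfs dfs -put -f` commands. The function names, the paths, and the dated directory layout are assumptions made for this example, not something the text prescribes.

```python
import subprocess
from datetime import date
from pathlib import Path


def build_put_command(local_file, hdfs_dir):
    """Return the `hdfs dfs -put` argument list for one file."""
    # -f overwrites an existing destination file, so a rerun does not fail.
    return ["hdfs", "dfs", "-put", "-f", str(local_file), hdfs_dir]


def stage_directory(local_dir, hdfs_base, run_date=None, dry_run=False):
    """Copy every regular file in local_dir into a dated HDFS directory.

    Returns the list of commands that were (or, with dry_run, would be) run.
    The <hdfs_base>/<YYYY-MM-DD> layout is an assumed convention.
    """
    run_date = run_date or date.today()
    hdfs_dir = f"{hdfs_base.rstrip('/')}/{run_date:%Y-%m-%d}"

    # Create the target directory first; -p makes this a no-op if it exists.
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file():
            commands.append(build_put_command(path, hdfs_dir))

    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the hdfs CLI fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `stage_directory("/data/landing", "/staging/incoming", dry_run=True)` (both paths hypothetical) returns the commands without touching the cluster, which is useful for checking the job before a scheduler such as cron runs it for real.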
Above all, the Hadoop administrator should have a very good understanding of the
Apache Hadoop architecture and its inner workings.
The following are some of the key Hadoop-related operations that the Hadoop
administrator should know:
• Planning the cluster, deciding on the number of nodes based on the estimated
amount of data the cluster is going to serve.
• Installing and upgrading Apache Hadoop on a cluster.
• Configuring and tuning Hadoop using the various configuration files available
within Hadoop.
• Understanding all the Hadoop daemons along with their roles and
responsibilities in the cluster.
• Reading and interpreting Hadoop logs.
• Adding and removing nodes in the cluster.
• Rebalancing data across the nodes in the cluster.
• Securing the cluster using authentication and authorization mechanisms, for
example Kerberos for authentication.
• Performing backups. Almost all organizations have a policy of backing up their
data, and it is the administrator's responsibility to carry it out, so an administrator
should be well versed in the backup and recovery operations of servers.
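The first item in the list, sizing the cluster from the estimated data volume, can be sketched as a back-of-the-envelope calculation. The function below is only an illustration: its name, its parameters, and the 25% headroom reserved for temporary and intermediate data are assumptions of this example. The default replication factor of 3 matches the HDFS default for `dfs.replication`.

```python
import math


def estimate_datanodes(raw_data_tb, usable_disk_per_node_tb,
                       replication_factor=3, headroom=0.25):
    """Estimate how many DataNodes are needed to hold raw_data_tb of data.

    headroom is the fraction of each node's disk kept free for temporary
    and intermediate data (an assumed planning margin, not a Hadoop setting).
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    # Every block is stored replication_factor times across the cluster.
    required_tb = raw_data_tb * replication_factor
    # Only part of each node's disk is available for HDFS blocks.
    effective_per_node_tb = usable_disk_per_node_tb * (1 - headroom)
    return math.ceil(required_tb / effective_per_node_tb)


# 100 TB of data on nodes with 24 TB of disk each:
# 100 * 3 = 300 TB stored, 24 * 0.75 = 18 TB usable per node.
print(estimate_datanodes(100, 24))  # prints 17
```

A real plan would also account for expected data growth, compression, and the CPU and memory needs of the workload, none of which this sketch models.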