interface can also be used to monitor the cluster, configuring notification
alerts for health and performance conditions. Job diagnostic information
is also surfaced in the web UI, helping users better understand job
interdependencies, historic performance, and system trends.
Finally, Ambari can integrate with other third-party monitoring
applications via its RESTful API. So when I say it is the system center of
Hadoop, it literally is!
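To make the REST integration a little more concrete, here is a minimal sketch of how a monitoring tool might prepare a request to Ambari. The host name, cluster name, credentials, port 8080, and the `/api/v1/clusters/<name>` path are all assumptions for illustration, so check them against your own Ambari installation. The helper `build_ambari_request` is hypothetical, and it only builds the request; it does not send it.

```python
import base64
import urllib.request


def build_ambari_request(host, cluster, user, password, port=8080):
    """Build (but do not send) a GET request for a cluster resource.

    The URL layout and basic authentication are assumptions for this sketch.
    """
    url = f"http://{host}:{port}/api/v1/clusters/{cluster}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    req = build_ambari_request("ambari.example.com", "demo", "admin", "admin")
    print(req.get_method(), req.full_url)
```

A third-party tool would pass the prepared request to `urllib.request.urlopen` and parse the JSON body that comes back.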
Oozie
Oozie is a Java web application that schedules workflows for Hadoop. A single
job on its own rarely defines a business process. More often than not,
there is a chain of events or processes that must be initiated
and completed for the result to have meaning. It is Oozie's lot in life to
provide this functionality. Simply put, Oozie can be used to compose a single
container/unit of work from a collection of jobs, scripts, and programs.
Anyone who has worked with enterprise schedulers will find this familiar territory.
Oozie takes these units of work and can schedule them accordingly.
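The idea of composing one unit of work from several jobs can be sketched in a few lines. This is not Oozie's own format (real Oozie workflows are defined in XML); it is a toy model, with the hypothetical names `Action` and `run_workflow`, of a chain in which each step names where to go on success and where to go on failure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Action:
    """One step in the chain. All names here are illustrative only."""
    name: str
    run: Callable[[], bool]      # returns True on success, False on failure
    ok_to: str                   # next node when the step succeeds
    error_to: str = "kill"       # next node when the step fails


def run_workflow(actions: Dict[str, Action], start: str) -> List[str]:
    """Follow the ok/error transitions from `start` until 'end' or 'kill'."""
    trail: List[str] = []
    current = start
    while current not in ("end", "kill"):
        action = actions[current]
        succeeded = action.run()
        trail.append(f"{action.name}:{'ok' if succeeded else 'error'}")
        current = action.ok_to if succeeded else action.error_to
    trail.append(current)
    return trail


if __name__ == "__main__":
    workflow = {
        "ingest": Action("ingest", lambda: True, ok_to="transform"),
        "transform": Action("transform", lambda: True, ok_to="end"),
    }
    print(run_workflow(workflow, "ingest"))
```

The point of the sketch is that the result only has meaning once the whole chain reaches its end node; a failure in any step routes the entire unit of work to the kill node.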
It is important to understand that Oozie is a trigger mechanism. It submits
jobs and related actions, but MapReduce is the executor. Consequently, Oozie must
also solicit status information for actions that it has requested. Therefore,
Oozie has callback and polling mechanisms built in to provide it with job
status/completion information.
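The split between trigger and executor, and the need to solicit status, can be illustrated with a small sketch. Everything here is hypothetical: `FakeExecutor` stands in for the cluster, the status strings are assumed for illustration, and `callback_inbox` is simply a list that a callback would append to. The loop prefers a callback if one has arrived and falls back to polling otherwise.

```python
import time
from typing import Callable, List

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED"}  # assumed status names


class FakeExecutor:
    """Hypothetical executor: reports RUNNING a few times, then SUCCEEDED."""

    def __init__(self, polls_until_done: int):
        self._remaining = polls_until_done

    def status(self) -> str:
        if self._remaining > 0:
            self._remaining -= 1
            return "RUNNING"
        return "SUCCEEDED"


def wait_for_completion(get_status: Callable[[], str],
                        callback_inbox: List[str],
                        poll_interval: float = 0.0,
                        max_polls: int = 10) -> str:
    """Return the final status, using a callback if present, else polling."""
    for _ in range(max_polls):
        if callback_inbox:                 # the executor called back
            return callback_inbox.pop(0)
        status = get_status()              # otherwise ask for the status
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_interval)
    return "TIMED_OUT"


if __name__ == "__main__":
    executor = FakeExecutor(polls_until_done=2)
    print(wait_for_completion(executor.status, callback_inbox=[]))
```

Having both paths matters because a callback can be lost; polling gives the trigger a way to find out what happened anyway.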
Zookeeper
Distributed applications use Zookeeper to help manage and store
configuration information. Zookeeper is interesting because it steps away
from the master/slave model seen in other areas of Hadoop and is itself
a highly distributed architecture and consequently highly available. What
is notable is that it achieves this while providing a “single view of the
truth” for the configuration data that it holds. Zookeeper is
responsible for managing and mediating potentially conflicting updates to
this information to ensure consistency across the cluster. Those
of you who are familiar with managing complex merge replication
topologies know that this is no trivial task!
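One common way to mediate conflicting updates is to attach a version number to each piece of data and reject any write based on a stale version. The sketch below is a single-process, in-memory toy that illustrates that idea only; `VersionedConfigStore` and `VersionConflict` are hypothetical names, and this is not the Zookeeper client API or its distributed protocol.

```python
from typing import Dict, Tuple


class VersionConflict(Exception):
    """Raised when a writer's expected version is out of date."""


class VersionedConfigStore:
    """Toy store: every path holds a value and a version counter."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, int]] = {}

    def create(self, path: str, value: str) -> None:
        if path in self._data:
            raise KeyError(f"{path} already exists")
        self._data[path] = (value, 0)

    def get(self, path: str) -> Tuple[str, int]:
        """Return (value, version) for the path."""
        return self._data[path]

    def set(self, path: str, value: str, expected_version: int) -> int:
        """Write only if the caller saw the latest version; return the new one."""
        _, current = self._data[path]
        if expected_version != current:
            raise VersionConflict(
                f"{path}: expected version {expected_version}, found {current}")
        self._data[path] = (value, current + 1)
        return current + 1


if __name__ == "__main__":
    store = VersionedConfigStore()
    store.create("/app/timeout", "30")
    _, seen_by_a = store.get("/app/timeout")
    _, seen_by_b = store.get("/app/timeout")
    store.set("/app/timeout", "45", seen_by_a)      # first writer wins
    try:
        store.set("/app/timeout", "60", seen_by_b)  # stale version, rejected
    except VersionConflict as err:
        print("rejected:", err)
```

The second writer has to re-read the data and decide again, which is how everyone keeps seeing a single view of the truth.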