JobHistoryServer: This service serves historical information about completed jobs.
JobHistoryServer can be embedded within the JobTracker process. On an extremely busy
cluster, it is recommended that you run it as a separate service. This can be done by setting
the mapreduce.history.server.embedded property to false in the mapred-site.xml file.
Running this service consumes considerable disk space because it saves job history information
for all jobs.
Note
In Hadoop versions 2.0 and beyond, MapReduce is replaced by YARN, or MapReduce 2.0 (also known
as MRv2). YARN is a subproject of Hadoop at the Apache Software Foundation that was introduced in Hadoop 2.0.
It separates the resource-management and processing components, providing a more generalized processing platform
that is not restricted to just MapReduce.
Configuration Files
There are two key configuration files that have the various parameters for MapReduce jobs. These files are located in
the path C:\apps\dist\hadoop-1.2.0.1.3.1.0-06\conf\ of the NameNode:
core-site.xml
mapred-site.xml
core-site.xml
This file contains configuration settings for Hadoop Core, such as I/O settings that are common to Windows Azure
Storage Blob (WASB) and MapReduce. It is used by all Hadoop services and clients because all services need to
know how to locate the NameNode. There will be a copy of this file in each node running a Hadoop service. This file
has several key elements of interest—particularly because the storage infrastructure has moved to WASB instead
of being in Hadoop Distributed File System (HDFS), which used to be local to the data nodes. For example, in your
democluster, you should see entries in your core-site.xml file similar to Listing 13-1.
Listing 13-1. WASB detail
<property>
  <name>fs.default.name</name>
  <!-- cluster variant -->
  <value>wasb://democlustercontainer@democluster.blob.core.windows.net</value>
  <description>The name of the default file system. Either the
  literal string "local" or a host:port for NDFS.</description>
  <final>true</final>
</property>
If there is an issue with accessing your storage that is causing your jobs to fail, the core-site.xml file is the first
place where you should confirm that your cluster is pointing toward the correct storage account and container.
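The check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the HDInsight tooling: the helper names read_property and parse_wasb_uri are hypothetical, and the embedded sample simply mirrors Listing 13-1. In practice you would read the text of your own core-site.xml instead of the sample string.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample that mirrors Listing 13-1; replace with the contents of your own file.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>wasb://democlustercontainer@democluster.blob.core.windows.net
    </value>
    <final>true</final>
  </property>
</configuration>"""


def read_property(xml_text, name):
    """Return the whitespace-stripped value of a Hadoop property, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


def parse_wasb_uri(uri):
    """Split wasb://container@account.blob.core.windows.net into (container, account)."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("wasb", "wasbs"):
        raise ValueError("not a WASB URI: " + uri)
    if not parsed.username or not parsed.hostname:
        raise ValueError("expected container@account-host in: " + uri)
    # The storage account name is the first label of the host name.
    return parsed.username, parsed.hostname.split(".")[0]


if __name__ == "__main__":
    uri = read_property(SAMPLE_CORE_SITE, "fs.default.name")
    container, account = parse_wasb_uri(uri)
    print("container:", container)
    print("storage account:", account)
```

Comparing the printed container and storage account against the values you expect for the cluster is a quick way to rule out a misdirected default file system.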
The core-site.xml file also has a property for the storage key, as shown in Listing 13-2. If you encounter
502/403 (Forbidden/Authentication) errors while accessing your storage, make sure that the proper storage
account key is provided.
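As a rough aid for this check, the following Python sketch looks for a key entry that matches a given storage host and confirms that its value at least looks like a base64 string. Several things here are assumptions rather than facts taken from the text: the property naming pattern fs.azure.account.key.<storage host>, the idea that the key is stored as plain base64 (some clusters store it in encrypted form, in which case this heuristic does not apply), and the helper names. The key in the sample is a placeholder, not a real credential.

```python
import base64
import binascii
import xml.etree.ElementTree as ET

# Assumed naming pattern for the key property; confirm it against your own file.
KEY_PROPERTY_TEMPLATE = "fs.azure.account.key.{host}"

# Placeholder value only, so that the sample is self-contained.
PLACEHOLDER_KEY = base64.b64encode(b"not-a-real-key").decode("ascii")

SAMPLE_CORE_SITE = f"""<configuration>
  <property>
    <name>fs.azure.account.key.democluster.blob.core.windows.net</name>
    <value>{PLACEHOLDER_KEY}</value>
  </property>
</configuration>"""


def load_properties(xml_text):
    """Return a dict of property name -> stripped value from a Hadoop config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name:
            props[name] = prop.findtext("value", default="").strip()
    return props


def check_storage_key(props, host):
    """Return a list of problems found with the key entry for the given host."""
    key_name = KEY_PROPERTY_TEMPLATE.format(host=host)
    key = props.get(key_name)
    if not key:
        return [f"missing or empty property: {key_name}"]
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(key, validate=True)
    except (binascii.Error, ValueError):
        return [f"{key_name} is not valid base64"]
    return []


if __name__ == "__main__":
    problems = check_storage_key(
        load_properties(SAMPLE_CORE_SITE), "democluster.blob.core.windows.net"
    )
    print(problems or "key entry looks well-formed")
```

A well-formed entry does not prove that the key is the current one for the account; it only rules out a missing or truncated value before you compare it with the key shown in the storage account's management portal.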
 
 