fundamental to the operation of Hadoop. Similarly, MapReduce currently
provides both the scheduling and execution engines, as well as the
programming model, for the whole of Hadoop. Without these two projects
there simply is no Hadoop.
In this next section, we are going to delve a little deeper into these core
Hadoop projects to build up our knowledge of the main building blocks.
Once we've done that, we'll be well placed to move forward with the next
section, which will touch on some of the other projects in the Hadoop
ecosystem.
HDFS
HDFS, one of the core components of Apache Hadoop, stands for Hadoop
Distributed File System. There's no exotic branding to be found here. HDFS
is a Java-based, distributed, fault-tolerant file storage system designed for
distribution across a number of commodity servers. These servers have been
configured to operate together as an HDFS cluster. By leveraging a scale-out
model, HDFS ensures that it can support truly massive data volumes at a
low and linear cost point.
Before diving into the details of HDFS, it is worth taking a moment to
discuss the files themselves. Files created in HDFS are made up of a number
of HDFS data blocks, or simply HDFS blocks. These blocks are not small.
They are 64MB or more in size, which allows for larger I/O sizes and in turn
greater throughput. Each block is replicated and then distributed across the
machines of the HDFS cluster.
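To make the block mechanics concrete, here is a minimal sketch using the
Hadoop Java file system API. The path /user/demo/example.dat and the
specific values are illustrative assumptions, not taken from the text; the
create() overload shown lets a client choose the replication factor and
block size on a per-file basis rather than relying on the cluster defaults.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Hypothetical path; replication and block size are set
            // per file here: 3 replicas, 64 MB blocks.
            Path file = new Path("/user/demo/example.dat");
            FSDataOutputStream out =
                    fs.create(file, true, 4096, (short) 3, 64L * 1024 * 1024);
            try {
                out.writeUTF("one small record");
            } finally {
                out.close();
            }
        }
    }

Anything written to this stream is carved into 64 MB blocks by HDFS, and
each block is then replicated three times across the cluster's machines.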
HDFS is built on three core subcomponents:
• NameNode
• DataNode
• Secondary NameNode
Simply put, the NameNode is the “brain.” It is responsible for managing
the file system, and therefore is responsible for allocating directories and
files. The NameNode also manages the blocks, which reside on the
DataNodes. There is only one NameNode per HDFS cluster.
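As a small illustration of the NameNode's role as the metadata "brain,"
the sketch below asks HDFS where the blocks of a file live. The query is
answered entirely from the NameNode's metadata; no block data is read, and
the hosts returned are the DataNodes holding each replica. The file path
is again a hypothetical placeholder.

    import java.util.Arrays;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WhereAreMyBlocks {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // Hypothetical file; substitute any path in your cluster.
            Path file = new Path("/user/demo/example.dat");
            FileStatus status = fs.getFileStatus(file);

            // Answered from NameNode metadata alone.
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                System.out.println("offset " + block.getOffset()
                        + ", length " + block.getLength()
                        + ", hosts " + Arrays.toString(block.getHosts()));
            }
        }
    }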
The DataNodes are the workers, sometimes known as slaves. The
DataNodes perform the bidding of the NameNode. DataNodes exist on every
machine in the cluster, and they are responsible for offering up the