Cloudware Application Development - Guide to Cloud Computing for Business and Technology Managers

FIGURE 17.1

Execution phases in a generic MapReduce application. Figure labels: job configuration, master, distributed file system, input data, data splits, mappers, intermediate results, reducers, final results, output data.

For more complex data processing procedures, multiple MapReduce calls may be linked together in sequence, with the output of one call serving as the input to the next.
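
Chaining MapReduce calls in sequence can be sketched as follows. This is a minimal in-memory illustration, not a distributed implementation: the `run_job` helper and the mapper and reducer names are hypothetical, and the sample documents are invented for the example. The first job counts words, and the second job takes those counts as its input and groups the words by frequency.

```python
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Run one in-memory MapReduce pass over (key, value) records (hypothetical helper)."""
    groups = defaultdict(list)
    # Map phase: every input record may emit any number of (key, value) pairs.
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce phase: process each key with all of its grouped values.
    # Keys are sorted only to make the output order deterministic.
    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results

# Job 1: count how often each word occurs.
def count_map(doc_id, text):
    for word in text.lower().split():
        yield word, 1

def count_reduce(word, ones):
    yield word, sum(ones)

# Job 2: group words by their occurrence count.
def invert_map(word, count):
    yield count, word

def invert_reduce(count, words):
    yield count, sorted(words)

docs = [("d1", "the cat sat"), ("d2", "the cat ran"), ("d3", "the end")]

# The output of job 1 is fed directly to job 2.
word_counts = run_job(docs, count_map, count_reduce)
words_by_count = run_job(word_counts, invert_map, invert_reduce)

print(words_by_count)
```

Because each job consumes and produces plain (key, value) records, any number of jobs can be composed this way.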

MapReduce programs can be used to compute derived data from documents, such as inverted indexes. The system parallelizes the processing automatically and executes it on large clusters of commodity machines, scaling to thousands of machines. Because the system takes care of details such as partitioning the input data, scheduling and executing tasks across the processing cluster, and managing communication between nodes, programmers with no experience in parallel programming can easily use a large distributed processing environment.
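
The inverted-index case can be sketched to show which steps the programmer writes and which the system normally handles. This is a single-process illustration under stated assumptions: the function names, the CRC-based partitioning rule, and the sample documents are all hypothetical, and the map and reduce tasks that run sequentially here would be scheduled in parallel on a real cluster. Only `map_split` and `reduce_partition` correspond to user code. The splitting and shuffling stand in for work the framework does automatically.

```python
import zlib
from collections import defaultdict

def split_input(documents, num_splits):
    """Partition the input documents into splits, one per mapper."""
    items = sorted(documents.items())
    return [items[i::num_splits] for i in range(num_splits)]

def map_split(split):
    """Mapper: emit one (word, doc_id) pair per distinct word in each document."""
    pairs = []
    for doc_id, text in split:
        for word in set(text.lower().split()):
            pairs.append((word, doc_id))
    return pairs

def shuffle(mapped, num_reducers):
    """Route intermediate pairs to reducers so each word lands on exactly one reducer."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in mapped:
        for word, doc_id in pairs:
            # Assumed partitioning rule: a stable checksum of the key.
            target = zlib.crc32(word.encode("utf-8")) % num_reducers
            partitions[target][word].append(doc_id)
    return partitions

def reduce_partition(partition):
    """Reducer: produce the sorted posting list for each word."""
    return {word: sorted(doc_ids) for word, doc_ids in partition.items()}

def build_inverted_index(documents, num_splits=2, num_reducers=2):
    splits = split_input(documents, num_splits)
    mapped = [map_split(s) for s in splits]          # parallel on a real cluster
    partitions = shuffle(mapped, num_reducers)
    index = {}
    for partition in partitions:                     # parallel on a real cluster
        index.update(reduce_partition(partition))
    return index

documents = {
    "doc1": "cloud storage scales",
    "doc2": "cloud computing",
    "doc3": "storage clusters",
}
index = build_inverted_index(documents)
print(index["cloud"], index["storage"])
```

Changing `num_splits` or `num_reducers` changes how the work is divided but not the resulting index, which is the property that lets the system scale the job without changes to user code.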

17.2.1 Google File System (GFS)

Google File System (GFS) is the storage infrastructure that supports the execution of distributed applications in Google's computing cloud. GFS was designed as a high-performance, scalable distributed file system for very large data files and data-intensive applications, providing fault tolerance while running on clusters of commodity hardware. GFS is oriented to very large files, which it divides and stores in fixed-size chunks of 64 MB by default; the chunks are managed by nodes in the cluster called chunkservers. Each GFS cluster consists of a single master node acting as a nameserver and multiple chunkservers, each of which is a commodity Linux-based machine (a node in the cluster) running a user-level server process. Chunks are stored as plain Linux files, which are extended only as needed and replicated on multiple nodes to provide high availability and improve performance.
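
The chunking arithmetic can be illustrated with a short sketch. It is not GFS code: the function names, the chunkserver names, the replica count of three, and the round-robin placement rule are assumptions made for the example, and the text above does not describe how the master actually chooses replica locations. Only the 64 MB default chunk size comes from the text.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the default chunk size named in the text

def chunk_ranges(file_size, chunk_size=CHUNK_SIZE):
    """Return (chunk_index, start_offset, length) for each chunk of a file."""
    ranges = []
    offset = 0
    index = 0
    while offset < file_size:
        # The last chunk holds only the remaining bytes.
        length = min(chunk_size, file_size - offset)
        ranges.append((index, offset, length))
        offset += length
        index += 1
    return ranges

def place_replicas(num_chunks, chunkservers, replicas=3):
    """Hypothetical round-robin placement: each chunk goes to `replicas` distinct servers."""
    if replicas > len(chunkservers):
        raise ValueError("need at least as many chunkservers as replicas")
    placement = {}
    for chunk in range(num_chunks):
        placement[chunk] = [
            chunkservers[(chunk + r) % len(chunkservers)] for r in range(replicas)
        ]
    return placement

file_size = 200 * 1024 * 1024  # a 200 MB file
chunks = chunk_ranges(file_size)
placement = place_replicas(len(chunks), ["cs0", "cs1", "cs2", "cs3"])

for index, offset, length in chunks:
    print(index, offset, length, placement[index])
```

A 200 MB file yields three full 64 MB chunks and a final 8 MB chunk, which is consistent with chunk files being extended only as needed rather than padded to the full chunk size.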
