• DataNodes are the slaves, with the following functions:
• Store blocks as files on the underlying OS's file system
• Serve blocks directly to clients
• Periodically report status and health to the NameNode
• Periodically check block integrity
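To make the DataNode duties above more concrete, the following is a minimal sketch, not the actual HDFS implementation. It assumes each block is kept as an ordinary file next to a small checksum file, and that a periodic report to the NameNode amounts to a mapping from block ID to health. The file naming (blk_<id> plus a .crc sidecar) and the function names store_block, verify_block and block_report are hypothetical, chosen only for illustration.

```python
import os
import zlib


def block_path(directory, block_id):
    # Hypothetical naming scheme: one ordinary OS file per block.
    return os.path.join(directory, f"blk_{block_id}")


def store_block(directory, block_id, data):
    """Write a block as a plain file and record its CRC32 in a sidecar file."""
    path = block_path(directory, block_id)
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".crc", "w") as f:
        f.write(str(zlib.crc32(data)))
    return path


def verify_block(directory, block_id):
    """Recompute the checksum and compare it with the stored value."""
    path = block_path(directory, block_id)
    with open(path, "rb") as f:
        data = f.read()
    with open(path + ".crc") as f:
        expected = int(f.read())
    return zlib.crc32(data) == expected


def block_report(directory):
    """Build the status a DataNode might send periodically: block ID -> healthy?"""
    report = {}
    for name in sorted(os.listdir(directory)):
        if name.startswith("blk_") and not name.endswith(".crc"):
            block_id = name[len("blk_"):]
            report[block_id] = verify_block(directory, block_id)
    return report
```

In this sketch, calling block_report on a timer would stand in for both the periodic integrity check and the periodic status message; a real DataNode separates these concerns and uses its own on-disk format.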
The Hadoop MapReduce functions and flow are depicted in the following figure. A
series of functions is executed within the MapReduce flow: Mapper | Combiner |
Partitioner | Shuffle and Sort | Reducer. A few of the functions in the flow can
be implicit (they have a default behavior if not coded).
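The flow described above can be imitated in a few lines of plain Python. This is only an in-memory sketch of the five stages using word count as the job; it does not use the Hadoop API. The function names, the run_job driver and the CRC32-based partitioner are assumptions made for illustration (Hadoop's own default partitioner hashes the key differently).

```python
import zlib
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def mapper(line):
    # Mapper: emit (word, 1) for every word in an input line.
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    # Combiner: pre-aggregate one mapper's output locally to cut shuffle volume.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    # Partitioner: choose the reducer for a key (stable hash, illustrative only).
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(pairs):
    # Shuffle and Sort: order by key and group all values that share a key.
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


def reducer(key, values):
    # Reducer: collapse the grouped values into the final count.
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # Each split plays the role of one map task's input.
    partitions = [[] for _ in range(num_reducers)]
    for split in splits:
        mapped = [pair for line in split for pair in mapper(line)]
        for key, value in combiner(mapped):
            partitions[partitioner(key, num_reducers)].append((key, value))
    result = {}
    for partition in partitions:
        for key, values in shuffle_and_sort(partition):
            word, total = reducer(key, values)
            result[word] = total
    return result
```

For example, run_job([["a b a"], ["b c"]]) counts the words across two splits. Dropping the combiner call and appending the mapper output directly would give the same totals, which illustrates how a stage can be left to a default behavior.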
Hadoop enables data scientists to create MapReduce jobs quickly and efficiently.
The screenshot below shows Greenplum Command Center with database and HD