Summary
This chapter has introduced the challenges and benefits of big data, presented a set of requirements for big data
systems, and explained how those requirements can be met with the tools discussed in the remaining chapters of this book.
The aim of this book is to explain how to build a big data processing system by using the Hadoop tool
set. Examples illustrate the functionality that each Hadoop tool provides. Starting with HDFS for storage,
followed by Nutch and Solr for data capture, each chapter covers a new area of functionality, providing a simple
overview of storage, processing, and scheduling. With these examples and the step-by-step approach, you can build
your knowledge of big data possibilities and grow your familiarity with these tools. By the end of Chapter 11, you will
have learned about most of the major functional areas of a big data system.
As you read through this book, consider how you might use the individual Hadoop components in your own
systems. You will also notice a trend toward easier methods of system management and development. For instance,
Chapter 2 starts with a manual installation of Hadoop, while Chapter 8 uses cluster managers. Chapter 4 shows
handcrafted code for MapReduce programming, but Chapter 10 introduces visual, object-based MapReduce task
development using Talend and Pentaho.
Now it's time to start, and we begin by looking at Hadoop itself. The next chapter introduces the Hadoop
application and its uses, and shows how to configure and use it.