Exercises
1. Research and document additional use cases and actual implementations
for Hadoop.
2. Compare and contrast Hadoop, Pig, Hive, and HBase. List strengths and
weaknesses of each tool set. Research and summarize three published use
cases for each tool set.

Exercises 3 through 5 require some programming background and a working
Hadoop environment. The text of the novel War and Peace can be downloaded
from http://onlinebooks.library.upenn.edu/ and used as the dataset for
these exercises. However, other datasets can easily be substituted.
Document all processing steps applied to the data.

3. Use MapReduce in Hadoop to perform a word count on the specified
dataset.
4. Use Pig to perform a word count on the specified dataset.
5. Use Hive to perform a word count on the specified dataset.
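
Before starting Exercises 3 through 5, it may help to have a small reference result to check the Hadoop, Pig, and Hive outputs against. The following is a minimal local sketch in plain Python of the map, shuffle, and reduce steps behind a word count. It does not use any Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) and the tokenization rule (lowercase runs of the letters a to z) are assumptions made for illustration, so counts will differ if your own processing steps tokenize differently.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    # Assumed tokenization: lowercase the line, keep runs of letters a-z.
    for word in re.findall(r"[a-z]+", line.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the list of 1s collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["War and Peace", "war and peace and war"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

Passing an open file object for the downloaded text to word_count in place of the sample list gives a baseline to compare against the results of the three exercises, provided the same tokenization is documented and applied in each tool.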