Figure 9.4 Viewing job summary stats
The third link is a shortcut that opens the Hadoop command-line console
at the Hadoop command prompt. From this console, you can build and submit
MapReduce jobs, issue Hadoop File System (FS) commands, and administer
the Hadoop cluster. Figure 9.5 shows the console being used to list the
files in a directory.
Figure 9.5 Using the Hadoop command-line console
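As a rough illustration of the directory listing described for Figure 9.5, the sketch below wraps the standard `hadoop fs -ls` command in a small helper. The path `/user` and the names `HDFS_DIR` and `list_hdfs_dir` are assumptions made for this example and do not come from the figure. The guard is only there so the script still runs on a machine without Hadoop installed.

```shell
#!/bin/sh
# Hypothetical HDFS directory; substitute one that exists on your cluster.
HDFS_DIR="${HDFS_DIR:-/user}"

list_hdfs_dir() {
    # `hadoop fs -ls <path>` lists the entries in an HDFS directory.
    if command -v hadoop >/dev/null 2>&1; then
        hadoop fs -ls "$1"
    else
        # Fallback for machines without Hadoop: show the command instead.
        echo "hadoop not found on PATH; would run: hadoop fs -ls $1"
    fi
}

list_hdfs_dir "$HDFS_DIR"
```

If your console is not a POSIX shell, only the `hadoop fs -ls` line carries over; typing it directly at the Hadoop command prompt should give the same listing.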
With the environment installed and set up, you are ready to implement an
ETL process using Pig. In addition, you will use user-defined functions
(UDFs) exposed by PiggyBank and DataFu for advanced processing.
This activity consists of four basic steps:
1. Loading the data.
2. Running Pig interactively with Grunt.
3. Using PiggyBank to extract time periods.
4. Using DataFu to implement some advanced statistical analysis.
 
 
 