Clicking the Launch button executes the job and produces the basic-level log output that is shown in Figure 10-23.
Figure 10-23. Results of job run
I have also monitored this job via my Hadoop Resource Manager interface at the URL http://hc2nn.semtech-solutions.co.nz:8088/cluster/apps. This URL allows me to watch the job's progress until it is finished and monitor log files, if necessary. Since the job has finished, there must be an existing part file under the results directory that contains the results data. To see that output, I run this command from the Linux hadoop account:
[hadoop@hc2nn ~]$ hdfs dfs -cat /data/pentaho/result/part-00000 | head -10
ACURA-1.6 EL 2
ACURA-1.6EL 6
ACURA-1.7EL 12
ACURA-2.2CL 2
ACURA-2.3 CL 2
ACURA-2.3CL 2
ACURA-2.5TL 3
ACURA-3.0 CL 1
ACURA-3.0CL 2
ACURA-3.2 TL 1
I use the Hadoop file system cat command to dump the contents of the HDFS-based results part file, and then
the Linux head command to limit the output to the first 10 rows. What I see, then, is a summed list of vehicle makes
and models.
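Should you want to inspect the output location itself, a simple Hadoop file system listing shows the part files the job created. The commands below are only a sketch; they assume the same /data/pentaho/result output path used above, so adjust it to your own location:
[hadoop@hc2nn ~]$ hdfs dfs -ls /data/pentaho/result
[hadoop@hc2nn ~]$ # each reducer writes its own part file, so cat them all to count the full result set
[hadoop@hc2nn ~]$ hdfs dfs -cat /data/pentaho/result/part-* | wc -l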
PDI's visual interface makes it possible for even inexperienced Hadoop users to create and schedule Map Reduce
jobs. You don't need to know Map Reduce programming and can work on client development machines. Simply by
selecting graphical functional icons, plugging them together, and configuring them, you can create complex ETL chains.
Potential Errors
Nothing in life goes perfectly, so let's address some errors you may encounter during job creation.
For instance, while working on the example just given, I tried to connect PDI to MySQL and discovered that the MySQL connector JAR file had not been installed in the PDI library directory. I received the following error message:
Driver class 'org.gjt.mm.mysql.Driver' could not be found, make sure the 'MySQL' driver (jar file)
is installed.
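The fix is to place a copy of the MySQL Connector/J JAR into PDI's library directory and restart the PDI client. The commands below are a sketch only; the connector version and the PDI installation path are assumptions, so substitute the values that match your own system:
[hadoop@hc2nn ~]$ # assumed JAR version and PDI install path; adjust both to your environment
[hadoop@hc2nn ~]$ cp mysql-connector-java-5.1.34-bin.jar /home/hadoop/data-integration/lib/
[hadoop@hc2nn ~]$ # restart Spoon (the PDI client) so that the new driver JAR is picked up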