Performing an Export
In Sqoop, an import refers to the movement of data from a database system into HDFS. By contrast, an export uses HDFS as the source of data and a remote database as the destination. In the previous sections, we imported some data and then performed some analysis using Hive. We can export the results of this analysis to a database for consumption by other tools.
Before exporting a table from HDFS to a database, we must prepare the database to receive the data by creating the target table. Although Sqoop can infer which Java types are appropriate to hold SQL data types, this translation does not work in both directions (for example, there are several possible SQL column definitions that can hold data in a Java String; this could be CHAR(64), VARCHAR(200), or something else entirely). Consequently, you must determine which types are most appropriate.
We are going to export the zip_profits table from Hive. We need to create a table in
MySQL that has target columns in the same order, with the appropriate SQL types:
% mysql hadoopguide
mysql> CREATE TABLE sales_by_zip (volume DECIMAL(8,2), zip INTEGER);
Query OK, 0 rows affected (0.01 sec)
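If you don't recall the Hive table's schema, you can list its column names and types and mirror them on the MySQL side (a quick check; the exact types reported depend on how zip_profits was created):
hive> DESCRIBE zip_profits;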
Then we run the export command:
% sqoop export --connect jdbc:mysql://localhost/hadoopguide -m 1 \
> --table sales_by_zip --export-dir /user/hive/warehouse/zip_profits \
> --input-fields-terminated-by '\0001'
...
14/10/29 12:05:08 INFO mapreduce.ExportJobBase: Transferred 176 bytes in 13.5373
seconds (13.0011 bytes/sec)
14/10/29 12:05:08 INFO mapreduce.ExportJobBase: Exported 3 records.
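The --input-fields-terminated-by argument tells Sqoop how the source files are delimited. Hive writes its tables with Ctrl-A (ASCII character 1) as the default field separator, which is expressed here as the octal escape '\0001'. If you are unsure which delimiter a table's files use, one way to check is to print them with nonprinting characters made visible (a quick sanity check, assuming the default warehouse path shown above):
% hadoop fs -cat /user/hive/warehouse/zip_profits/* | cat -v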
Finally, we can verify that the export worked by checking MySQL:
% mysql hadoopguide -e 'SELECT * FROM sales_by_zip'
+--------+-------+
| volume | zip |
+--------+-------+
| 28.00 | 10005 |
| 403.71 | 90210 |
| 20.00 | 95014 |
+--------+-------+