Sqoop can export records stored in SequenceFiles to an output table too, although some restrictions apply. A SequenceFile cannot contain arbitrary record types. Sqoop's export tool will read objects from SequenceFiles and send them directly to the OutputCollector, which passes the objects to the database export OutputFormat. To work with Sqoop, the record must be stored in the “value” portion of the SequenceFile's key-value pair format and must subclass the org.apache.sqoop.lib.SqoopRecord abstract class (as is done by all classes generated by Sqoop).
If you use the codegen tool (sqoop-codegen) to generate a SqoopRecord implementation for a record based on your export target table, you can write a MapReduce program that populates instances of this class and writes them to SequenceFiles. sqoop-export can then export these SequenceFiles to the table. Data can also end up as SqoopRecord instances in SequenceFiles when it is imported from a database table to HDFS, modified in some fashion, and the results are then stored in SequenceFiles holding records of the same data type.
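As a rough illustration of the first route, the sketch below assembles a sqoop codegen invocation that would generate a SqoopRecord class for a table. It assumes the hadoopguide database, the widgets table, and the WidgetHolder class name used elsewhere in this section; the variable names are illustrative only, and the command is printed rather than run so that it can be checked first.

```shell
#!/bin/sh
# Sketch only: build a "sqoop codegen" command line (the same tool as
# sqoop-codegen). The values below are assumptions borrowed from this
# section's example; substitute your own connection string and table.
CONNECT="jdbc:mysql://localhost/hadoopguide"
TABLE="widgets"              # a table with the same columns as the export target
CLASS_NAME="WidgetHolder"    # name of the generated SqoopRecord class

# --bindir . asks Sqoop to place the compiled class and JAR in the
# current directory, as in the import example in this section.
CODEGEN_CMD="sqoop codegen --connect $CONNECT --table $TABLE --class-name $CLASS_NAME --bindir ."

# Print the command for inspection instead of running it.
echo "$CODEGEN_CMD"

# To run it for real (requires Sqoop and a reachable database), uncomment:
# $CODEGEN_CMD
```

A MapReduce job can then put the generated class on its classpath, fill in instances, and write them as the values of a SequenceFile.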
In this case, Sqoop should reuse the existing class definition to read data from SequenceFiles, rather than generating a new (temporary) record container class to perform the export, as is done when converting text-based records to database rows. You can suppress code generation and instead use an existing record class and JAR by providing the --class-name and --jar-file arguments to Sqoop. Sqoop will use the specified class, loaded from the specified JAR, when exporting records.
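Before passing --jar-file, it can help to confirm that the JAR really contains the class named by --class-name. The helper below is a minimal sketch for that check; has_class is a hypothetical function name, not part of Sqoop, and it only inspects a JAR listing supplied on standard input (for example, the output of jar tf).

```shell
#!/bin/sh
# has_class CLASS
# Hypothetical helper: reads a JAR listing on stdin, one entry per line,
# and succeeds if the listing contains the entry for CLASS.
# Dots in a package-qualified name are mapped to slashes, so
# com.example.WidgetHolder matches com/example/WidgetHolder.class.
has_class() {
  entry="$(printf '%s' "$1" | tr . /).class"
  # -F: fixed string, -x: match the whole line, -q: no output
  grep -Fqx "$entry"
}
```

Assuming the JDK's jar tool is on the path, it could be used as `jar tf WidgetHolder.jar | has_class WidgetHolder && echo "class found"`.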
In the following example, we reimport the widgets table as SequenceFiles, and then export it back to the database in a different table:
% sqoop import --connect jdbc:mysql://localhost/hadoopguide \
>   --table widgets -m 1 --class-name WidgetHolder --as-sequencefile \
>   --target-dir widget_sequence_files --bindir .
...
14/10/29 12:25:03 INFO mapreduce.ImportJobBase: Retrieved 3 records.
% mysql hadoopguide
mysql> CREATE TABLE widgets2(id INT, widget_name VARCHAR(100),
    -> price DOUBLE, designed DATE, version INT, notes VARCHAR(200));
Query OK, 0 rows affected (0.03 sec)
mysql> exit;
% sqoop export --connect jdbc:mysql://localhost/hadoopguide \
>   --table widgets2 -m 1 --class-name WidgetHolder \
>   --jar-file WidgetHolder.jar --export-dir widget_sequence_files