The following example will transfer only those rows whose value in column id is greater than 1:
sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table visits \
  --incremental append \
  --check-column id \
  --last-value 1
Discussion
Incremental import in append mode allows you to transfer only the newly created rows. This saves a considerable amount of resources compared with doing a full import every time you need the data to be in sync. One downside is that you need to know the value of the last imported row so that next time Sqoop can start off where it ended. When running in incremental mode, Sqoop always prints out the value of the last imported row, which allows you to easily pick up where you left off. The following is sample output printed when doing an incremental import in append mode:
13/03/18 08:16:36 INFO tool.ImportTool: Incremental import complete! ...
13/03/18 08:16:36 INFO tool.ImportTool: --incremental append
13/03/18 08:16:36 INFO tool.ImportTool: --check-column id
13/03/18 08:16:36 INFO tool.ImportTool: --last-value 2
Rows that were already imported in previous runs and have since changed won't be transferred again. Append mode is therefore meant for tables whose existing rows are never updated.
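Because the next run needs the value that the previous run printed, it can help to extract that value from the log automatically. The following is a minimal sketch, assuming the Sqoop log has been captured as text and contains a line in the format shown in the sample output above. The function name extract_last_value and the variable sample_log are hypothetical and are not part of Sqoop.

```shell
# Hypothetical helper (not part of Sqoop): read Sqoop log text on stdin and
# print whatever follows "--last-value " on the last line that contains it.
extract_last_value() {
  sed -n 's/.*--last-value \(.*\)$/\1/p' | tail -n 1
}

# Sample log lines in the format of the output shown above.
sample_log='13/03/18 08:16:36 INFO tool.ImportTool: --incremental append
13/03/18 08:16:36 INFO tool.ImportTool: --check-column id
13/03/18 08:16:36 INFO tool.ImportTool: --last-value 2'

# Feed the captured log to the helper and keep the result for the next run.
last_value=$(printf '%s\n' "$sample_log" | extract_last_value)
echo "$last_value"
```

The extracted value could then be passed as the --last-value argument of the next incremental import. How the log is captured in the first place (for example, by redirecting the command's output to a file) depends on your environment and is left as an assumption here.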
3.2. Incrementally Importing Mutable Data
Problem
While you would like to use the incremental import feature, the data in your table is also being updated, ruling out use of the append mode.
Solution
Use the lastmodified mode instead of the append mode. For example, use the following command to transfer rows whose value in column last_update_date is greater than 2013-05-22 01:01:01:
sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \