Note the absence of whitespace with the --columns parameter.
Discussion
By default, Sqoop assumes that your HDFS data contains the same number and ordering
of columns as the table you're exporting into. The parameter --columns is used to specify
either a reordering of columns or that only a subset of table columns is available in the
input files. The parameter accepts a comma-separated list of column names and can be
particularly helpful if you're exporting data to different tables or your table has changed
between the import and export operations.
There is a limitation to keep in mind when using the --columns parameter while
exporting only to a subset of table columns. As Sqoop uses INSERT statements to transfer
data from Hadoop, the database must allow inserting new rows with only specified
columns.
Columns that are not being exported must either allow NULL values or
contain a default value that your DB engine could use.
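As a rough illustration of the subset case described above, the following sketch assembles an export command that loads only two columns. The column names country and city are assumptions made for illustration, since the table layout is not shown here, and the variable names are hypothetical. The connection settings are the same example values used by the other export commands in this chapter.

```shell
# Hypothetical column subset; "country" and "city" are assumed names.
# The list is comma-separated and must not contain whitespace.
EXPORT_COLUMNS="country,city"

# Assemble the command as a string so it can be inspected before running.
SQOOP_CMD="sqoop export --connect jdbc:mysql://mysql.example.com/sqoop --username sqoop --password sqoop --table cities --columns $EXPORT_COLUMNS"

# Print the command; run it yourself once the column list looks right.
echo "$SQOOP_CMD"
```

Any table column left out of the list (for example, an id column) would then need to accept NULL or have a default value on the database side.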
5.8. Encoding the NULL Value Differently
Problem
Your Hadoop processing uses custom string constants to encode missing values, and
you need Sqoop to properly use them rather than insisting on the default null.
Solution
You can override the NULL substitution characters by setting the --input-null-string
and --input-null-non-string parameters to any value. For example, use the
following command to override them to \N:
sqoop export \
--connect jdbc:mysql://mysql.example.com/sqoop \
--username sqoop \
--password sqoop \
--table cities \
--input-null-string '\\N' \
--input-null-non-string '\\N'
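To make the input side concrete, here is a minimal sketch of what export data using \N as the missing-value marker might look like. The file name and the id,country,city layout are assumptions for illustration only; the real input would live in HDFS rather than in a local file.

```shell
# Hypothetical sample input; the id,country,city layout is an assumption.
SAMPLE=cities-sample.txt

# %s prints each argument literally, so \N stays a backslash followed by N.
printf '%s\n' '1,USA,Palo Alto' '2,\N,Brno' '3,USA,\N' > "$SAMPLE"

# Count the lines that contain at least one \N marker.
NULL_LINES=$(grep -c '\\N' "$SAMPLE")
echo "lines with a NULL marker: $NULL_LINES"
```

With the parameters shown in the command above, each \N field in such a file would be expected to be written to the database as NULL rather than as a literal two-character string.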