Databases Reference
In-Depth Information
Solution
This issue is quite often seen when the data contains characters that are used as Hive's
delimiters. You can instruct Sqoop to automatically clean your data using --hive-drop-
import-delims , which will remove all \n , \t , and \01 characters from all string-based
columns:
sqoop import \
--connect jdbc:mysql://mysql.example.com/sqoop \
--username sqoop \
--password sqoop \
--table cities \
--hive-import \
--hive-drop-import-delims
If removing the special characters is not an option in your use case, you can take ad‐
vantage of the parameter --hive-delims-replacement , which will accept a replace‐
ment string. Instead of removing separators completely, they will be replaced with a
specified string. The following example will replace all \n , \t , and \01 characters with
the string SPECIAL :
sqoop import \
--connect jdbc:mysql://mysql.example.com/sqoop \
--username sqoop \
--password sqoop \
--table cities \
--hive-import \
--hive-delims-replacement "SPECIAL"
Discussion
Sqoop will, by default, import data into comma-separated text files where each line
represents one row. However, if your data contains the new-line character ( \n ), such a
row will create two separate lines that will consequently be processed as two separate
rows by Hive. Consequently, Hive will show a higher row count than your source table.
Other default parameters like \t and \01 might also cause parsing issues; however, the
new-line character is the most common issue. You can instruct Sqoop to clean up your
data either with --hive-drop-import-delims or --hive-delims-replacement
parameters.
Even though both parameters contain hive in their names, they are not restricted to
working in tandem with the --hive-import parameter. They can be used in any import
job using text files to ensure that the output files have one line per imported row. Also,
as they target the default delimiters, using them with custom delimiters is not recom‐
mended, as they will always remove or substitute only the default delimiters.
Search WWH ::




Custom Search