Make sure that you specify the -m option (as shown above) to perform a sequential import; otherwise, you will encounter
an error like the following. If such an error occurs, correct your sqoop command and try again.
14/07/19 20:29:46 ERROR tool.ImportTool: Error during import: No primary key could be
found for table rawdata. Please specify one with --split-by or perform a sequential
import with '-m 1'.
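As the message says, there are two ways around a missing primary key: force a single-mapper sequential import with -m 1, or name a column Sqoop can use to split the work with --split-by. A sketch of both forms, reusing the option file and table from this chapter (the split column "id" is an assumption and must be replaced with a real column in your table):

```shell
# Sequential import with a single mapper (no primary key required)
sqoop --options-file ./import.txt --table sqoop.rawdata -m 1

# Or keep parallel mappers by naming a column for Sqoop to split on
# (the column name "id" is an assumption; use a real column)
sqoop --options-file ./import.txt --table sqoop.rawdata --split-by id
```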
Another common error you might see is this one:
14/07/19 20:31:19 INFO mapreduce.Job: Task Id : attempt_1405724116293_0001_m_000000_0,
Status: FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.
CommunicationsException: Communications link failure
This error may mean that MySQL access is not working. Check that you can log in to MySQL on each node and that the
database on the test node (in this case, hc1nn) can be accessed as was tested earlier, in the section “Check the Database.”
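One quick check is to attempt a remote MySQL login from each Hadoop node to the database server; the user name, password prompt, and database name below are illustrative and should match whatever was created in the earlier "Check the Database" section:

```shell
# From each data node, confirm MySQL on hc1nn accepts remote logins
# (user "sqoop" and database "sqoop" are assumptions from the earlier setup)
mysql -h hc1nn -u sqoop -p sqoop -e "SELECT COUNT(*) FROM rawdata;"
```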
By default, the Sqoop import will attempt to write the data to the directory /user/hadoop/rawdata on HDFS.
Before running the import command, though, make sure that the directory does not exist. You can remove it by using the
HDFS file system -rm option with the -r recursive switch:
[hadoop@hc1nn sqoop]$ hdfs dfs -rm -r /user/hadoop/rawdata
Moved: 'hdfs://hc1nn/user/hadoop/rawdata' to trash at: hdfs://hc1nn/user/hadoop/.Trash/Current
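If you want to avoid an error from -rm when the directory is not there, one approach is to guard the delete with hdfs dfs -test, which succeeds only if the path exists as a directory:

```shell
# Remove the Sqoop target directory only if it already exists
hdfs dfs -test -d /user/hadoop/rawdata && hdfs dfs -rm -r /user/hadoop/rawdata
```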
If the directory already exists, you will see an error like this:
14/07/20 11:33:51 ERROR tool.ImportTool: Encountered IOException running import job:
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
hdfs://hc1nn/user/hadoop/rawdata already exists
So, to run the Sqoop import job, you use the following command:
[hadoop@hc1nn sqoop]$ sqoop --options-file ./import.txt --table sqoop.rawdata -m 1
The output will then look like this:
Please set $HCAT_HOME to the root of your HCatalog installation.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
14/07/20 11:35:28 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.7.0
14/07/20 11:35:28 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/07/20 11:35:28 INFO tool.CodeGenTool: Beginning code generation
14/07/20 11:35:29 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `rawdata`
AS t LIMIT 1
14/07/20 11:35:29 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `rawdata`
AS t LIMIT 1
14/07/20 11:35:29 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
14/07/20 11:35:31 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/
647e8646a006f6e95b0582fca9ccf4ca/rawdata.jar
14/07/20 11:35:31 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/07/20 11:35:31 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/07/20 11:35:31 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/07/20 11:35:31 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
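After the job finishes, it is worth confirming that the imported files landed in HDFS. Sqoop map tasks typically write part-m-* files under the target directory, so a listing plus a peek at the first file is a reasonable sanity check (the part file name shown is the usual default, not guaranteed):

```shell
# List the import directory and view the first few imported rows
hdfs dfs -ls /user/hadoop/rawdata
hdfs dfs -cat /user/hadoop/rawdata/part-m-00000 | head -5
```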