The wget command downloads the tarred and compressed connector library file from the web address
http://dev.mysql.com/get/Downloads/Connector-J/. Once the file is downloaded, you unzip and untar it, and
then move it to the correct location so that Sqoop can use it.
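As a sketch of that download step (the exact URL path under Downloads/Connector-J is an assumption; the archive name matches the listing below), the command would look something like this:
[root@hc1nn ~]# wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.22.tar.gz
Once the download completes, a directory listing confirms that the archive is present: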
[root@hc1nn ~]# ls -l mysql-connector-java-5.1.22.tar.gz
-rw-r--r--. 1 root root 4028047 Sep 6 2012 mysql-connector-java-5.1.22.tar.gz
This command shows the downloaded connector library, while the next commands show the file being unzipped
with the gunzip command and unpacked with the tar command, using its extract (x) and file (f) options:
[root@hc1nn ~]# gunzip mysql-connector-java-5.1.22.tar.gz
[root@hc1nn ~]# tar xf mysql-connector-java-5.1.22.tar
[root@hc1nn ~]# ls -lrt
total 9604
drwxr-xr-x. 4 root root 4096 Sep 6 2012 mysql-connector-java-5.1.22
-rw-r--r--. 1 root root 9809920 Sep 6 2012 mysql-connector-java-5.1.22.tar
Now, you copy the connector library to the /usr/lib/sqoop/lib directory so that it is available to Sqoop when it
attempts to connect to a MySQL database:
[root@hc1nn ~]# cp mysql-connector-java-5.1.22/mysql-connector-java-5.1.22-bin.jar /usr/lib/sqoop/lib/
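To confirm that the driver is where Sqoop expects it, you can list the destination directory; the path below simply repeats the copy destination used above:
[root@hc1nn ~]# ls -l /usr/lib/sqoop/lib/mysql-connector-java-5.1.22-bin.jar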
For this example installation, I use the Linux hadoop account. In that user's $HOME/.bashrc Bash shell
configuration file, I have defined some Hadoop and MapReduce variables, as follows:
#######################################################
# Set up Sqoop variables
# For each user who will be submitting MapReduce jobs using MapReduce v2 (YARN), or running
# Pig, Hive, or Sqoop in a YARN installation, set HADOOP_MAPRED_HOME and the related
# Hadoop environment variables, as follows:
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HADOOP_COMMON_HOME=/usr/lib/hadoop
export HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
export YARN_HOME=/usr/lib/hadoop-yarn/
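To apply these settings to the current session and check that Sqoop picks them up, something like the following should work; the prompt assumes the hadoop account on the same server, and sqoop version simply reports the installed Sqoop release:
[hadoop@hc1nn ~]$ source ~/.bashrc
[hadoop@hc1nn ~]$ sqoop version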
Use Sqoop to Import Data to HDFS
To import data from a database, you use the Sqoop import statement. For my MySQL database example, I use an
options file containing the connection and access information. Because these details are held in a single file, this
method requires less typing each time the task is repeated. The file that will be used to write table data to HDFS
contains nine lines.
The import line tells Sqoop that data will be imported from the database to HDFS. The -- connect option with a
connect string of jdbc:mysql://hc1nn/sqoop tells Sqoop that JDBC will be used to connect to a MySQL database on
server hc1nn called “sqoop.” I use the Linux cat command to show the contents of the Sqoop options file.
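As a sketch of the format only (the username, password, and table name below are placeholders, not the values used in the example), a nine-line options file for this import might look like the following; Sqoop expects each option and each value on its own line:
import
--connect
jdbc:mysql://hc1nn/sqoop
--username
<mysql user>
--password
<mysql password>
--table
<table name>
Such a file is typically passed to Sqoop with a command along the lines of sqoop --options-file ./import.txt, which reads the options from the file as if they had been typed on the command line.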
 