Getting Sqoop
Sqoop is available in a few places. The primary home of the project is the Apache Software
Foundation. This repository contains all the Sqoop source code and documentation. Official
releases are available at this site, as well as the source code for the version currently under
development. The repository itself contains instructions for compiling the project.
Alternatively, you can get Sqoop from a Hadoop vendor distribution.
If you download a release from Apache, it will be placed in a directory such as
/home/yourname/sqoop-x.y.z/. We'll call this directory $SQOOP_HOME. You can run
Sqoop by running the executable script $SQOOP_HOME/bin/sqoop.
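As a minimal sketch of that setup, the lines below define $SQOOP_HOME and put its bin directory on the PATH so the script can be started by name. The location under your home directory is an assumption made for illustration, and x.y.z is left as the same placeholder used above; substitute the version you actually unpacked.

```shell
# Assumption: the release was unpacked under your home directory.
# x.y.z is a placeholder for the real version number, not a literal name.
export SQOOP_HOME="$HOME/sqoop-x.y.z"

# Put Sqoop's scripts first on the PATH so that typing "sqoop" resolves
# to $SQOOP_HOME/bin/sqoop.
export PATH="$SQOOP_HOME/bin:$PATH"
```

Adding these lines to your shell startup file would make the setting persist across sessions.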
If you've installed a release from a vendor, the package will have placed Sqoop's scripts in
a standard location such as /usr/bin/sqoop. You can run Sqoop by simply typing sqoop at
the command line. (Regardless of how you install Sqoop, we'll refer to this script as just
sqoop from here on.)
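If you are unsure which of the two installation styles applies on a given machine, a small helper can report where the script lives. The function below is a hypothetical convenience and not part of Sqoop; its name, find_sqoop, is made up for this sketch. It checks the PATH first (the vendor-package case) and then falls back to $SQOOP_HOME/bin/sqoop (the Apache download case).

```shell
# Hypothetical helper, not shipped with Sqoop: print the path of the
# sqoop script, or fail with a message if none can be found.
find_sqoop() {
  if command -v sqoop >/dev/null 2>&1; then
    # A vendor package typically puts sqoop on the PATH already.
    command -v sqoop
  elif [ -n "${SQOOP_HOME:-}" ] && [ -x "$SQOOP_HOME/bin/sqoop" ]; then
    # A release unpacked by hand lives under $SQOOP_HOME.
    printf '%s\n' "$SQOOP_HOME/bin/sqoop"
  else
    echo "sqoop not found: install a package or set SQOOP_HOME" >&2
    return 1
  fi
}
```

Calling find_sqoop prints a single path on success, so something like "$(find_sqoop)" help would run the help tool with whichever copy was found.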
SQOOP 2
Sqoop 2 is a rewrite of Sqoop that addresses the architectural limitations of Sqoop 1. For example, Sqoop
1 is a command-line tool and does not provide a Java API, so it's difficult to embed it in other programs.
Also, in Sqoop 1 every connector has to know about every output format, so it is a lot of work to write
new connectors. Sqoop 2 has a server component that runs jobs, as well as a range of clients: a command-
line interface (CLI), a web UI, a REST API, and a Java API. Sqoop 2 also will be able to use alternative
execution engines, such as Spark. Note that Sqoop 2's CLI is not compatible with Sqoop 1's CLI.
The Sqoop 1 release series is the current stable release series, and is what is used in this chapter. Sqoop 2
is under active development but does not yet have feature parity with Sqoop 1, so you should check that it
can support your use case before using it in production.
Running Sqoop with no arguments does not do much of interest:
% sqoop
Try sqoop help for usage.
Sqoop is organized as a set of tools or commands. If you don't select a tool, Sqoop does not
know what to do. help is the name of one such tool; it can print out the list of available
tools, like this:
% sqoop help
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive