Database Reference
The first argument of the command, --connect, determines what type of
driver you will use for connecting to the relational database. In this case, the
command is specifying that Sqoop will use the SQL Server JDBC driver to
connect to the database.
NOTE
When specifying the connection to the database, you should use the
server name or IP address. Do not use localhost, because this
connection string will be sent to all the cluster nodes involved in the
job, and they will attempt to make their own connections. Because
localhost refers to the local computer, each node will attempt to
connect to the database as if it exists on that node, which will likely fail.
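As a minimal sketch of the point in the note, the fragment below assembles a SQL Server JDBC connection string from a server name that every node can resolve. The host, port, and database name are hypothetical placeholders, and the loopback check is only an illustrative safeguard, not something Sqoop itself performs.

```shell
# Assumed values for illustration only; replace with your own.
DB_HOST="sqlserver01.example.com"   # a name every cluster node can resolve
DB_PORT=1433                        # default SQL Server TCP port
DB_NAME="AdventureWorks"            # hypothetical database name

# Reject loopback addresses, because each node would resolve them to itself.
case "$DB_HOST" in
  localhost|127.*|::1)
    echo "error: use a server name or IP address reachable from all nodes" >&2
    exit 1
    ;;
esac

# SQL Server JDBC URL: jdbc:sqlserver://host:port;databaseName=name
CONNECT_STRING="jdbc:sqlserver://${DB_HOST}:${DB_PORT};databaseName=${DB_NAME}"
echo "$CONNECT_STRING"
```

The resulting value is what would be passed, in double quotes, to the --connect argument.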
You may notice that the --connect argument contains the full connection
string for the database. Ideally, you will use Windows Authentication in the
connection string so that the password doesn't have to be specified. You can
also use the --password-file argument to tell Sqoop to use a file that
stores the password, instead of entering it as part of the command.
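One way to prepare such a password file is sketched below. The password, login name, server, database, and table are all hypothetical, and the file:// prefix assumes the file lives on the local file system rather than in HDFS. The import runs only if sqoop is found on the PATH.

```shell
# Hypothetical password, login, and paths, for illustration only.
PWD_DIR=$(mktemp -d)
PWD_FILE="$PWD_DIR/sqoop.pwd"

# printf writes the password with no trailing newline; Sqoop uses the
# entire file content as the password, including trailing whitespace.
printf '%s' 'example-secret' > "$PWD_FILE"

# Restrict the file to read-only access for its owner.
chmod 400 "$PWD_FILE"

if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect "jdbc:sqlserver://sqlserver01.example.com:1433;databaseName=AdventureWorks" \
    --username sqoop_user \
    --password-file "file://$PWD_FILE" \
    --table Customers
else
  echo "sqoop not found on PATH; skipping import" >&2
fi
```

Keeping the password in a file with restricted permissions keeps it out of the shell history and the process list, which is the main reason to prefer this over typing it on the command line.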
The --table argument tells Sqoop which table you intend to import from
the specified database. This is the table that Sqoop will derive its metadata
from. By default, all columns within the table are imported. You can limit
the column list by using the --columns argument:
--columns "FirstName,LastName,City,State,PostalCode"
You can also filter the rows returned by Sqoop by using the --where
argument, which enables you to specify a where clause for the query:
--where "State='FL'"
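Put together, the --table, --columns, and --where arguments might be combined as in the sketch below. The server, database, and table names are hypothetical, and the example assumes the table has a primary key that Sqoop can use to split the work. The import runs only if sqoop is found on the PATH.

```shell
# Hypothetical connection details and table name, for illustration only.
CONNECT="jdbc:sqlserver://sqlserver01.example.com:1433;databaseName=AdventureWorks"
TABLE="Customers"
COLUMN_LIST="FirstName,LastName,City,State,PostalCode"
ROW_FILTER="State='FL'"

if command -v sqoop >/dev/null 2>&1; then
  # Double quotes keep the semicolon in the URL and the single quotes
  # in the filter from being interpreted by the shell.
  sqoop import \
    --connect "$CONNECT" \
    --table "$TABLE" \
    --columns "$COLUMN_LIST" \
    --where "$ROW_FILTER"
else
  echo "sqoop not found on PATH; skipping import" >&2
fi
```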
If you need to execute a more complex query, you can replace the --table,
--columns, and --where arguments with a --query argument. This lets
you specify an arbitrary SELECT statement, but some constraints apply.
The SELECT statement must be relatively straightforward; nested tables
and common table expressions can cause problems. Because Sqoop needs