Databases Reference
In-Depth Information
ALTER USER sqoop SET search_path = public , us ;
7.3. Exporting into PostgreSQL Using pg_bulkload
Problem
You are using the pg_bulkload utility to load data to your PostgreSQL server. Since
Sqoop utilizes mysqlimport for MySQL, can Sqoop also utilize pg_bulkload for
PostgreSQL?
Solution
Sqoop offers a specialized connector for PostgreSQL that takes advantage of the
pg_bulkload utility. You can use the following Sqoop command to make use of this
connector:
sqoop import \
--connect jdbc:postgresql://postgresql.example.com/database \
--username sqoop \
--password sqoop \
--connection-manager org.apache.sqoop.manager.PGBulkloadManager \
--table cities
Discussion
pg_bulkload is a third-party utility not distributed with PostgreSQL. You need to man‐
ually download and install it. It allows a user to load data into a PostgreSQL server at a
high speed by bypassing the write-ahead log and shared buffers. Using the pg_bulk
load utility with Sqoop is very simple, as Sqoop has built-in support for it.
As with other direct connectors, you need to have the pg_bulkload utility available on
all nodes in your Hadoop cluster because Sqoop's tasks can be executed on any Task‐
Tracker node. You can specify the path to the utility with the pgbulkload.bin property.
For example, if you installed the utility in /usr/local/bin/pg_bulkload , you can use
the following Sqoop command:
sqoop import \
-Dpgbulkload.bin = /usr/local/bin/pg_bulkload \
--connect jdbc:postgresql://postgresql.example.com/database \
--username sqoop \
--password sqoop \
--connection-manager org.apache.sqoop.manager.PGBulkloadManager \
--table cities
See Also
More information about mysqlimport for MySQL is in Recipe 2.7 .
Search WWH ::




Custom Search