Databases Reference
In-Depth Information
additional drawback that you need to keep in mind when using large numbers of rows
inserted with a single insert statement: most databases have limits on the maximum
query size. The Sqoop export will fail if the remote database server does not accept the
generated query.
The third batching mechanism does not try to achieve better performance by putting
multiple rows together as the previous two options did. The value specified in sqoop.ex
port.statements.per.transaction determines how many insert statements will be
issued on the database prior to committing the transaction and starting a new one.
Higher values of this property lead to longer-lived transactions and remove the overhead
introduced by creating and finishing the transaction. Using higher values usually helps
to improve performance. However, the exact behavior depends on the underlying da‐
tabase and its functionality. If your database requires a special table-level write lock for
inserting rows into a table, using a higher value for statements per transaction might
lead to significantly decreased performance.
As each method uses a different means for improving the export performance, you can
combine all of them together. Each database system and user environment is different.
There aren't best practices that can be broadly applied across all use cases. Our recom‐
mendation is to start with enabling --batch import and specify the number of rows per
statement to roughly equal the maximum allowed query size. From that starting point,
experiment with different values.
5.3. Exporting with All-or-Nothing Semantics
Problem
You need to ensure that Sqoop will either export all data from Hadoop to your database
or export no data (i.e., the target table will remain empty).
Solution
You can use a staging table to first load data to a temporary table before making changes
to the real table. The staging table name is specified via the --staging-table parameter.
In the below example, we set it to staging_cities :
sqoop export \
--connect jdbc:mysql://mysql.example.com/sqoop \
--username sqoop \
--password sqoop \
--table cities \
--staging-table staging_cities
Search WWH ::




Custom Search