Exports: A Deeper Look
The way Sqoop performs exports is very similar in nature to how Sqoop performs imports (see
Figure 15-4). Before performing the export, Sqoop picks a strategy based on the database
connect string. For most systems, Sqoop uses JDBC. Sqoop then generates a Java class
based on the target table definition. This generated class can parse records from text files
and insert values of the appropriate types into a table (in addition to being able to read
the columns from a ResultSet). A MapReduce job is then launched that reads the source
datafiles from HDFS, parses the records using the generated class, and executes the chosen
export strategy.
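To make the role of the generated class more concrete, here is a minimal sketch of a record class that parses one line of delimited text and binds the typed values to a JDBC statement. It is not the code Sqoop actually generates: the class name WidgetRecord, its three columns, the comma delimiter, and the method names parse and bind are all assumptions chosen for illustration.

```java
import java.math.BigDecimal;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical stand-in for a generated record class (not real Sqoop output). */
final class WidgetRecord {
    final int id;
    final String name;
    final BigDecimal price;

    WidgetRecord(int id, String name, BigDecimal price) {
        this.id = id;
        this.name = name;
        this.price = price;
    }

    /** Parses one comma-delimited line, assuming the layout id,name,price. */
    static WidgetRecord parse(String line) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        return new WidgetRecord(
                Integer.parseInt(fields[0]),
                fields[1],
                new BigDecimal(fields[2]));
    }

    /** Binds the typed values to parameters 1..3 of an INSERT statement. */
    void bind(PreparedStatement stmt) throws SQLException {
        stmt.setInt(1, id);
        stmt.setString(2, name);
        stmt.setBigDecimal(3, price);
    }

    public static void main(String[] args) {
        WidgetRecord r = WidgetRecord.parse("1,sprocket,0.25");
        System.out.println(r.id + " " + r.name + " " + r.price);
    }
}
```

The point of the sketch is the division of labor: parsing turns text into typed fields once, and the typed setters let the JDBC driver handle conversion to the column types of the target table.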
The JDBC-based export strategy builds up batch INSERT statements that will each add
multiple records to the target table. Inserting many records per statement performs much
better than executing many single-row INSERT statements on most database systems.
Separate threads are used to read from HDFS and communicate with the database, to ensure
that I/O operations involving different systems are overlapped as much as possible.
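The two ideas in this paragraph, multi-row INSERT statements and overlapping the reading and writing sides, can be sketched as follows. This is an illustration under stated assumptions, not Sqoop's implementation: the class BatchExportSketch, its methods buildInsert and plan, the queue capacity, and the end-of-input sentinel are all hypothetical, and instead of talking to a database the sketch only returns the statements it would send. The reader thread stands in for the side that reads from HDFS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of batched inserts fed by a separate reader thread. */
final class BatchExportSketch {
    // Sentinel marking the end of input; compared by reference, never by content.
    private static final String[] END = new String[0];

    /** Builds one parameterized multi-row INSERT with the given number of rows. */
    static String buildInsert(String table, List<String> columns, int rows) {
        String tuple = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rows, tuple));
    }

    /** Returns the statements that would be sent for the given records. */
    static List<String> plan(List<String[]> records, String table,
                             List<String> columns, int batchSize) {
        BlockingQueue<String[]> queue = new ArrayBlockingQueue<>(16);

        // Reader side: hands records to the writer through a bounded queue.
        Thread reader = new Thread(() -> {
            try {
                for (String[] record : records) {
                    queue.put(record);
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();

        // Writer side: groups records into batches of at most batchSize rows.
        List<String> statements = new ArrayList<>();
        int pending = 0;
        try {
            for (String[] r = queue.take(); r != END; r = queue.take()) {
                pending++; // a real exporter would bind r's values here
                if (pending == batchSize) {
                    statements.add(buildInsert(table, columns, pending));
                    pending = 0;
                }
            }
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("export interrupted", e);
        }
        if (pending > 0) {
            statements.add(buildInsert(table, columns, pending)); // final partial batch
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[] {"1", "sprocket"},
                new String[] {"2", "gizmo"},
                new String[] {"3", "gadget"});
        plan(records, "widgets", Arrays.asList("id", "name"), 2)
                .forEach(System.out::println);
    }
}
```

The bounded queue is the design choice that matters here: it lets reading and writing proceed at the same time while keeping a fast reader from buffering an unbounded number of records ahead of a slower database.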