--table cities \
--map-column-java id=Long
Discussion
The parameter --map-column-java accepts a comma-separated list where each item is
a key-value pair separated by an equals sign. The exact column name is used as the key,
and the target Java type is specified as the value. For example, if you need to change
the mapping of three columns c1, c2, and c3 to Float, String, and String, respectively,
then your Sqoop command line would contain the following fragment:
sqoop import --map-column-java c1=Float,c2=String,c3=String ...
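To make the list format concrete, here is a minimal Java sketch that splits such a value into column-to-type pairs. It is an illustration only, not Sqoop's own parser; the class name ColumnMappingFormat and the method parse are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: this is NOT Sqoop's parser, just a sketch of the
// "column=Type,column=Type" format. All names here are hypothetical.
class ColumnMappingFormat {
    // Splits "c1=Float,c2=String" into ordered column -> Java type pairs.
    static Map<String, String> parse(String mapping) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String item : mapping.split(",")) {
            String[] pair = item.split("=");
            if (pair.length != 2) {
                throw new IllegalArgumentException("Expected column=Type, got: " + item);
            }
            result.put(pair[0], pair[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {c1=Float, c2=String, c3=String}
        System.out.println(parse("c1=Float,c2=String,c3=String"));
    }
}
```

Note that the sketch does no trimming: the value is written without spaces around the commas or equals signs, so that the shell passes it as a single argument.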
An example of where this parameter is handy is when your MySQL table has a primary
key column that is defined as unsigned int with values that are bigger than
2,147,483,647. In this particular scenario, MySQL reports that the column has type
integer, even though the real type is unsigned integer. The maximum value for an
unsigned integer column in MySQL is 4,294,967,295. Because the reported type is
integer, Sqoop will use Java's Integer object, which is not able to contain values
larger than 2,147,483,647. In this case, you have to manually provide hints to do
more appropriate type mapping.
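The range mismatch can be checked directly in Java. The following is a small sketch using only the standard library; the class name UnsignedIntRange and the helper fitsInInteger are hypothetical names introduced for illustration and are not part of Sqoop.

```java
// Illustration only: shows why the largest MySQL unsigned int value does not
// fit in Java's Integer but does fit in Long. All names are hypothetical.
class UnsignedIntRange {
    // Largest value of a MySQL unsigned int column (2^32 - 1).
    static final long UNSIGNED_INT_MAX = 4_294_967_295L;

    // True when the value can be stored in a 32-bit signed Java int.
    static boolean fitsInInteger(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);                // prints 2147483647
        System.out.println(fitsInInteger(UNSIGNED_INT_MAX));  // prints false
        System.out.println(Long.parseLong("4294967295"));     // prints 4294967295
        try {
            Integer.parseInt("4294967295");                   // out of range for int
        } catch (NumberFormatException e) {
            System.out.println("Integer cannot hold 4294967295");
        }
    }
}
```

Long covers the full unsigned int range, which is why mapping such a column to Long is a suitable override here.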
Use of this parameter is not limited to overcoming MySQL's unsigned types problem.
It also applies to many use cases where Sqoop's default type mapping is not a
good fit for your environment. Sqoop fetches all metadata from database structures
without touching the stored data, so any extra knowledge about the data itself must be
provided separately if you want to take advantage of it. For example, if you're using BLOB
or BINARY columns for storing textual data to avoid any encoding issues, you can use
the --map-column-java parameter to override the default mapping and import your
data as String.
2.9. Controlling Parallelism
Problem
Sqoop by default uses four concurrent map tasks to transfer data to Hadoop. Transferring
bigger tables with more concurrent tasks should decrease the time required to
transfer all data. You want the flexibility to change the number of map tasks used on a
per-job basis.