done, you can check the S3 bucket, and you will see the data from DynamoDB stored there in flat files.
Formatted data export
If you want to export only specific columns from the table, you can simply list them in the SELECT statement. For example, to export only the employee ID and salary, rewrite the INSERT statement as follows:
INSERT OVERWRITE DIRECTORY 's3://packt-pub-employee/employee/'
SELECT empid, salary
FROM packtPubEmployee;
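When only these two columns are exported, the external Hive table can be narrowed to match. A possible sketch, keeping the same DynamoDB table name and column mapping as before (the table name packtPubEmployeeSlim is only an illustration):

-- Sketch: external table exposing only the two exported columns.
-- Assumes the same DynamoDB table ("Employee") and attribute names as above.
CREATE EXTERNAL TABLE packtPubEmployeeSlim (empid String, salary bigint)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Employee",
"dynamodb.column.mapping" = "empid:empId,salary:salary");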
However, make sure you make the corresponding changes in the Hive table as well. You can also specify formatting between the columns when exporting data. Formatting generally helps when you need to export a table with specific delimiters. The following is an example where we export the same Employee table with tab-delimited columns:
CREATE EXTERNAL TABLE packtPubEmployee (empid String, yoj
String, department String, salary bigint, manager String)
STORED BY
'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Employee",
"dynamodb.column.mapping" =
"empid:empId,yoj:yoj,department:dept,salary:salary,manager:manager");
CREATE EXTERNAL TABLE packtPubEmployee_tab_formatted (empid
string, yoj string, department string, salary bigint, manager string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://packt-pub-employee/employee/';
INSERT OVERWRITE TABLE packtPubEmployee_tab_formatted
SELECT *
FROM packtPubEmployee;
Here, the only change we need to make is to create a staging table with the row format delimiter specified.
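The same pattern works for any delimiter. As a sketch, a comma-delimited export of the same Employee table might look as follows (the staging table name and the s3://packt-pub-employee/employee-csv/ location are assumptions for illustration):

-- Sketch: comma-delimited export to a separate S3 prefix.
CREATE EXTERNAL TABLE packtPubEmployee_csv_formatted (empid
string, yoj string, department string, salary bigint, manager string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://packt-pub-employee/employee-csv/';
INSERT OVERWRITE TABLE packtPubEmployee_csv_formatted
SELECT *
FROM packtPubEmployee;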