When the cluster status is WAITING, the master node is ready for connections.
By connecting to the master node over SSH, you can run CLI operations on it.
Step 4 - Setting up the Hive table to run Hive interactive commands
Hive is a data warehousing application that you can use to query data in
Amazon EMR clusters with the HiveQL language. To run Hive commands, follow
these steps:
1. On the Hadoop prompt at the master node, type hive.
2. You will see a Hive prompt as follows:
hive>
3. Use a Hive command to map a table in the Hive application to the data in
DynamoDB. That table acts as a reference to the data stored in DynamoDB; the
data itself is not stored locally in Hive. The following command shows an
example:
CREATE EXTERNAL TABLE hiveuchit (col1 string, col2 bigint, col3 array<string>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "dynamodbuchit",
"dynamodb.column.mapping" = "col1:name,col2:year,col3:holidays");
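The statement above follows a fixed pattern: the Hive column definitions, the storage handler class, and two table properties that name the DynamoDB table and pair each Hive column with a DynamoDB attribute. As a rough illustration of that pattern, the Python sketch below assembles such a statement from a list of (Hive column, Hive type, DynamoDB attribute) triples. The function build_create_table and its parameters are hypothetical helpers introduced here, not part of Hive or EMR; only the handler class and the property names are taken from the command above.

```python
# Storage handler class name, copied from the CREATE EXTERNAL TABLE command above.
STORAGE_HANDLER = "org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler"


def build_create_table(hive_table, dynamodb_table, columns):
    """Return a CREATE EXTERNAL TABLE statement for a DynamoDB-backed Hive table.

    columns is a list of (hive_column, hive_type, dynamodb_attribute) tuples.
    Hypothetical helper for illustration only; it does no escaping or validation
    of identifiers.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # Hive side: "col1 string, col2 bigint, ..."
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type, _ in columns)
    # Mapping side: "col1:name,col2:year,..." (Hive column : DynamoDB attribute)
    mapping = ",".join(f"{name}:{attr}" for name, _, attr in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({column_defs})\n"
        f"STORED BY '{STORAGE_HANDLER}'\n"
        f'TBLPROPERTIES ("dynamodb.table.name" = "{dynamodb_table}",\n'
        f'"dynamodb.column.mapping" = "{mapping}");'
    )


statement = build_create_table(
    "hiveuchit",
    "dynamodbuchit",
    [
        ("col1", "string", "name"),
        ("col2", "bigint", "year"),
        ("col3", "array<string>", "holidays"),
    ],
)
print(statement)
```

Printing the statement lets you paste it at the hive> prompt; each Hive column appears once in the column list and once in the mapping.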
While you run Hive queries against the DynamoDB table, make sure the table has
enough provisioned read capacity units, because the reads that Hive performs
consume that capacity.
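To get a feel for how much read capacity is "enough", the following Python sketch gives a back-of-the-envelope estimate of how long a full read of a table might take. It relies on the DynamoDB rule that one read capacity unit covers one strongly consistent read per second of up to 4 KB, or two eventually consistent reads. The function estimate_scan_seconds and the read_fraction parameter are assumptions made for this illustration; read_fraction models a job that is allowed to use only part of the provisioned capacity (the EMR connector has a dynamodb.throughput.read.percent setting for this purpose, and the 0.5 used here should be treated as an assumed value). Real jobs also pay for per-item rounding and retries, so treat the result as a lower bound.

```python
import math

# One read capacity unit covers a strongly consistent read of up to 4 KB.
BYTES_PER_READ_UNIT = 4 * 1024


def estimate_scan_seconds(table_bytes, read_capacity_units,
                          read_fraction=0.5, eventually_consistent=False):
    """Estimate the minimum seconds needed to read table_bytes from DynamoDB.

    Hypothetical helper: read_fraction is the assumed share of the provisioned
    read capacity that the Hive job may consume.
    """
    if table_bytes < 0:
        raise ValueError("table_bytes must not be negative")
    if read_capacity_units <= 0 or not 0 < read_fraction <= 1:
        raise ValueError("capacity must be positive and read_fraction in (0, 1]")
    # Total read units needed if the data is read in 4 KB chunks.
    units_needed = math.ceil(table_bytes / BYTES_PER_READ_UNIT)
    if eventually_consistent:
        # Eventually consistent reads cost half as much.
        units_needed = math.ceil(units_needed / 2)
    usable_units_per_second = read_capacity_units * read_fraction
    return units_needed / usable_units_per_second


# Example: a 1 GiB table with 100 provisioned read capacity units.
seconds = estimate_scan_seconds(1024 ** 3, 100)
print(f"Estimated full read: about {seconds / 60:.0f} minutes")
```

Under these assumptions, doubling the provisioned read capacity units halves the estimate, which is why raising read capacity before a large Hive query can be worthwhile.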
You have now completed the setup and configuration of EMR with DynamoDB. We
will now look at some advanced Hive commands for operations such as
exporting, importing, querying, and joining.