Integrating with AWS EMR
Hadoop is one of the most widely used extract, transform, and load (ETL) tools these
days. Most companies use it to derive more and more insight from the data available
to them. However, creating and maintaining a Hadoop cluster can be quite a
time-consuming job, especially when you don't have much exposure to Linux/Unix
environments. Also, if you need to run Hadoop in production, you would need to hire
a specialist Hadoop administrator, which is an overhead in terms of cost. To solve
this, AWS has introduced hosted Hadoop as a service, where you just need to provide
your requirements in terms of cluster configuration (the number of data nodes and
the instance sizes, based on the amount of data you want to process) and any
additional services, such as Hive and Pig, if required. Once done, with a single
click of a button, your Hadoop cluster is ready.
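As an illustration, the kind of cluster configuration just described can be expressed with the AWS CLI. This is only a sketch: the cluster name, release label, instance type and count, and key pair name below are placeholder assumptions, not values prescribed by this book.

```shell
# Launch a small EMR cluster with Hive and Pig installed.
# All names and sizes below are illustrative placeholders.
aws emr create-cluster \
    --name "demo-etl-cluster" \
    --release-label emr-5.36.0 \
    --applications Name=Hive Name=Pig \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --ec2-attributes KeyName=my-key-pair \
    --use-default-roles
```

The command returns the new cluster's ID (a `j-...` identifier), which you can pass to later commands to submit work to the cluster or to terminate it.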
You can find more details about how to launch an Amazon Elastic MapReduce (EMR)
cluster and how to work with it at
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is-emr.html.
In this section, we will cover the following topics:
• Exporting data from DynamoDB
• Querying and joining tables in DynamoDB using AWS EMR
• Importing data to DynamoDB
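As a preview of the first topic, exporting a DynamoDB table with EMR is typically done from Hive on the cluster's master node, using the DynamoDB storage handler that EMR ships with. The table name, column mapping, and S3 bucket below are illustrative assumptions for the sketch, not values from this book:

```shell
# Run on the EMR master node. The DynamoDB table, columns, and S3
# bucket are illustrative; adjust dynamodb.column.mapping to match
# your own table's attributes.
hive -e "
CREATE EXTERNAL TABLE ddb_orders (id STRING, total DOUBLE)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
  'dynamodb.table.name' = 'Orders',
  'dynamodb.column.mapping' = 'id:Id,total:Total'
);

-- Copy the DynamoDB table's contents to S3 as the export.
INSERT OVERWRITE DIRECTORY 's3://my-bucket/exports/orders/'
SELECT * FROM ddb_orders;
"
```

The external table is only a mapping onto the live DynamoDB table; the `INSERT OVERWRITE DIRECTORY` statement is what actually reads the items and writes them out to S3.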