• Download the Spark binaries and set up a development environment that runs in
Spark's standalone local mode. This environment will be used throughout the rest
of the topic to run the example code.
• Explore Spark's programming model and API using Spark's interactive console.
• Write our first Spark program in Scala, Java, and Python.
• Set up a Spark cluster using Amazon's Elastic Compute Cloud (EC2) platform,
which can handle larger datasets and heavier computational workloads than
local mode.
Tip
Spark can also be run on Amazon's Elastic MapReduce service using custom
bootstrap action scripts, but this is beyond the scope of this topic. The following
article is a good reference guide:
http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923.
At the time of writing, the article covers running Spark version 1.1.0.
If you have previous experience in setting up Spark and are familiar with the basics of
writing a Spark program, feel free to skip this chapter.