to become free to “ramp up” again when the user types a new command. Note, however, that you can use a mix of scheduling modes in the same Mesos cluster (i.e., some of your Spark applications might have spark.mesos.coarse set to true and some might not).
Client and cluster mode
As of Spark 1.2, Spark on Mesos supports running applications only in the “client” deploy mode—that is, with the driver running on the machine that submitted the application. If you would like to run your driver in the Mesos cluster as well, frameworks like Aurora and Chronos allow you to submit arbitrary scripts to run on Mesos and monitor them. You can use one of these to launch the driver for your application.
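A client-mode submission against Mesos might look like the following sketch; the master hostname and application script are placeholders for your own cluster and code:

```shell
# Submit an application to Mesos in client mode (the only deploy mode
# supported on Mesos as of Spark 1.2). The driver runs on this machine.
# "masternode" and "my_script.py" are placeholder values.
spark-submit \
  --master mesos://masternode:5050 \
  my_script.py
```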
Configuring resource usage
You can control resource usage on Mesos through two parameters to spark-submit: --executor-memory, to set the memory for each executor, and --total-executor-cores, to set the maximum number of CPU cores for the application to claim (across all executors). By default, Spark will launch each executor with as many cores as possible, consolidating the application to the smallest number of executors that give it the desired number of cores. If you do not set --total-executor-cores, it will try to use all available cores in the cluster.
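For instance, the two flags might be combined as in the sketch below; the master URL, application path, and resource values are illustrative only:

```shell
# Cap the application at 8 cores total across all executors,
# with 2 GB of memory per executor. Spark will consolidate the
# 8 cores onto as few executors as the cluster's offers allow.
# "masternode" and "my_script.py" are placeholder values.
spark-submit \
  --master mesos://masternode:5050 \
  --executor-memory 2g \
  --total-executor-cores 8 \
  my_script.py
```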
Amazon EC2
Spark comes with a built-in script to launch clusters on Amazon EC2. This script launches a set of nodes and then installs the Standalone cluster manager on them, so once the cluster is up, you can use it according to the Standalone mode instructions in the previous section. In addition, the EC2 script sets up supporting services such as HDFS, Tachyon, and Ganglia to monitor your cluster.
The Spark EC2 script is called spark-ec2, and is located in the ec2 folder of your Spark installation. It requires Python 2.6 or higher. You can download Spark and run the EC2 script without compiling Spark beforehand.
The EC2 script can manage multiple named clusters, identifying them using EC2 security groups. For each cluster, the script will create a security group called clustername-master for the master node, and clustername-slaves for the workers.
Launching a cluster
To launch a cluster, you should first create an Amazon Web Services (AWS) account
and obtain an access key ID and secret access key. Then export these as environment
variables:
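The variable names below are the standard AWS credential environment variables; the values shown are placeholders for your own access key ID and secret access key:

```shell
# Export AWS credentials so the spark-ec2 script can call the EC2 API.
# Replace the placeholder values with your own keys from the AWS console.
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
```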