export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
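Before launching anything, it can help to confirm that both variables are really exported, since a child process such as spark-ec2 only sees exported variables. The following is a minimal sketch; check_aws_env is a hypothetical helper name, not part of spark-ec2:

```shell
#!/bin/sh
# Hypothetical helper (not part of spark-ec2): succeed only if both AWS
# credential variables are exported with a non-empty value.
check_aws_env() {
    missing=0
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
        # `env` lists only exported variables; the trailing "." in the
        # pattern requires at least one character after the "=".
        if ! env | grep -q "^${name}=."; then
            echo "missing or empty: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: launch only when both variables are present.
# check_aws_env && ./spark-ec2 -k mykeypair -i mykeypair.pem launch mycluster
```

The check never prints the secret values themselves, only the names of the variables that are missing.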
In addition, create an EC2 SSH key pair and download its private key file (usually
called keypair.pem) so that you can SSH into the machines.
Next, run the launch command of the spark-ec2 script, giving it your key pair name,
private key file, and a name for the cluster. By default, this will launch a cluster with
one master and one slave, using m1.xlarge EC2 instances:
cd /path/to/spark/ec2
./spark-ec2 -k mykeypair -i mykeypair.pem launch mycluster
You can also configure the instance types, number of slaves, EC2 region, and other
factors using options to spark-ec2. For example:
# Launch a cluster with 5 slaves of type m3.xlarge
./spark-ec2 -k mykeypair -i mykeypair.pem -s 5 -t m3.xlarge launch mycluster
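Because a launch starts billable EC2 instances, you may want to review the exact command before running it. The sketch below only prints the command. print_launch_cmd is a hypothetical helper, and it assumes the private key file is named after the key pair with a .pem suffix, as in the examples above:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the spark-ec2 launch command
# instead of running it, so the options can be reviewed first.
# Arguments: key pair name, cluster name, [slave count], [instance type].
print_launch_cmd() {
    keypair=$1
    cluster=$2
    slaves=${3:-1}            # default described in the text: one slave
    itype=${4:-m1.xlarge}     # default described in the text: m1.xlarge
    # Assumption: the private key file is "<key pair name>.pem".
    echo "./spark-ec2 -k $keypair -i $keypair.pem -s $slaves -t $itype launch $cluster"
}

# Prints the five-slave m3.xlarge launch command without executing it.
print_launch_cmd mykeypair mycluster 5 m3.xlarge
```

Once the printed command looks right, you can copy it into the shell from the spark/ec2 directory.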
For a full list of options, run spark-ec2 --help. Some of the most common ones are
listed in Table 7-3.
Table 7-3. Common options to spark-ec2
Option                Meaning
-k KEYPAIR            Name of key pair to use
-i IDENTITY_FILE      Private key file (ending in .pem)
-s NUM_SLAVES         Number of slave nodes
-t INSTANCE_TYPE      Amazon instance type to use
-r REGION             Amazon region to use (e.g., us-west-1)
-z ZONE               Availability zone (e.g., us-west-1b)
--spot-price=PRICE    Use spot instances at the given spot price (in US dollars)
Once you launch the script, it usually takes about five minutes to launch the
machines, log in to them, and set up Spark.
Logging in to a cluster
You can log in to a cluster by SSHing into its master node with the .pem file for your
key pair. For convenience, spark-ec2 provides a login command for this purpose:
./spark-ec2 -k mykeypair -i mykeypair.pem login mycluster
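If you prefer to SSH in by hand, the equivalent command might look like the sketch below. print_ssh_cmd is a hypothetical helper, master.example.com is a placeholder for the master's public hostname, and root as the login user is an assumption that depends on the machine image in use:

```shell
#!/bin/sh
# Hypothetical helper: print a plain ssh command for the master node.
# Arguments: private key file, master hostname, [login user].
print_ssh_cmd() {
    keyfile=$1
    master_host=$2
    user=${3:-root}   # assumption: root login; adjust for your image
    echo "ssh -i $keyfile $user@$master_host"
}

# Placeholder hostname; substitute the master's public DNS name.
print_ssh_cmd mykeypair.pem master.example.com
```

Note that ssh refuses a private key file that other users can read, so you may need to run chmod 600 mykeypair.pem first.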