Getting Up and Running with Spark - Machine Learning with Spark

We can test whether our cluster is correctly set up with Spark by changing into the Spark directory and running an example in local mode:

>cd spark

>MASTER=local[2] ./bin/run-example SparkPi

You should see output similar to what you would get by running the same command on your local computer:

...

14/01/30 20:20:21 INFO SparkContext: Job finished: reduce at SparkPi.scala:35, took 0.864044012 s

Pi is roughly 3.14032

...
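The check above can also be done mechanically. The following is a minimal sketch, not part of Spark itself: `pi_estimate` and `pi_looks_sane` are hypothetical helper names, and the tolerance of 0.1 is an arbitrary assumption. It only relies on the `Pi is roughly` line shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read the example's output on stdin and print
# the number that follows "Pi is roughly" (prints nothing if absent).
pi_estimate() {
    sed -n 's/^Pi is roughly \([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical check: succeed only if the value passed as $1 is within
# 0.1 of pi. The tolerance is an assumption chosen for illustration.
pi_looks_sane() {
    awk -v v="${1:-}" 'BEGIN {
        d = v - 3.14159
        if (d < 0) d = -d
        exit !(d < 0.1)
    }'
}

# Example use, run from the Spark directory:
#   est="$(MASTER="local[2]" ./bin/run-example SparkPi 2>&1 | pi_estimate)"
#   pi_looks_sane "$est" && echo "local mode works: $est"
```

Because the result is a random estimate, the exact digits will differ from run to run; the sketch therefore compares against a tolerance rather than a fixed value.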

Now that we have an actual cluster with multiple nodes, we can test Spark in cluster mode. We can run the same example on the cluster, using our one slave node, by passing in the master URL instead of the local version:

>MASTER=spark://ec2-54-227-127-14.compute-1.amazonaws.com:7077 ./bin/run-example SparkPi

Tip

Note that you will need to replace the preceding master domain name with the correct domain name for your specific cluster.
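To make the substitution less error-prone, the master URL can be built from a hostname. This is only a sketch: `spark_master_url` is a hypothetical helper, and the default port 7077 is taken from the command shown earlier; your cluster may use a different one.

```shell
#!/bin/sh
# Hypothetical helper: build a master URL of the form spark://HOST:PORT.
# PORT defaults to 7077, the port used in the command shown earlier.
spark_master_url() {
    host="${1:-}"
    port="${2:-7077}"
    if [ -z "$host" ]; then
        echo "usage: spark_master_url HOST [PORT]" >&2
        return 1
    fi
    printf 'spark://%s:%s\n' "$host" "$port"
}

# Example use, run from the Spark directory (replace the hostname
# with the master domain name of your own cluster):
#   MASTER="$(spark_master_url ec2-54-227-127-14.compute-1.amazonaws.com)" \
#       ./bin/run-example SparkPi
```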

Again, the output should be similar to running the example locally; however, the log messages will show that your driver program has connected to the Spark master:

...

14/01/30 20:26:17 INFO client.Client$ClientActor: Connecting to master spark://ec2-54-220-189-136.eu-west-1.compute.amazonaws.com:7077

14/01/30 20:26:17 INFO cluster.SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20140130202617-0001

14/01/30 20:26:17 INFO client.Client$ClientActor: Executor added: app-20140130202617-0001/0 on worker-20140130201049-ip-10-34-137-45.eu-west-1.compute.internal-57119 (ip-10-34-137-45.eu-west-1.compute.internal:57119) with 1
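One way to confirm the connection without reading the whole log is to search for the `Connected to Spark cluster with app ID` message. The sketch below assumes the message appears on a single line, worded as in the excerpt above; `connected_app_id` is a hypothetical helper name, and other Spark versions may word the message differently.

```shell
#!/bin/sh
# Hypothetical helper: read driver log output on stdin and print the
# application ID from the "Connected to Spark cluster" message.
# Prints nothing if the message is not found.
connected_app_id() {
    sed -n 's/.*Connected to Spark cluster with app ID \(app-[0-9][0-9-]*\).*/\1/p'
}

# Example use, run from the Spark directory (MASTER must already be
# set to the master URL of your own cluster):
#   ./bin/run-example SparkPi 2>&1 | connected_app_id
```

An empty result suggests that the driver never registered with the master, which is worth checking before looking at the rest of the output.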
