>tar xfvz spark-1.2.0-bin-hadoop2.4.tgz
>cd spark-1.2.0-bin-hadoop2.4
Spark places the user scripts for running Spark in the bin directory. You can test whether
everything is working correctly by running one of the example programs included with
Spark:
>./bin/run-example org.apache.spark.examples.SparkPi
This will run the example in Spark's local standalone mode. In this mode, all the Spark
processes are run within the same JVM, and Spark uses multiple threads for parallel
processing. By default, the preceding example uses a number of threads equal to the number
of cores available on your system. Once the program is finished running, you should see
something similar to the following lines near the end of the output:
14/11/27 20:58:47 INFO SparkContext: Job finished: reduce at SparkPi.scala:35, took 0.723269 s
Pi is roughly 3.1465
To configure the level of parallelism in local mode, you can pass in a master parameter
of the local[N] form, where N is the number of threads to use. For example, to
use only two threads, run the following command instead:
>MASTER=local[2] ./bin/run-example org.apache.spark.examples.SparkPi
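The same setting can also be made programmatically when you create your own
SparkContext rather than relying on the run-example script. The following is a minimal
sketch that mirrors the SparkPi computation with the master set to local[2]; the object
name LocalPi and the sample count are illustrative choices, not part of the Spark
distribution:

import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch: set the master to local[2] in code instead of via
// the MASTER environment variable. LocalPi and the sample count n
// are illustrative, not part of the Spark distribution.
object LocalPi {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("LocalPi").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val n = 100000
    // Estimate Pi by sampling random points in the unit square and
    // counting how many fall inside the unit circle.
    val count = sc.parallelize(1 to n).map { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    sc.stop()
  }
}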