With the correct value for the variable set, I start the Spark Master and History servers on the master node:
[root@hc2nn ~]# service spark-master restart
[root@hc2nn ~]# service spark-history-server restart
Finally, I start the Spark workers on all of the data nodes:
[root@hc2r1m1 ~]# service spark-worker restart
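Before moving on, it is worth confirming that the daemons actually came up. Assuming the init scripts support the usual status action (a reasonable but unverified assumption for this install), a quick check looks like this, with the worker check repeated on each data node:
[root@hc2nn ~]# service spark-master status
[root@hc2nn ~]# service spark-history-server status
[root@hc2r1m1 ~]# service spark-worker status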
That's it; I have just started a basic Spark cluster! I now have a choice of user interfaces with which to monitor the Spark
cluster. In the configuration file spark-env.sh, the following default variables define the master and worker user
interface ports:
export SPARK_MASTER_WEBUI_PORT=18080
export SPARK_WORKER_WEBUI_PORT=18081
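As a quick sanity check of these ports, I can ask each node whether its user interface is answering. The commands below are a sketch, assuming curl is installed on the cluster nodes; a healthy interface returns an HTTP 200 status line:
[root@hc2nn ~]# curl -sI http://hc2nn:18080 | head -1
[root@hc2r1m1 ~]# curl -sI http://hc2r1m1:18081 | head -1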
The Spark Master user interface can be found at hc2nn:18080; Figure 9-3 shows its appearance before any
applications are run. Notice the Spark Master URL at the top of the page; that's needed to run applications later.
Figure 9-3. Spark Master server's user interface
The interface also lists the Spark workers and the machines they are running on, along with the state, cores, and
memory available to each worker. (In Figure 9-3, you'll notice that I have also run a worker on the name node, just
to increase the processing capacity in this example.) The area at the bottom of the screen provides details on
running and completed applications; in Figure 9-3, none are running, so the area is blank.
Uses of Spark
In this section, I use examples to demonstrate the Spark shell, an interactive Spark scripting session, and a Spark
application (provided with the installation), showing how jobs can be submitted and how they appear in the
Spark user interface. The Spark shell can be used interactively to run ad hoc scripts against data in the Spark cluster.
Running a Spark application, as you will see, lets you run a job on the Spark cluster as an in-memory batch process.
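As a rough preview, the commands below sketch how each approach is launched from the Linux command line. They are illustrative only: the user account, the master URL (which assumes the default standalone port of 7077; use the URL shown at the top of the Master UI), and the path to the bundled examples JAR are all assumptions that vary between Spark releases and packaging.
[hadoop@hc2nn ~]$ spark-shell --master spark://hc2nn:7077
[hadoop@hc2nn ~]$ spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://hc2nn:7077 \
  /usr/lib/spark/lib/spark-examples.jar 10
The first command opens an interactive shell connected to the cluster; the second submits the SparkPi example application in batch mode, with the trailing 10 being the number of partitions SparkPi uses for its calculation.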
 