Figure 19-3. How Spark executors are started in YARN client mode
As each executor starts, it connects back to the SparkContext and registers itself. This gives the SparkContext information about the number of executors available for running tasks and their locations, which is used for making task placement decisions (described in "Task Scheduling").
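Once executors have registered, you can inspect them from the shell. A minimal sketch, assuming a running Scala spark-shell (sc.getExecutorMemoryStatus is a SparkContext method that returns one entry per registered executor's block manager, plus the driver):

```scala
// Each key is the "host:port" of a registered executor (the driver also
// appears); the value is (total memory available for caching, remaining free).
sc.getExecutorMemoryStatus.foreach { case (executor, (total, free)) =>
  println(s"$executor: total=$total bytes, free=$free bytes")
}
```

The number of entries (minus the driver) reflects how many executors have registered so far, which is useful for confirming that the cluster granted the resources you asked for.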
The number of executors that are launched is set in spark-shell, spark-submit, or pyspark (if not set, it defaults to two), along with the number of cores that each executor uses (the default is one) and the amount of memory (the default is 1,024 MB). Here's an example showing how to run spark-shell on YARN with four executors, each using one core and 2 GB of memory:
% spark-shell --master yarn-client \
    --num-executors 4 \