Flag: --executor-memory
Explanation: The amount of memory to use for executors, in bytes. Suffixes can be used to specify larger quantities such as "512m" (512 megabytes) or "15g" (15 gigabytes).

Flag: --driver-memory
Explanation: The amount of memory to use for the driver process, in bytes. Suffixes can be used to specify larger quantities such as "512m" (512 megabytes) or "15g" (15 gigabytes).
spark-submit also allows setting arbitrary SparkConf configuration options using either the --conf prop=value flag or providing a properties file through --properties-file that contains key/value pairs. Chapter 8 will discuss Spark's configuration system.
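As a sketch of these two mechanisms (spark.eventLog.enabled and spark.serializer are real Spark properties, but the file name my-config.conf, the values, and the script name are illustrative assumptions, not from the examples in this chapter):

```shell
# Illustrative only: passing a single configuration value on the command line.
$ ./bin/spark-submit \
  --master local \
  --conf spark.eventLog.enabled=false \
  my_script.py

# The same settings can live in a properties file of key/value pairs;
# my-config.conf is an assumed example file.
$ cat my-config.conf
spark.eventLog.enabled  false
spark.serializer        org.apache.spark.serializer.KryoSerializer
$ ./bin/spark-submit --properties-file my-config.conf my_script.py
```

Either form ends up in the application's SparkConf; --conf is convenient for one-off overrides, while a properties file suits settings shared across many submissions.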
Example 7-4 shows a few longer-form invocations of spark-submit using various options.
Example 7-4. Using spark-submit with various options

# Submitting a Java application to Standalone cluster mode
$ ./bin/spark-submit \
  --master spark://hostname:7077 \
  --deploy-mode cluster \
  --class com.databricks.examples.SparkExample \
  --name "Example Program" \
  --jars dep1.jar,dep2.jar,dep3.jar \
  --total-executor-cores 300 \
  --executor-memory 10g \
  myApp.jar "options" "to your application" "go here"
# Submitting a Python application in YARN client mode
$ export HADOOP_CONF_DIR=/opt/hadoop/conf
$ ./bin/spark-submit \
  --master yarn \
  --py-files somelib-1.2.egg,otherlib-4.4.zip,other-file.py \
  --deploy-mode client \
  --name "Example Program" \
  --queue exampleQueue \
  --num-executors 40 \
  --executor-memory 10g \
  my_script.py "options" "to your application" "go here"
Packaging Your Code and Dependencies
Throughout most of this book we've provided example programs that are self-contained and have no library dependencies outside of Spark. More often, user programs depend on third-party libraries. If your program imports any libraries that are not in the org.apache.spark package or part of the language library, you need to