your environment. One place where Spark provides more support is the Standalone
cluster manager, which supports a --supervise flag when submitting your driver
that lets Spark restart it. You will also need to pass --deploy-mode cluster to make
the driver run within the cluster and not on your local machine, as shown in
Example 10-45.
Example 10-45. Launching a driver in supervise mode
./bin/spark-submit --deploy-mode cluster --supervise --master spark://... App.jar
When using this option, you will also want the Spark Standalone master to be
fault-tolerant. You can configure this using ZooKeeper, as described in the Spark
documentation. With this setup, your application will have no single point of failure.
Finally, note that when the driver crashes, Spark's executors are restarted as well.
This may change in future Spark versions, but it is the expected behavior in 1.2 and
earlier, as the executors cannot continue processing data without a driver. Your
relaunched driver will start new executors to pick up where it left off.
Worker Fault Tolerance
For failure of a worker node, Spark Streaming uses the same techniques as Spark for
its fault tolerance. All the data received from external sources is replicated among the
Spark workers. All RDDs created through transformations of this replicated input
data are tolerant to failure of a worker node, as the RDD lineage allows the system to
recompute the lost data all the way from the surviving replica of the input data.
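This replication is visible in the API through the storage level used when creating a receiver-based stream. The following minimal sketch assumes an existing SparkContext `sc`; the host, port, and batch interval are illustrative:

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical one-second batches on an existing SparkContext `sc`.
val ssc = new StreamingContext(sc, Seconds(1))

// A "_2" storage level (the default for most receivers) keeps two replicas
// of each received block on different workers, so a surviving copy remains
// after a worker failure and lineage can recompute any derived RDDs from it.
val lines = ssc.socketTextStream("localhost", 7777,
  StorageLevel.MEMORY_AND_DISK_SER_2)
```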
Receiver Fault Tolerance
The fault tolerance of the workers running the receivers is another important
consideration. If such a failure occurs, Spark Streaming restarts the failed
receivers on other nodes in the cluster. However, whether any received data is lost
depends on the nature of the source (whether the source can resend data or not) and
the implementation of the receiver (whether it updates the source about received
data or not). For example, with Flume, one of the main differences between the two
receivers is their data loss guarantees. With the receiver-pull-from-sink model,
Spark removes the elements only once they have been replicated inside Spark. With
the push-to-receiver model, some data can be lost if the receiver fails before the
data is replicated. In general, for any receiver, you must also consider the
fault-tolerance properties of the upstream source (transactional or not) to ensure
zero data loss.
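The two Flume receivers described above can be sketched as follows; this assumes the spark-streaming-flume artifact is on the classpath and an existing StreamingContext `ssc`, and the hostnames and port are placeholders:

```scala
import org.apache.spark.streaming.flume.FlumeUtils

// Push model: Flume's Avro sink pushes events to this receiver. Events
// buffered on the receiver but not yet replicated inside Spark can be
// lost if the receiver fails.
val pushedEvents = FlumeUtils.createStream(ssc, "receiver-host", 7788)

// Pull model: the receiver polls a custom Spark sink running inside the
// Flume agent, which removes events only after Spark has replicated
// them, giving stronger data loss guarantees.
val pulledEvents = FlumeUtils.createPollingStream(ssc, "sink-host", 7788)
```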
In general, receivers provide the following guarantees:
• All data read from a reliable filesystem (e.g., with StreamingContext.hadoopFiles)
is reliable, because the underlying filesystem is replicated. Spark