Figure 10-9. Streaming UI tab in the Spark UI
The Streaming UI exposes statistics for our batch processing and our receivers. In our
example we have one network receiver, and we can see the message processing rates.
If we were falling behind, we could see how many records each receiver is able to
process. We can also see whether a receiver failed. The batch processing statistics show
us how long our batches take and also break out the delay in scheduling the job. If a
cluster experiences contention, then the scheduling delay may increase.
Performance Considerations
In addition to the performance considerations we have already discussed for Spark in
general, Spark Streaming applications have a few specialized tuning options.
Batch and Window Sizes
The most common question is what minimum batch size Spark Streaming can use. In
general, 500 milliseconds has proven to be a good minimum size for many
applications. The best approach is to start with a larger batch size (around 10 seconds) and
work your way down to a smaller batch size. If the processing times reported in the
Streaming UI remain consistent, then you can continue to decrease the batch size,
but if they are increasing you may have reached the limit for your application.
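
The tuning procedure above can be sketched in plain Python. This is only an illustration, not part of Spark: the measure callback is hypothetical and stands for rerunning the application at a given batch interval and collecting the per-batch processing times shown in the Streaming UI. Halving the interval at each step and treating a rise of more than 10% as "increasing" are assumptions made for the example.

```python
from typing import Callable, Sequence


def is_stable(processing_times: Sequence[float], tolerance: float = 0.10) -> bool:
    """Return True if later batches are not noticeably slower than earlier ones.

    Compares the mean of the second half of the samples with the mean of the
    first half; the 10% tolerance is an assumed threshold.
    """
    half = len(processing_times) // 2
    if half == 0:
        return True  # too few samples to detect a trend
    early = sum(processing_times[:half]) / half
    late = sum(processing_times[half:]) / (len(processing_times) - half)
    return late <= early * (1 + tolerance)


def find_batch_interval(
    measure: Callable[[int], Sequence[float]],  # hypothetical: interval in ms -> processing times
    start_ms: int = 10_000,  # start large, around 10 seconds
    floor_ms: int = 500,     # the minimum suggested in the text
) -> int:
    """Shrink the batch interval until processing times begin to climb."""
    interval = start_ms
    while interval > floor_ms:
        candidate = max(interval // 2, floor_ms)
        if not is_stable(measure(candidate)):
            break  # times are increasing: keep the last interval that held steady
        interval = candidate
    return interval
```

The returned value, in milliseconds, is the smallest interval tried at which processing times stayed consistent; it would then be used as the batch interval when the streaming context is created.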
 