Spark Streaming - Learning Spark

Database Reference

In-Depth Information

Stateful transformations require checkpointing to be enabled in your StreamingCon‐

text for fault tolerance. We will discuss checkpointing in more detail in “24/7 Opera‐

tion” on page 205 , but for now, you can enable it by passing a directory to

ssc.checkpoint() , as shown in Example 10-16 .

Example 10-16. Setting up checkpointing

ssc . checkpoint ( "hdfs://..." )

For local development, you can also use a local path (e.g., /tmp ) instead of HDFS.

Windowed transformations

Windowed operations compute results across a longer time period than the Stream‐

ingContext's batch interval, by combining results from multiple batches. In this sec‐

tion, we'll show how to use them to keep track of the most common response codes,

content sizes, and clients in a web server access log.

All windowed operations need two parameters, window duration and sliding dura‐

tion, both of which must be a multiple of the StreamingContext's batch interval. The

window duration controls how many previous batches of data are considered, namely

the last windowDuration / batchInterval . If we had a source DStream with a batch

interval of 10 seconds and wanted to create a sliding window of the last 30 seconds

(or last 3 batches) we would set the windowDuration to 30 seconds. The sliding dura‐

tion, which defaults to the batch interval, controls how frequently the new DStream

computes results. If we had the source DStream with a batch interval of 10 seconds

and wanted to compute our window only on every second batch, we would set our

sliding interval to 20 seconds. Figure 10-6 shows an example.

The simplest window operation we can do on a DStream is window() , which returns

a new DStream with the data for the requested window. In other words, each RDD in

the DStream resulting from window() will contain data from multiple batches, which

we can then process with count() , transform() , and so on. (See Examples 10-17 and

10-18 .)

Search WWH ::

Custom Search

Home