Storm High Availability and Failover - Real-time Analytics with Storm and Cassandra - page 111

The Storm isolation scheduler

The Storm isolation scheduler was released in Storm version 0.8.2. It is a handy feature that has been in active use on shared Storm clusters ever since its release. Let's understand how it works through an example: say we have a Storm cluster of four supervisor nodes with four slots each, so 16 slots in total. We want to deploy three Storm topologies on it, Topo1, Topo2, and Topo3, each with four workers allocated to it.

By default, Storm will most probably schedule these workers as follows:

         Supervisor 1   Supervisor 2   Supervisor 3   Supervisor 4
Topo1    Worker 1       Worker 2       Worker 3       Worker 4
Topo2    Worker 2       Worker 1       Worker 1       Worker 1
Topo3    Worker 3       Worker 3       Worker 2       Worker 2

Storm will respect load distribution and will spawn one worker of each topology on each

node.
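The slot arithmetic and the even spread described above can be sketched in a few lines of Python. This is only a toy model, not Storm's actual scheduler code: the function name `spread_evenly` and the round-robin strategy are assumptions made for illustration.

```python
from collections import defaultdict


def spread_evenly(supervisors, slots_per_supervisor, topologies):
    """Toy round-robin placement (hypothetical, NOT Storm's real scheduler).

    topologies maps a topology name to its requested worker count.
    Returns {supervisor: [topology name for each occupied slot]}.
    """
    free = {s: slots_per_supervisor for s in supervisors}
    placement = defaultdict(list)
    for topo, workers in topologies.items():
        for i in range(workers):
            # Start at supervisor i and walk forward until a free slot is found.
            for offset in range(len(supervisors)):
                sup = supervisors[(i + offset) % len(supervisors)]
                if free[sup] > 0:
                    free[sup] -= 1
                    placement[sup].append(topo)
                    break
            else:
                raise RuntimeError(f"no free slot left for {topo}")
    return dict(placement)


supervisors = ["Supervisor 1", "Supervisor 2", "Supervisor 3", "Supervisor 4"]
topologies = {"Topo1": 4, "Topo2": 4, "Topo3": 4}

total_slots = len(supervisors) * 4  # 4 nodes x 4 slots each
placement = spread_evenly(supervisors, 4, topologies)
used_slots = sum(len(slots) for slots in placement.values())

for sup in supervisors:
    print(sup, placement[sup])
```

In this model every supervisor ends up hosting one worker from each topology, which uses 12 of the 16 slots and leaves one slot free per node.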

Now let's tweak the scenario a bit and introduce a requirement: Topo1 is a very resource-intensive topology, and we want to dedicate supervisors entirely to it so that we save on network hops. This can be achieved with the isolation scheduler.

We will have to make the following entry in the storm.yaml file of each Storm node in

the cluster (Nimbus and supervisor):

isolation.scheduler.machines:
    "Topo1": 2

The cluster must be restarted for this setting to take effect. The setting dedicates two supervisor nodes to Topo1, and these nodes will no longer be shared with other topologies submitted to the cluster. This also provides a viable solution to the multitenancy problems encountered in production.
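To make the effect of the setting concrete, here is a minimal Python sketch of how the example cluster is partitioned once two machines are reserved for Topo1. It is a toy model under stated assumptions, not Storm's IsolationScheduler implementation: the function `isolate` is hypothetical, and the choice of which supervisors get reserved (the first ones in the list) is arbitrary.

```python
def isolate(supervisors, slots_per_supervisor, topologies, isolated_machines):
    """Toy model of machine isolation (hypothetical, NOT Storm's real code).

    isolated_machines mirrors the isolation.scheduler.machines entry: it maps
    a topology name to the number of supervisors reserved for that topology.
    Returns {supervisor: [topology name for each occupied slot]}.
    """
    pool = list(supervisors)
    placement = {s: [] for s in supervisors}

    # Step 1: reserve whole machines and spread the isolated topology over them.
    for topo, machines in isolated_machines.items():
        if machines > len(pool):
            raise RuntimeError(f"not enough supervisors to isolate {topo}")
        reserved, pool = pool[:machines], pool[machines:]
        for i in range(topologies[topo]):
            placement[reserved[i % machines]].append(topo)
        if any(len(placement[s]) > slots_per_supervisor for s in reserved):
            raise RuntimeError(f"{topo} does not fit on its reserved machines")

    # Step 2: all other topologies share whatever machines remain in the pool.
    for topo, workers in topologies.items():
        if topo in isolated_machines:
            continue
        for _ in range(workers):
            candidates = [s for s in pool
                          if len(placement[s]) < slots_per_supervisor]
            if not candidates:
                raise RuntimeError(f"no free slot left for {topo}")
            # Pick the least-loaded shared supervisor.
            target = min(candidates, key=lambda s: len(placement[s]))
            placement[target].append(topo)
    return placement


supervisors = ["Supervisor 1", "Supervisor 2", "Supervisor 3", "Supervisor 4"]
topologies = {"Topo1": 4, "Topo2": 4, "Topo3": 4}

placement = isolate(supervisors, 4, topologies, {"Topo1": 2})
for sup in supervisors:
    print(sup, placement[sup])
```

In this model Topo1's four workers land two per reserved node, while Topo2 and Topo3 together fill all eight slots on the two remaining supervisors.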
