Designing Real-Time Streaming Architectures - Real-Time Analytics

High Availability

The high-availability requirement for the entire system is probably the key

difference between the real-time streaming application and the more

common batch-processing or business-intelligence application. If batch-oriented

systems become unavailable for minutes or even hours, it is unlikely to

affect operations. (Often, users of the system do not even notice.) Real-time

systems, on the other hand, are sensitive to these sorts of outages and may

even be sensitive to scheduled maintenance windows.

This may not extend to all parts of the stack, such as the delivery

mechanism, but it is usually important for the collection, flow, and

processing systems. To ensure high availability, most of the systems in this

stack rely on two techniques: distribution and replication. Distribution means

the use of multiple physical servers to distribute the load to multiple end

points. If one of the machines in, say, the collection system, is lost, then the

others can pick up the slack until it can be restored or replaced.
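
The idea of the remaining machines picking up the slack can be illustrated with a minimal Python sketch. It assumes a simple round-robin dispatcher; the class CollectorPool, its methods, and the server names are hypothetical and do not come from any particular collection system.

```python
class CollectorPool:
    """Round-robin dispatcher that skips servers marked as down (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.down = set()
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        self.down.add(server)

    def mark_up(self, server):
        self.down.discard(server)

    def route(self, event):
        """Return the name of the server that should receive this event."""
        # Try each server at most once, starting from the round-robin cursor.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server not in self.down:
                return server
        raise RuntimeError("no collection servers available")


pool = CollectorPool(["collector-1", "collector-2", "collector-3"])
before = [pool.route(e) for e in range(3)]   # load spread over all three
pool.mark_down("collector-2")                # simulate losing one machine
after = [pool.route(e) for e in range(4)]    # the other two absorb its share
```

In this sketch the lost machine is simply skipped until mark_up is called for it, which corresponds to the server being restored or replaced.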

Of course, high availability does not refer to only the availability of the

service; it also refers to the availability of the data being processed. The

collection system generally does not maintain much local state, so it can be

replaced immediately with no attempt to recover. Failures in the data-flow

system, on the other hand, cause some subset of the data itself to become

unavailable. To overcome this problem, most systems employ some form of

replication.

The basic idea behind replication is that, rather than writing a piece of

data to a single machine, a system writes to several machines in the hope

that at least one of them survives. Relational databases, for example, often

implement replication that allows edits made to a master machine to be

replicated to a number of slave machines with various guarantees about how

many slaves, if any, must have the data before it is made available to clients

reading from the database. If the master machine becomes unavailable for

some reason, some databases can fail over to one of the slaves automatically,

allowing it to become the master. This failover is usually permanent:

the new master keeps the role because the previous

master will be missing interim edits when and if it is restored.
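
The write path and the permanent failover described above can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not how any particular database implements replication: the classes Node and ReplicatedStore, the min_acks parameter, and the rule of promoting the live slave with the longest log are all hypothetical.

```python
class Node:
    """One machine holding a copy of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.log = []  # edits applied on this node, in order

    def append(self, edit):
        if not self.alive:
            return False
        self.log.append(edit)
        return True


class ReplicatedStore:
    def __init__(self, master, slaves, min_acks=1):
        self.master = master
        self.slaves = list(slaves)
        self.min_acks = min_acks  # how many slaves must confirm each edit

    def write(self, edit):
        """Apply an edit on the master, copy it to the slaves, and report
        whether at least min_acks slaves confirmed it."""
        if not self.master.append(edit):
            raise RuntimeError("master unavailable; call failover() first")
        acks = sum(1 for slave in self.slaves if slave.append(edit))
        return acks >= self.min_acks

    def failover(self):
        """Permanently promote the live slave with the longest log."""
        live = [s for s in self.slaves if s.alive]
        if not live:
            raise RuntimeError("no live slave to promote")
        new_master = max(live, key=lambda s: len(s.log))
        self.slaves.remove(new_master)
        self.master = new_master  # the old master is not re-added
        return new_master


store = ReplicatedStore(Node("db-1"), [Node("db-2"), Node("db-3")], min_acks=1)
ok = store.write("edit-1") and store.write("edit-2")
store.master.alive = False        # simulate losing the master
promoted = store.failover()
surviving = list(store.master.log)  # edits still readable after failover
```

The old master is deliberately dropped rather than returned to the pool, mirroring the point that it would be missing any edits made while it was down.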

This same approach is also used in some of the software stacks presented

in this topic. The Kafka data motion system uses a master-slave style of

replication to ensure that data written to its queues remains available even
