Database Reference
In-Depth Information
High Availability
The high-availability requirement for the entire system is probably the key
difference between the real-time streaming application and the more
common batch-processing or business-intelligence application. If these
systems become unavailable for minutes or even hours, it is unlikely to
affect operations. (Often, users of the system do not even notice.) Real-time
systems, on the other hand, are sensitive to these sorts of outages and may
even be sensitive to scheduled maintenance windows.
This may not extend to all parts of the stack, such as the delivery
mechanism, but it is usually important for the collection, flow, and
processing systems. To ensure high availability, most of the systems in this
stack resort to two things: distribution and replication. Distribution means
the use of multiple physical servers to distribute the load to multiple end
points. If one of the machines in, say, the collection system, is lost, then the
others can pick up the slack until it can be restored or replaced.
Of course, high availability does not refer to only the availability of the
service; it also refers to the availability of the data being processed. The
collection system generally does not maintain much local state, so it can be
replaced immediately with no attempt to recover. Failures in the data-flow
system, on the other hand, cause some subset of the data itself to become
unavailable. To overcome this problem, most systems employ some form of
replication.
The basic idea behind replication is that, rather than writing a piece of
data to single-machine, a system writes to several machines in the hopes
that at least one of them survives. Relational databases, for example, often
implement replication that allows edits made to a master machine to be
replicatedtoanumberofslavemachineswithvariousguaranteesabouthow
many slaves, if any, must have the data before it is made available to clients
reading from the database. If the master machine becomes unavailable for
somereason,somedatabasescanfailovertooneoftheslavesautomatically,
allowing it to become the master. This failover is usually permanent,
promoting the new master on a permanent basis because the previous
master will be missing interim edits when and if it is restored.
This same approach is also used in some of the software stacks presented
in this topic. The Kafka data motion system uses a master-slave style of
replication to ensure that data written to its queues remains available even
Search WWH ::




Custom Search