The addition of replication to Kafka also introduced some changes to the
Kafka Producer API. In versions of Kafka prior to 0.8 there was no
acknowledgement in the Producer API. Applications wrote to the Kafka
socket and hoped for the best. In Kafka 0.8, there are now three different
levels of acknowledgements available: none, leader, and all.
The first option, none, behaves as in Kafka 0.7 and earlier: no response is
returned to the producer. This is the least durable option and allows data
to be lost, but it affords maximum performance, easily reaching tens of
thousands of messages per second.
The second option, leader, sends an acknowledgement after the leader has
received the message but before it has received acknowledgements from the
ISR. This reduces performance somewhat and can still lead to data loss if
the leader fails before the followers have copied the message, but this
option offers a reasonable level of durability for most applications.
The final option, all, sends the acknowledgement only after the leader has
committed the message, that is, after the replicas in the ISR have
acknowledged it. In this situation, the data is not lost so long as at
least one replica remains in the ISR. However, the performance reduction
relative to the none case is significant, though much of this can be recovered
with a large number of partitions and a highly concurrent Producer
implementation.
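As a rough illustration, the three levels can be written down as producer settings. The sketch below assumes the Kafka 0.8 producer properties `request.required.acks` (with values 0, 1, and -1 for none, leader, and all) and `metadata.broker.list`. The helper `producer_config`, the `ACK_LEVELS` table, and the broker addresses are hypothetical names chosen for this example. The code only builds a dictionary of settings and does not connect to a broker.

```python
# Acknowledgement levels named in the text, mapped to the values assumed
# for the Kafka 0.8 producer property request.required.acks.
ACK_LEVELS = {
    "none": "0",    # fire and forget, no response to the producer
    "leader": "1",  # response once the leader has received the message
    "all": "-1",    # response once the in-sync replicas have the message
}


def producer_config(broker_list, ack_level):
    """Build a dictionary of producer properties for one ack level.

    broker_list is a comma-separated "host:port" string (hypothetical hosts).
    """
    if ack_level not in ACK_LEVELS:
        raise ValueError("unknown acknowledgement level: %r" % ack_level)
    return {
        "metadata.broker.list": broker_list,
        "request.required.acks": ACK_LEVELS[ack_level],
    }


if __name__ == "__main__":
    for level in ("none", "leader", "all"):
        print(level, producer_config("broker1:9092,broker2:9092", level))
```

Keeping the level as a named setting rather than a bare number makes the durability trade-off visible wherever the producer is configured.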
Multiple Datacenter Deployments
Many web applications are latency sensitive, requiring them to be geo-
distributed around the globe. The connections between these far-flung
datacenters are, unsurprisingly, less reliable than connections within a
datacenter. Kafka helps to deal with potential (and depressingly common)
increased latency and complete connection loss between datacenters by
providing built-in mirroring tools. Using these tools, a Kafka cluster is
established in each datacenter with a retention time chosen to balance the
need to cover an extended outage against the available space in the remote
datacenter. If enough space is available, a longer retention time can be used
as a guard against disaster.
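The trade-off between retention time and disk space can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a Kafka API: the function name `max_retention_hours` and all of the figures are assumptions made for illustration, and it ignores index files, compression, and headroom. The result would typically inform a broker setting such as `log.retention.hours`.

```python
def max_retention_hours(disk_bytes, ingest_bytes_per_sec, replication_factor=1):
    """Estimate how many hours of data fit in the space set aside for Kafka.

    disk_bytes           -- total bytes available across the remote cluster
    ingest_bytes_per_sec -- average rate at which producers write data
    replication_factor   -- number of copies kept of every message
    """
    if ingest_bytes_per_sec <= 0 or replication_factor <= 0:
        raise ValueError("rate and replication factor must be positive")
    bytes_per_hour = ingest_bytes_per_sec * 3600 * replication_factor
    return disk_bytes / bytes_per_hour


if __name__ == "__main__":
    # Hypothetical cluster: 3.6 TB of disk, 5 MB/s of writes, two replicas.
    hours = max_retention_hours(3_600_000_000_000, 5_000_000, 2)
    print("Outage that can be covered: about %.0f hours" % hours)
```

If the estimate is shorter than the longest outage the deployment should survive, either more disk or a lower replication factor is needed in the remote datacenter.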
These remote clusters are then copied into the main processing cluster using
a tool called MirrorMaker. This tool, which is shipped with Kafka, can read
from multiple remote clusters and write their messages into a single
output cluster. Writes to the same topic in each cluster are merged into a
single topic on the output cluster.
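The merging behavior can be pictured with a small in-memory model. This is only a simulation of the semantics described above, not MirrorMaker itself: the function `mirror`, the cluster names, and the messages are hypothetical, and plain dictionaries and lists stand in for clusters and topics. The real tool makes no promise about how messages from different source clusters interleave.

```python
from collections import defaultdict


def mirror(source_clusters):
    """Merge same-named topics from several source clusters.

    source_clusters maps a cluster name to a dict of topic -> list of messages.
    Returns a dict of topic -> combined list for the single output cluster.
    """
    output = defaultdict(list)
    for topics in source_clusters.values():
        for topic, messages in topics.items():
            # Messages for a topic land in the same output topic,
            # regardless of which remote cluster they came from.
            output[topic].extend(messages)
    return dict(output)


if __name__ == "__main__":
    remote = {
        "dc-east": {"clicks": ["e1", "e2"], "logins": ["e3"]},
        "dc-west": {"clicks": ["w1"]},
    }
    merged = mirror(remote)
    for topic in sorted(merged):
        print(topic, sorted(merged[topic]))
```

Because topics are merged by name, consumers on the main processing cluster see one `clicks` topic and do not need to know how many remote datacenters feed it.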