batches, but the batches are so small that the time difference between the
first and last element of the mini-batch is small, usually on the order of
milliseconds, relative to the time scale of the phenomena being monitored
or processed.
The collection components of the system are usually most concerned with
the first definition of low latency. The more connections that can be handled
by a single server, the fewer servers are required to handle the load. More
than one is usually required to meet the high-availability requirements of
the previous section, but being able to reduce the number of edge servers is
usually a good source of cost savings when it comes to real-time systems.
The data flow and processing components are more concerned with the
second definition of latency. Here the problem is minimizing the amount of
time between an event arriving at the processor and its being made available
to a consumer. The key is to avoid as much synchronization as possible,
though some synchronization is sometimes unavoidable if high availability
is to be maintained. For
example, Kafka added replication between version 0.7, the first publicly
available version, and version 0.8, which is the version discussed in this
book. It also added a response to the Producer API that informs a client
that a message has indeed reached the host. Prior to this, it was simply
assumed. Unsurprisingly, Kafka's latency increased somewhat between the
two releases as data must now be considered part of an “in sync replica”
before it can be made available to consumers of the data. In most practical
situations, the addition of a second or two of latency is not critical, but the
tradeoff between speed and safety is one often made in real-time streaming
applications. If data can safely be lost, latency can be made very small. If not,
there is an unavoidable lower limit on latency, barring advances in physics.
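The replication tradeoff described above is exposed directly in the producer's configuration. In the Kafka 0.8 producer, the request.required.acks setting controls how many acknowledgments must arrive before a send is considered complete; the fragment below is an illustrative sketch of that one setting, not a complete producer configuration.

```
# Illustrative Kafka 0.8 producer setting showing the speed/safety tradeoff.
# request.required.acks=0   fire-and-forget: lowest latency, messages may be lost
# request.required.acks=1   wait for the leader only: moderate latency and safety
# request.required.acks=-1  wait for all in-sync replicas: highest latency, safest
request.required.acks=-1
```

Choosing -1 accepts the added latency of replication in exchange for the guarantee that an acknowledged message has reached every in-sync replica.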
Horizontal Scalability
Horizontal scaling refers to the practice of adding more physically separate
servers to a cluster of servers handling a specific problem to improve
performance. The opposite is vertical scaling, which is adding more
resources to a single server to improve performance. For many years,
vertical scaling was the only real option without requiring the developer
to resort to exotic hardware solutions, but the arrival of low-cost
high-performance networks as well as software advances have allowed
processes to scale almost linearly. Performance can then be doubled by
essentially doubling the amount of hardware.
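Horizontal scaling is usually achieved by partitioning the incoming workload across the servers in the cluster, for example by hashing each event's key. The sketch below (the server counts and event stream are hypothetical) shows how doubling the number of servers roughly halves the load each one must handle, which is the near-linear scaling described above.

```python
import hashlib

def partition(key: str, num_servers: int) -> int:
    """Deterministically route an event key to one of num_servers."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_servers

def load_per_server(keys, num_servers):
    """Count how many events each server would receive."""
    counts = [0] * num_servers
    for key in keys:
        counts[partition(key, num_servers)] += 1
    return counts

# Hypothetical stream of 10,000 event keys.
events = [f"user-{i}" for i in range(10_000)]

# Doubling the cluster roughly halves the per-server load:
# each of 2 servers sees about 5,000 events; each of 4 about 2,500.
for n in (2, 4):
    print(n, load_per_server(events, n))
```

Because the hash is deterministic, any server in the cluster can compute the same routing decision without coordination, which is exactly the kind of synchronization-free design the latency discussion above calls for.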