Utilization Limit: Each server estimates how many more QPS it can handle and
communicates this to the load balancer. The estimates may be based on current
throughput or data gathered from synthetic load tests.
Latency: The load balancer stops forwarding requests to a backend based on the
latency of recent requests. For example, when requests are taking more than 100
ms, the load balancer assumes this backend is overloaded. This technique helps
the system cope with bursts of slow requests and pathologically overloaded
situations.
Cascade: The first replica receives all requests until it is at capacity. Any overflow
is directed to the next replica, and so on. In this case the load balancer must know
precisely how much traffic each replica can handle, usually from static configuration
based on synthetic load tests. A sketch of this policy appears after this list.
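The cascade policy is simple enough to express in a few lines. The following is
a minimal Python sketch, not taken from the original text: the Replica class and
pick_backend function are illustrative names, and the capacity numbers stand in
for values that would come from synthetic load tests.

    # Sketch of the cascade policy (illustrative). Capacities would come
    # from static configuration derived from synthetic load tests.
    from dataclasses import dataclass

    @dataclass
    class Replica:
        name: str
        capacity: int       # maximum concurrent requests, from load tests
        in_flight: int = 0  # requests currently being served

    def pick_backend(replicas):
        # Return the first replica with spare capacity; overflow spills
        # to the next replica in the ordered list.
        for r in replicas:
            if r.in_flight < r.capacity:
                return r
        raise RuntimeError("all replicas are at capacity")

    # Example: the first replica absorbs traffic until it is full.
    replicas = [Replica("web1", capacity=2), Replica("web2", capacity=2)]
    for _ in range(3):
        r = pick_backend(replicas)
        r.in_flight += 1
        print(r.name)  # prints web1, web1, web2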
4.2.3 Load Balancing with Shared State
Another issue with load balancing among many replicas is shared state. Suppose one HTTP
request generates some information that is needed by the next HTTP request. A single web
server can store that information locally so that it is available when the second HTTP re-
quest arrives. But what if the load balancer sends the next HTTP request to a different
backend? It doesn't have that information (state). This can cause confusion.
Consider the commonly encountered case in which one HTTP request takes a user's
name and password and validates them, letting the user log in. The server stores the fact that
the user is logged in and reads his or her profile from the database. This is stored locally
for fast access. Future HTTP requests to the same machine know that the user is logged in
and have the user profile on hand, so there's no need to access the database.
What if the load balancer sends the next HTTP request to a different backend? This
backend will not know that the user is logged in and will ask the user to log in again. This
is annoying to the user and creates extra work for the database.
There are a few strategies for dealing with this situation:
Sticky Connections: Load balancers have a feature called stickiness, which
means that if a user's previous HTTP request went to a particular backend, the next
one should go there as well. That solves the problem discussed earlier, at least
initially. However, if that backend dies, the load balancer has no choice but to
send requests from that user to another backend. The new backend will not know
the user is logged in, and the user will be asked to log in again. Thus, stickiness
is only a partial solution.
Shared State: In this case, the fact that the user has logged in and the user's profile
information are stored somewhere that all backends can access. For each HTTP
request, the backend consults this shared store rather than its own memory, so
any backend can serve the user. A sketch of this approach follows.
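To make the shared-state idea concrete, here is a minimal sketch, not taken from
the original text. A plain dictionary stands in for the shared store so the
example runs standalone; in production the store would be an external service,
such as memcached or Redis, reachable by every backend. SESSION_STORE,
handle_login, handle_request, and check_credentials are all hypothetical names.

    # Sketch of shared session state (illustrative). A dict stands in for
    # an external shared store so the example is self-contained.
    SESSION_STORE = {}  # in production: memcached, Redis, or a database

    def check_credentials(username, password):
        # Stub for illustration; a real system would check a database.
        return password == "secret"

    def handle_login(session_id, username, password):
        if not check_credentials(username, password):
            return "401 Unauthorized"
        # Any backend can now see that this user is logged in.
        SESSION_STORE[session_id] = {"user": username, "logged_in": True}
        return "200 OK"

    def handle_request(session_id):
        # Runs on any backend: consult the shared store, not local memory.
        session = SESSION_STORE.get(session_id)
        if session is None or not session["logged_in"]:
            return "302 Redirect to login page"
        return f"200 OK, hello {session['user']}"

    session = "abc123"
    print(handle_request(session))                    # not logged in yet
    print(handle_login(session, "alice", "secret"))   # 200 OK
    print(handle_request(session))                    # 200 OK, hello alice

Because the session record lives outside any single backend, the load balancer
is free to send each request wherever capacity allows.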