Scaling MySQL - High Performance MySQL


Most systems provide slightly less than linear scalability at small scaling factors, and the deviation from linearity becomes more obvious at higher scaling factors. In fact, most systems eventually reach a point of maximum throughput, beyond which additional investment provides a negative return: add more workload and you'll actually reduce the system's throughput! [3]

How is this possible? Many models of scalability have been created over the years, with varying degrees of success and realism. The scalability model that we refer to here is based on some of the underlying mechanisms that influence systems as they scale. It is Dr. Neil J. Gunther's Universal Scalability Law (USL). Dr. Gunther has written about it at length in his books, including Guerrilla Capacity Planning (Springer). We will not go deeply into the mathematics here, but if you are interested, his book and the training courses offered by his company, Performance Dynamics, might be good resources for you. [4]

The short introduction to the USL is that the deviation from linear scalability can be modeled by two factors: a portion of the work cannot be done in parallel, and a portion of the work requires crosstalk. Modeling the first factor results in the well-known Amdahl's Law, which causes throughput to level off. When part of the task can't be parallelized, no matter how much you divide and conquer, the task takes at least as long as the serial portion.
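
As a rough illustration of the leveling-off behavior, Amdahl's Law is commonly written as C(N) = N / (1 + sigma * (N - 1)), where N is the number of workers and sigma is the fraction of the work that must run serially. The function name, the symbol names, and the 5% serial fraction below are assumptions chosen for the example; they do not come from the text above.

```python
def amdahl_capacity(n, sigma):
    """Relative capacity (speedup over one worker) under Amdahl's Law.

    n     -- number of workers (servers, CPUs, threads)
    sigma -- assumed fraction of the work that is serial, 0 <= sigma <= 1
    """
    return n / (1 + sigma * (n - 1))


# Assumed example value: 5% of the work cannot be parallelized.
SIGMA = 0.05

for workers in (1, 2, 8, 32, 128, 1024):
    # Capacity keeps rising but never exceeds the ceiling 1 / SIGMA.
    print(workers, round(amdahl_capacity(workers, SIGMA), 2))
```

However many workers are added, the result stays below 1 / sigma (20 for the assumed value), which is the ceiling set by the serial portion.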

Adding the second factor (inter-node or inter-process communication) to Amdahl's Law results in the USL. The cost of this communication depends on the number of communication channels, which grows quadratically with respect to the number of workers in the system. Thus, the cost eventually grows faster than the benefit, and that's what is responsible for retrograde scalability. Figure 11-4 illustrates the three concepts we've talked about so far: linear scaling, Amdahl scaling, and USL scaling. Most real systems look like the USL curve.
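
The three curves can be sketched numerically. The USL is usually stated as C(N) = N / (1 + sigma * (N - 1) + kappa * N * (N - 1)), where sigma is the serial fraction and kappa is the crosstalk cost. The function names and both coefficient values below are assumptions picked only to make the shapes visible; they are not measurements of any real system.

```python
import math


def linear_capacity(n):
    """Ideal linear scaling: n workers give n times the capacity."""
    return float(n)


def usl_capacity(n, sigma, kappa):
    """Relative capacity under the USL.

    sigma -- assumed serial fraction (the Amdahl term)
    kappa -- assumed crosstalk cost; it multiplies n * (n - 1), which is
             proportional to the number of pairwise channels, n * (n - 1) / 2
    """
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak(sigma, kappa):
    """Worker count at which the USL curve reaches its maximum."""
    return math.sqrt((1 - sigma) / kappa)


# Assumed example coefficients.
SIGMA, KAPPA = 0.05, 0.001

for n in (1, 8, 16, 31, 64, 128):
    amdahl = usl_capacity(n, SIGMA, 0.0)  # kappa = 0 reduces to Amdahl's Law
    usl = usl_capacity(n, SIGMA, KAPPA)
    print(n, linear_capacity(n), round(amdahl, 2), round(usl, 2))

print("peak near N =", round(usl_peak(SIGMA, KAPPA), 1))
```

With these assumed coefficients the USL curve peaks near 31 workers and then declines, which is the retrograde region; setting kappa to zero recovers the Amdahl curve, which only flattens.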

The USL can be applied both to hardware and to software. In the hardware case, the x-axis represents units of hardware, such as servers or CPUs; the workload, data size, and query complexity per unit of hardware must be held constant. [5] In the software case, the x-axis on the plot represents units of concurrency, such as users or threads; the workload per unit of concurrency must be held constant.
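
For the software case, one possible way to estimate the two USL coefficients from throughput measured at several concurrency levels is a least-squares fit on the rearranged model N/C - 1 = (sigma + kappa) * x + kappa * x^2, with x = N - 1. The sketch below is an assumption-laden illustration: `fit_usl` is a hypothetical helper, and the "measurements" are synthetic values generated from the model itself (sigma = 0.05, kappa = 0.001, 1000 queries per second at one thread), not benchmark results.

```python
def fit_usl(concurrency, throughput):
    """Estimate (sigma, kappa) by least squares through the origin.

    Assumes the first measurement was taken at concurrency 1, so that
    relative capacity is C = throughput / throughput[0].
    """
    base = throughput[0]
    xs = [n - 1 for n in concurrency]
    ys = [n / (t / base) - 1 for n, t in zip(concurrency, throughput)]

    # Sums for the normal equations of y = a*x + b*x**2.
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))

    det = sx2 * sx4 - sx3 ** 2
    a = (sxy * sx4 - sx3 * sx2y) / det  # a = sigma + kappa
    b = (sx2 * sx2y - sx3 * sxy) / det  # b = kappa
    return a - b, b


# Synthetic data only: generated from the USL with assumed coefficients.
threads = [1, 2, 4, 8, 16, 32]
qps = [1000 * n / (1 + 0.05 * (n - 1) + 0.001 * n * (n - 1)) for n in threads]

sigma, kappa = fit_usl(threads, qps)
print(round(sigma, 4), round(kappa, 4))
```

Because the input is noise-free, the fit simply recovers the assumed coefficients. Real measurements are noisy, and the fit is only meaningful when the workload per thread was actually held constant across the runs.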

3. In fact, the term “return on investment” can also be considered in light of your financial investment. Upgrading a component to double its capacity often costs more than twice as much as the initial investment. Although we often consider this in the real world, we'll omit it from our discussion here to avoid complicating an already confusing topic.

4. You can also read our white paper, Forecasting MySQL Scalability with the Universal Scalability Law, which gives a condensed summary of the mathematics and principles at work in the USL. It is available at http://www.percona.com.

5. In the real world, it is very difficult to define hardware scalability precisely, because it's hard to actually hold all those variables constant as you vary the number of servers in the system.
