How to Measure Scaling
When the application is configured to run in a clustered configuration, the throughput, or global throughput, of an
n-node clustered configuration can be measured using
$$T(n) = \sum_{i=1}^{n} t(i),$$
where i = 1, …, n and t(i) is the throughput measured on node i of the clustered configuration.
Using the preceding formula, as we increase the number of nodes in the cluster, the value of n changes and so will
the value of T . This will help in defining a throughput curve for the application configured to run on an n -node cluster.
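As a minimal sketch of the summation above, the following Python computes the global throughput and the resulting throughput curve as nodes are added one at a time. The function names and the sample per-node figures (in transactions per second) are hypothetical and chosen only for illustration.

```python
def global_throughput(per_node):
    """Return T(n), the sum of t(i) over all n nodes."""
    return sum(per_node)


def throughput_curve(per_node):
    """Return [T(1), T(2), ..., T(n)] as nodes are added in the given order."""
    curve = []
    total = 0
    for t in per_node:
        total += t
        curve.append(total)
    return curve


# Hypothetical per-node throughput measurements (transactions per second).
measurements = [1200, 1150, 1100, 1180]

print(global_throughput(measurements))   # T(4)
print(throughput_curve(measurements))    # T(1) through T(4)
```

Plotting the values returned by `throughput_curve` against the node count gives the throughput curve for the application.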
Although it computes the overall throughput of the application on an n-node cluster, the formula does not
consider less tangible factors such as the performance of the servers, resource availability, network bandwidth, and so forth.
Other factors that could hinder, improve, or contribute to the performance of the system must be considered. Ideally,
a cluster should have all nodes with identical configuration for easy manageability and administration. However, if
this is not the case, factors such as power of CPU, memory, and so forth should also be included in the computation.
Adding these factors to the preceding formula would result in the following:
$$T(n) = n \times t \times S(n),$$
where T(n) is the global throughput of the application running on n nodes, measured per unit of time; t, as
we indicated previously, is the throughput for one node in the cluster; n is the number of nodes participating in the
clustered configuration; and S(n) is a coefficient that scales the combined per-node throughput to the overall cluster throughput.
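The relationship T(n) = n × t × S(n) can be used in either direction: to estimate global throughput from an assumed coefficient, or to back out the coefficient from a measured total. The sketch below illustrates both. The function names and the numbers are assumptions made for this example, not values from any real benchmark.

```python
def estimated_throughput(n, t, s):
    """Estimate T(n) = n * t * S(n) for n nodes, per-node throughput t,
    and scaling coefficient s."""
    return n * t * s


def scaling_coefficient(measured_total, n, t):
    """Derive S(n) = T(n) / (n * t) from a measured global throughput."""
    if n <= 0 or t <= 0:
        raise ValueError("n and t must be positive")
    return measured_total / (n * t)


# Hypothetical figures: one node sustains 1000 tps, and a 4-node
# cluster is measured at 3400 tps in total.
s4 = scaling_coefficient(measured_total=3400, n=4, t=1000)
print(s4)                                  # coefficient for n = 4
print(estimated_throughput(4, 1000, s4))   # reproduces the measured total
```

A coefficient of 1 would correspond to perfectly linear scaling; values below 1 indicate that some capacity is lost to overhead or to weaker nodes.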
After considering the power and individual server details, factors outside the servers such as the network delays,
network transfer delays, I/O latency of the storage array, and so forth should also be added to the formula. Although
the previous measurements included factors that provide additional resources, this step would show any negative
impact or overhead in the overall performance of the cluster.
Factors such as type of clustered hardware, topology, type of applications running on the clustered configuration,
and so forth affect the scalability of the cluster and should also be considered as part of the equation. For example,
massively parallel processing (MPP) architecture works well for a data warehouse implementation; however, for an OLTP
implementation, a clustered SMP architecture would be better suited. With these factors added, the new formula would be
$$T(n) = n \times t \times S(c, n, a, k),$$
where t is the throughput for one node, c is the type of clustered hardware, n is the number of nodes, a is the type of application running on the clustered
configuration, and k is the topology of the cluster.
Because all of these additional factors cannot be easily measured, only the previous formulas are used in the analysis.
The best-case scenario of scalability of an application would be when the application scales up with a constant
factor of one, providing a consistent linear growth. However, this kind of scalability is not realistic.
Rather, it is typical for applications to show sublinear growth as nodes are added. The growth continues
until a certain limit has been reached, after which adding more nodes to the clustered configuration does not
show any further advantage. This is demonstrated by the graph in Figure 2-1, which indicates that there is a linear
scale-up with the addition of new nodes at first; however, after a certain percentage of scalability has been reached, a point of
diminishing returns on investment is reached, and the scalability reduces with the addition of more nodes.
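One way to locate the point at which extra nodes stop paying off is to look at the marginal gain in global throughput contributed by each added node. The following is a minimal sketch under that assumption; the helper names, the sample curve, and the `min_gain` threshold are all hypothetical and would need to be replaced with measured values and a threshold that reflects the actual cost of a node.

```python
def marginal_gains(curve):
    """Given [T(1), ..., T(n)], return the throughput added by each node."""
    if not curve:
        return []
    return [curve[0]] + [b - a for a, b in zip(curve, curve[1:])]


def last_worthwhile_node_count(curve, min_gain):
    """Return the largest n such that every node up to and including
    node n added at least min_gain throughput."""
    n = 0
    for gain in marginal_gains(curve):
        if gain < min_gain:
            break
        n += 1
    return n


# Hypothetical measured curve: gains shrink and eventually turn negative.
curve = [1000, 1900, 2700, 3300, 3600, 3650, 3600]

print(marginal_gains(curve))
print(last_worthwhile_node_count(curve, min_gain=250))
```

With the sample data, the per-node gain falls steadily, and the sixth node adds less than the chosen threshold, so the sketch reports five nodes as the last worthwhile cluster size.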
Capacity planning for an enterprise system is an iterative process. Every time there is a change in usage
pattern, capacity planning has to be revisited in some form.
Estimating Size of Database Objects
Resource and performance capacity of the servers is one side of the puzzle. Equally important is estimating the size of the
database for storage and data growth. This means the database, the database objects, and the underlying
storage subsystem also have to be sized for both current and future needs.
 