it matches the deployment granularity. You can't buy half a machine, so capacity planning
doesn't need to be super precise.
To identify such a relationship between a core driver and resource consumption, you
first need to understand which core drivers influence which resources and how strongly.
The way to do so is to correlate the resource usage metrics with the core driver metrics.
Correlation
Correlation measures how closely data sources resemble each other. Visually, you might
see on a monitoring graph that an increase in CPU usage on a server matches up with a
corresponding increase in network traffic to the same server, which also matches up with a
spike in QPS. From these observations you might conclude that these three measurements
are related, although you cannot necessarily say that the changes in one caused the changes
in another.
Regression analysis mathematically calculates how well time-series data sources match
up. Regression analysis of your metrics can indicate how strongly changes in a core driver
affect the usage of a primary resource. It can also indicate how strongly two core drivers
are related.
To perform a regression analysis on time-series data, you first need to define a time
interval, such as 1 day or 4 weeks. The number of data samples in that time period is n. If
your core driver metric is x and your primary resource metric is y, you first calculate the
sum of the last n values for x, x², y, y², and x times y, giving Σx, Σx², Σy, Σy², and Σxy. Then
calculate SS_xy, SS_xx, SS_yy, and R as follows:
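As a rough sketch, assuming the standard sum-of-squares definitions
SS_xy = Σxy - (Σx)(Σy)/n, SS_xx = Σx² - (Σx)²/n, SS_yy = Σy² - (Σy)²/n, and
R = SS_xy / √(SS_xx · SS_yy), the calculation might look like this in Python. The
function name regression_r and the sample QPS and CPU values are hypothetical,
chosen only for illustration.

```python
import math


def regression_r(xs, ys):
    """Return (ss_xy, ss_xx, ss_yy, r) for paired samples xs and ys.

    xs is the core driver metric and ys the primary resource metric,
    both sampled over the same time interval (n samples each).
    Assumes the standard sum-of-squares definitions.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired samples")

    # The five running sums described in the text.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Sums of squares derived from those running sums.
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    ss_yy = sum_y2 - sum_y ** 2 / n

    if ss_xx == 0 or ss_yy == 0:
        # A metric that never changes has no defined correlation.
        raise ValueError("a constant series has no defined correlation")

    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    return ss_xy, ss_xx, ss_yy, r


# Hypothetical samples: QPS as the core driver, CPU percent as the resource.
qps = [100, 200, 300, 400, 500]
cpu = [12, 22, 32, 42, 52]
ss_xy, ss_xx, ss_yy, r = regression_r(qps, cpu)
print(f"SS_xy={ss_xy}, SS_xx={ss_xx}, SS_yy={ss_yy}, R={r}")
```

A value of R close to 1 or -1 suggests a strong linear relationship between the
two metrics, while a value near 0 suggests little linear relationship. As with the
visual comparison, a strong R does not by itself show that one metric causes the
other.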