TABLE 6.3
Examples of Parameters and Statistics
Measure              Parameter   Statistic
Mean                 µ           X̄
Standard deviation   σ           s
Proportion           π           p
Correlation          ρ           r
interpreting data, displaying data, and making decisions based on data. The term
“statistic” refers to a numerical quantity calculated from a sample of size n. Such
statistics are used for parameter estimation.
In analyzing outputs, it also is essential to distinguish between statistics and
parameters. Whereas statistics are computed from data samples of limited size (n),
a parameter is a numerical quantity that measures some aspect of the data
population. A population consists of an entire set of objects, observations, or scores
that have something in common. The distribution of a population can be described by
several parameters, such as the mean and the standard deviation. Estimates of these
parameters computed from a sample are called statistics. A sample is, therefore, a
subset of a population. Because it usually is impractical to test every member of a
population (e.g., 100% execution of all feasible verification test scenarios),
sampling from the population is typically the best approach available. For example,
the mean time between failures (MTBF) in 10 months of run time is a “statistic,”
whereas the MTBF over the entire software life cycle is a parameter. Population
parameters rarely are known and usually are estimated by statistics computed from
samples. Certain statistical requirements must, however, be met to estimate the
population parameters using computed statistics. Table 6.3 shows examples of
selected parameters and statistics.
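The distinction between a parameter and a statistic can be illustrated with a minimal Python sketch. The data values, the function names, and the choice of the first four values as the sample are assumptions made only for illustration; they do not come from the text. The population functions divide by N, whereas the sample standard deviation divides by n - 1.

```python
import statistics


def population_parameters(population):
    """Parameters: computed from every member of the population."""
    return {
        "mu": statistics.mean(population),
        "sigma": statistics.pstdev(population),  # divides by N
    }


def sample_statistics(sample):
    """Statistics: computed from a sample of size n and used as estimates."""
    return {
        "n": len(sample),
        "x_bar": statistics.mean(sample),
        "s": statistics.stdev(sample),  # divides by n - 1
    }


# Hypothetical population of eight times between failures (in hours).
population = [2, 4, 4, 4, 5, 5, 7, 9]

# A sample is a subset of the population. Taking the first four values
# keeps the example deterministic; in practice the sample would be random.
sample = population[:4]

params = population_parameters(population)
stats = sample_statistics(sample)

print("parameters:", params)
print("statistics:", stats)
```

In this hypothetical data set, the sample mean and standard deviation differ from the population values, which is why the statistics are treated as estimates of the parameters rather than as the parameters themselves.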
6.3.1 Descriptive Statistics
One important use of statistics is to summarize a collection of data in a clear and
understandable way. Data can be summarized numerically and graphically. In the
numerical approach, a set of descriptive statistics is computed using a set of
formulas. These statistics convey information about the data's central tendency
(mean, median, and mode) and dispersion (range, interquartile range, variance, and
standard deviation). Using the descriptive statistics, the central tendency and
dispersion of the data are represented graphically (for example, with dot plots,
histograms, probability density functions, stem-and-leaf plots, and box plots).
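As a rough sketch of the numerical approach, the central tendency and dispersion measures listed above can be computed with Python's standard `statistics` module. The function name `describe` and the data values are hypothetical, and the quartiles use the module's default ("exclusive") method, which is only one of several conventions for computing an interquartile range.

```python
import statistics


def describe(data):
    """Return central tendency and dispersion measures for a sample."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _q2, q3 = statistics.quantiles(data, n=4)
    return {
        # Central tendency
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        # Dispersion
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "stdev": statistics.stdev(data),        # sample standard deviation
    }


# Hypothetical readings, used only to exercise the function.
readings = [2, 4, 4, 4, 5, 5, 7, 9]

summary = describe(readings)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The sample versions of the variance and standard deviation are used here because the data are treated as a sample rather than as the whole population.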
For example, a sample of an operating system's CPU usage (in %) over a period of
time is given in Table 6.4. The changing usage reflects the variability of this
variable, which typically is caused by elements of randomness in the currently
running processes, services, and background code of the operating system.
Graphical representations of usage as an output help in understanding the
distribution and the behavior of such a variable. For example, a histogram can
be constructed by plotting the intervals of data points versus each interval's frequency
 