Fig. 1. Two examples of how the confidence interval develops for a growing number of samples
(normal distribution with mean 20 and standard deviation 2)
be more difficult, as the real underlying distribution is not known, and if two simulation
model variants are compared, it is not clear from the beginning whether the distributions
of their results differ. In the next subsection, we present an analysis of statistical
properties before we introduce our approaches to significance estimation, namely
convergence classification and replication prediction.
4.1 Analysis of Statistical Properties
If we look at samples drawn successively from a distribution, we can observe an
interesting development of the estimated values. Figure 1 shows two such developments
for the same distribution (normal distribution with mean 20 and standard deviation 2).
The solid blue line shows the mean estimated from a given number of sample values. The
dashed light blue line shows the confidence interval. It is known that we need four
times as many samples to halve the width of the confidence interval, because the width
shrinks in proportion to the inverse square root of the number of samples (e.g., [11]).
In one case, the sample mean stays below the actual expected value of the distribution
(left part of Figure 1), while in the other case the line comes close to the actual
expected value rather quickly (right part of Figure 1).
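The relation between sample count and interval width can be illustrated with a short Python sketch. It is not taken from the original text. It assumes a 95% interval based on the normal approximation (a z quantile rather than a t quantile), and the function name mean_confidence_interval and the random seed are illustrative choices.

```python
import math
import random
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a confidence interval for the mean.

    Assumption: normal approximation, i.e. a z quantile is used instead
    of a t quantile, which is reasonable for larger sample sizes.
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * std_err


# Draw from the distribution used in Figure 1 (mean 20, stdev 2) and
# inspect the interval after 250, 1000, and 4000 samples.
rng = random.Random(42)  # fixed seed, chosen arbitrarily for repeatability
samples = [rng.gauss(20, 2) for _ in range(4000)]
for n in (250, 1000, 4000):
    mean, half_width = mean_confidence_interval(samples[:n])
    print(f"n={n:5d}  mean={mean:.3f}  half-width={half_width:.3f}")
```

Each step quadruples the number of samples, so the printed half-width should roughly halve from one line to the next, up to the noise in the estimated standard deviation.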
Figure 2 shows how the p values of t-tests develop for varying sample sizes. Each graph
contains two curves: one where the compared samples are actually drawn from different
distributions (blue line; mean 20, stdev 2 vs. mean 21, stdev 3) and another where both
compared samples are drawn from the same distribution (dashed red line; mean 20,
stdev 2). The two distributions were deliberately selected to overlap substantially, so
that we can examine samples where the difference is not obvious after drawing only a few
values. Interestingly, for these distributions the graphs can in some cases hardly be
distinguished (e.g., in the right part of Figure 2) when fewer than 100 samples are
drawn from each distribution.
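A comparison of this kind can be reproduced with the following Python sketch, which is not taken from the original text. The text does not state which t-test variant was used, so the sketch assumes Welch's unequal-variance t-test, because the two standard deviations differ. The names t_pdf and welch_t_test, the Simpson integration of the t density, and the seed are illustrative choices.

```python
import math
import random
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def welch_t_test(a, b, steps=2000):
    """Return (t, p) for a two-sided Welch t-test (assumed variant).

    The p value is obtained by integrating the t density over [0, |t|]
    with Simpson's rule; steps must be even. Samples with zero variance
    are not handled.
    """
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    p = min(1.0, max(0.0, 1.0 - 2.0 * central))
    return t, p


rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
base = [rng.gauss(20, 2) for _ in range(400)]
same = [rng.gauss(20, 2) for _ in range(400)]       # same distribution
different = [rng.gauss(21, 3) for _ in range(400)]  # mean 21, stdev 3
for n in (10, 25, 50, 100, 200, 400):
    p_same = welch_t_test(base[:n], same[:n])[1]
    p_diff = welch_t_test(base[:n], different[:n])[1]
    print(f"n={n:3d}  p(same)={p_same:.4f}  p(different)={p_diff:.4f}")
```

The printed values depend on the seed. For small n, the two p values need not be clearly separated, which is the effect described above.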
In addition to the graphs comparing two single samples, the average p values of
100 runs are plotted in Figure 3. As can be seen, the p values of identical distributions
 