Fig. 4. Replication prediction using nonlinear regression
5 Evaluation
The evaluation consists of three parts. In the first part, the significance classification is
applied to distributions with fixed mean and standard deviation. The second part applies
the significance classification to randomly generated distributions. In the third part, we
apply the replication prediction to fixed distributions.
5.1 Significance Classification for Fixed Distributions
In the first experiment series, we apply the significance classification approach to
samples drawn from different distributions with fixed mean and standard deviation.
Altogether, we set up and evaluate three different distribution pairs. We investigate
the classifier accuracy for varying numbers of p values (5, 10, ..., 95) taken into
account for training and classification. For each distribution pair, ten independent
runs are performed in which 500 training and 500 testing examples (50% same,
50% different distributions) are generated.
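The data generation described above can be sketched as follows. This is a minimal illustration, not the setup used in the experiments: it assumes normally distributed samples, assumes that each example is a sequence of p values obtained by re-running a two-sample test as further samples are added, and substitutes a Welch-type statistic with a normal approximation because the test actually used is not named here. The function names (welch_p_value, p_value_sequence, make_examples) are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def welch_p_value(xs, ys):
    """Two-sided p value for a difference in means (ASSUMPTION: Welch-type
    statistic with a normal approximation; the source does not name the test)."""
    se = (variance(xs) / len(xs) + variance(ys) / len(ys)) ** 0.5
    diff = mean(xs) - mean(ys)
    if se == 0.0:
        # Degenerate case: no spread in either sample.
        return 1.0 if diff == 0.0 else 0.0
    z = diff / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def p_value_sequence(rng, dist_a, dist_b, n_p_values, start=2):
    """Return n_p_values p values, one after each additional pair of samples.

    dist_a and dist_b are (mean, standard deviation) tuples.
    ASSUMPTION: samples are normally distributed and added one pair at a time.
    """
    xs = [rng.gauss(*dist_a) for _ in range(start)]
    ys = [rng.gauss(*dist_b) for _ in range(start)]
    sequence = []
    for _ in range(n_p_values):
        xs.append(rng.gauss(*dist_a))
        ys.append(rng.gauss(*dist_b))
        sequence.append(welch_p_value(xs, ys))
    return sequence


def make_examples(rng, dist_a, dist_b, n_examples, n_p_values):
    """Build labeled examples: half 'same' (both samples from dist_a),
    half 'diff' (one sample from dist_a, the other from dist_b)."""
    examples = []
    for i in range(n_examples):
        if i % 2 == 0:
            seq = p_value_sequence(rng, dist_a, dist_a, n_p_values)
            examples.append((seq, "same"))
        else:
            seq = p_value_sequence(rng, dist_a, dist_b, n_p_values)
            examples.append((seq, "diff"))
    return examples


# One run for a distribution pair with means 20 and 22 and standard deviation 2,
# using 500 training and 500 testing examples with 5 p values each.
rng = random.Random(0)
training = make_examples(rng, (20.0, 2.0), (22.0, 2.0), 500, 5)
testing = make_examples(rng, (20.0, 2.0), (22.0, 2.0), 500, 5)
```

Repeating this for 5, 10, ..., 95 p values and for ten independently seeded runs would yield the grid of settings described above.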
Table 1 summarizes the results. It reports the average accuracy of the approach as
well as the accuracy obtained by simply comparing the last p value with the α
threshold, i.e., if p_last < α, the example is classified as diff, otherwise as same.
Additionally, for each number of p values we perform a statistical significance test
comparing the accuracies of the classifier vs. the α-threshold approach (ten accuracy
values each) and capture the corresponding p values of the test. Significant results
are emphasized in bold. The accuracies for the second distribution pair (μ₁ = 20,
sd₁ = 2 vs. μ₂ = 22, sd₂ = 2) are shown in Figure 5.
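The α-threshold baseline can be illustrated with a short sketch. The value α = 0.05, the strict inequality, and the four hand-written p value sequences are assumptions made for illustration only; they are not data from the experiments, and the function names are hypothetical.

```python
def threshold_classify(p_values, alpha=0.05):
    """Baseline: look only at the last p value of the sequence.

    ASSUMPTION: alpha = 0.05 and a strict comparison (p_last < alpha).
    """
    return "diff" if p_values[-1] < alpha else "same"


def accuracy(examples, classify):
    """Fraction of (p_values, label) examples that classify() labels correctly."""
    correct = sum(1 for p_values, label in examples if classify(p_values) == label)
    return correct / len(examples)


# Hypothetical toy examples, not results from the evaluation.
toy_examples = [
    ([0.40, 0.20, 0.01], "diff"),  # last p value below alpha: predicted diff
    ([0.60, 0.30, 0.20], "same"),  # last p value above alpha: predicted same
    ([0.50, 0.10, 0.08], "diff"),  # last p value above alpha: predicted same (wrong)
    ([0.30, 0.20, 0.60], "same"),  # last p value above alpha: predicted same
]

print(accuracy(toy_examples, threshold_classify))  # 0.75
```

Any classifier that maps a p value sequence to same or diff can be passed to accuracy in place of threshold_classify, so both approaches can be scored on identical test examples.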