AR3, AR4, and AR5 come from a Turkish white-goods manufacturer developing
embedded controller software. AR3 has 63 modules, 13% of which are faulty;
AR4 is a larger data set with 107 modules and 19% faulty modules. The AR5 data
set has only 36 modules, and 22% of them are faulty.
5.2 Performance Evaluation Metrics
After each module's label is assigned in the testing phase, every evaluation
metric is calculated from the confusion matrix. If a module is predicted as
non-faulty but its actual label is faulty, the result is a false negative (FN).
Conversely, if a non-faulty module is labeled as faulty, it is a false positive
(FP). If a faulty module is predicted as faulty, it is a true positive (TP),
and if a non-faulty module is predicted as non-faulty, it is a true negative
(TN). FNR (false negative rate) is the percentage of modules that were actually
faulty but were predicted as non-faulty. In contrast, FPR (false positive rate)
is the percentage of modules that were actually non-faulty but were predicted
as faulty. FNR, FPR, and the overall error rate should be as small as possible
in all experiments. The following equations are used to calculate FNR, FPR, and
the overall error rate.
FNR = FN / (FN + TP)    (4)

FPR = FP / (FP + TN)    (5)

Error = (FN + FP) / (TP + TN + FP + FN)    (6)
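The three metrics above can be sketched in a few lines of code. The following is an illustrative example, not the authors' implementation; the label encoding (1 = faulty, 0 = non-faulty) and the sample label lists are assumptions for the demonstration.

```python
def fault_prediction_metrics(actual, predicted):
    """Return (FNR, FPR, overall error rate) as fractions, per Eqs. (4)-(6)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

    fnr = fn / (fn + tp) if (fn + tp) else 0.0   # Eq. (4): faulty modules missed
    fpr = fp / (fp + tn) if (fp + tn) else 0.0   # Eq. (5): false alarms
    error = (fn + fp) / len(actual)              # Eq. (6): overall error rate
    return fnr, fpr, error

# Hypothetical labels for 8 modules (1 = faulty, 0 = non-faulty)
actual    = [1, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1, 0, 0]
print(fault_prediction_metrics(actual, predicted))
```

A lower FNR matters most in this setting, since a missed faulty module escapes into testing-phase releases, while a higher FPR only wastes inspection effort.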
5.3 Hypothesis Formulation
Each hypothesis listed below corresponds to one of our research questions:
HP1: Our proposed method performs better than naïve Bayes and random
forest in identifying fault-prone and non-fault-prone modules. (Null
hypothesis: our approach performs only as well as naïve Bayes and
random forest in identifying fault-prone and non-fault-prone modules.)
HP2: Our proposed approach performs better than naïve Bayes and
random forest when two-stage outlier removal is applied. (Null hypothesis:
our approach does not outperform naïve Bayes and random forest when
two-stage outlier removal is applied.)
HP3: Our proposed model performs well when applied to two different
pairs of training and testing datasets. (Null hypothesis: our proposed
model does not perform well when applied to two different pairs of
training and testing datasets.)