A Quantitative Analysis into the Economics of Correcting Software Bugs - Computational Intelligence in Security for Information Systems


[Equation (3): an exponential expression in $e$ and $\alpha$; the full form is not recoverable from the source.]

With continuous development, an added function to model the ongoing addition of code is also required. Each instantaneous additional code segment (patch fix or feature) can be modeled in a similar manner.

What we do not have is the decay rate, which we need to be able to calculate. For software with a large user base that has been running for a sufficient period of time, this is simple.

This problem is the same as having a jar with an unknown but fixed number of red and white balls. If we have a sample of balls that have been drawn, we can estimate the ratio of red to white balls in the jar.
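The jar analogy can be sketched in a few lines of Python. This is only an illustration, not the method used in the study: the function name `estimate_proportion`, the normal-approximation interval, and the sample counts in the usage line are all assumptions introduced here.

```python
import math


def estimate_proportion(red_drawn, total_drawn, z=1.96):
    """Estimate the share of red balls in the jar from a sample.

    Returns (point estimate, lower bound, upper bound) using a
    normal-approximation interval. z=1.96 corresponds to roughly 95%
    coverage. All inputs are hypothetical sample counts.
    """
    if total_drawn <= 0 or not 0 <= red_drawn <= total_drawn:
        raise ValueError("need 0 <= red_drawn <= total_drawn and total_drawn > 0")
    p_hat = red_drawn / total_drawn
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / total_drawn)
    # Clamp the interval to the valid range for a proportion.
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical sample: 30 red balls among 100 drawn.
p_hat, lower, upper = estimate_proportion(30, 100)
```

The larger the sample, the narrower the interval, which is why the estimate is easier for software with a large user base and a long running time.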

Likewise, if we have two jars with approximately the same number of balls in approximately the same ratio, and we periodically add balls from the second jar to the first, we have a more mathematically complex and difficult problem, but one that has a solution.

This reflects the updating of existing software. In addition, with knowledge of the defect rates as bugs are patched (that is, the rate of errors for each patch), we can calculate the expected number of bugs over the software lifecycle. In each case, an iteration of patching added 34% ± 8% more bugs than the last iteration.

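[Equation (5): $N(t)$ written as a sum $\sum A(t)$ over the patch iterations, followed by an approximation ($\approx$) and an upper bound ($\le$); the full expression is not recoverable from the source.]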

In the study, this would come to

$\sum_{i} (0.34)^{i} \approx 1.514$ (6)

So over the life of the software, there are 1.51 times the original number of bugs that are introduced through patching.
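The figure of about 1.51 can be checked with a short geometric-series calculation. The sketch below assumes that the sum in (6) runs over successive patch iterations with a constant ratio of 0.34 between them; the function names and the choice of iteration count are illustrative and do not come from the study.

```python
def cumulative_multiplier(ratio, iterations):
    """Sum of ratio**i for i = 0..iterations (i = 0 is the original release)."""
    return sum(ratio ** i for i in range(iterations + 1))


def limiting_multiplier(ratio):
    """Closed-form limit 1 / (1 - ratio) of the geometric series, for 0 <= ratio < 1."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("the series only converges for 0 <= ratio < 1")
    return 1.0 / (1.0 - ratio)


# Ratio taken from the text; the +/- 0.08 band is treated here as simple bounds.
central = limiting_multiplier(0.34)   # about 1.515
low = limiting_multiplier(0.34 - 0.08)
high = limiting_multiplier(0.34 + 0.08)
after_six = cumulative_multiplier(0.34, 6)
```

Under these assumptions the series converges quickly: after a handful of iterations the partial sum is already within a fraction of a percent of the limit $1/(1 - 0.34)$.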

Where we have a new software product, we have prior information. We can calculate the defect rate per SLOC, the rate for other products from the team, the size of the software (in SLOC), etc. This information becomes the prior distribution. This is where Bayesian calculations [11] are used.

Notation: time; $\lambda_B$; (mean) number of bugs per TLOC (thousand lines of code); SLOC (source lines of code).

So, more generally, if a software release has $L$ lines of code and the expected number of lines of code per defect is $\lambda_B$, then the a priori distribution of defects in the release is a Poisson distribution $P_\beta$, where $\beta$ is the ratio of new lines of code to the average number of lines per bug ($\beta = L/\lambda_B$):

$P_\beta(n \text{ defects}) = \dfrac{\beta^{n} e^{-\beta}}{n!}$ (7)
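As a worked illustration of the Poisson prior, the sketch below evaluates the standard Poisson probability mass function with $\beta = L/\lambda_B$. The function name, the symbol `n` for the defect count, and the release size and lines-per-defect figures in the usage lines are hypothetical values chosen for the example, not data from the study.

```python
import math


def poisson_defect_pmf(n, lines_of_code, lines_per_defect):
    """Probability of exactly n defects under a Poisson prior.

    beta = lines_of_code / lines_per_defect is the expected number of
    defects in the release. The computation is done in log space so that
    large values of beta or n do not overflow.
    """
    if n < 0 or lines_of_code <= 0 or lines_per_defect <= 0:
        raise ValueError("n must be >= 0 and both line counts must be > 0")
    beta = lines_of_code / lines_per_defect
    log_p = n * math.log(beta) - beta - math.lgamma(n + 1)
    return math.exp(log_p)


# Hypothetical release: 2,000 new lines, one defect expected per 1,000 lines.
p_no_defects = poisson_defect_pmf(0, 2000, 1000)   # equals exp(-2)
p_two_defects = poisson_defect_pmf(2, 2000, 1000)
```

With these example figures $\beta = 2$, so the prior puts the most weight on one or two defects in the release.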
