Search Based Software Engineering: Techniques, Taxonomy, Tutorial - Empirical Software Engineering and Verification

Information Technology Reference

In-Depth Information

solution in the solution space. There are many other considerations when it

comes to representation, but for a first approach, this should be sucient to

contend with.

Perhaps for the first experiments, the most important thing is to get some

results. These can always be treated as a baseline, against which you measure

the progress of subsequent improvements, so 'getting it a bit wrong' with the

initial choices need not be a wasted effort; it may provide a way to assess the

improvements brought to the problem by subsequent development.

Step 2: Once there is a way to represent candidate solutions, the next step is to

chose a fitness function. There may be many candidate fitness functions. Choose

the simplest to understand and the easiest to implement to start off with. Once

again, this may provide a baseline against which to judge subsequent choices.

Ideally, the fitness function should also be one that your implementation will

be able to compute inexpensively, since there will be a need for many fitness

evaluations, regardless of the search technique adopted.

Step 3: In order to ensure that all is working, implement a simple random

search. That is, use a random number generator to construct an entirely arbitrary

set of candidate solutions by sampling over the representation at random. This

allows you to check that the fitness function is working correctly and that it

can be computed eciently. One can also examine the spread of results and

see whether any of the randomly found solutions is any good at solving the

problem. Despite its simplicity, random search is simple and fast and many

researchers have found that it is capable of finding useful solutions for some

Software Engineering applications (for instance, in testing, where it has been

very widely studied [55, 79]).

Step 4: Having conducted steps 1-3 we are in a position to conduct the first

application of a search based technique. It is best to start with hill climbing. This

algorithm is exceptionally easy to implement (as can be seen from Section 4). It

has the advantage that is is fast and conceptually easy to understand. It can also

be used to understand the structure of the search space. For instance, one can

collect a set of results from multiple restart hill climbing and examine the peaks

reached. If all peaks found are identical then hill climbing may have located a

large 'basin of attraction' in the search space. If all peaks are of very different

heights (that is, they have different fitness values) then the search space is likely

to be multimodal. If many hill climbs make no progress then the search space

contains many plateaux (which will be a problem for almost any search based

optimisation technique).

It is possible that simply using hill climbing, the results will be acceptable.

If this is the first time that SBSE has been applied to the Software Engineering

problem in hand, then the results may already be at the level at which publi-

cation would be possible. If the peaks are all acceptable but different then the

approach already solves the problem well and can give many diverse solutions.

This was found to be the case in several software engineering applications. For

instance, in software modularisation [68], the hill climbing approach produced

Empirical Software Engineering and Verification

Search WWH ::

Custom Search

Home