4.2 Regression analysis
Gent and Walsh carried out an empirical study of GSAT, an approximation algorithm for SAT, and applied regression analysis to model how the cost of obtaining a solution grows with problem size (Gent 1997).
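A minimal sketch of this kind of growth model, fitting a power-law form by least squares in log-log space; the cost measurements and the functional form are illustrative assumptions, not values from the original study:

```python
# Sketch: modelling solution cost as a function of problem size via regression.
# The data below are hypothetical placeholders.
import numpy as np

# Hypothetical observations: problem sizes n and mean search cost to solve
sizes = np.array([50, 100, 150, 200, 250, 300])
mean_cost = np.array([1.2e3, 5.1e3, 1.4e4, 3.0e4, 5.6e4, 9.5e4])

# Fit a power-law growth model cost = a * n^b by linear regression in log-log space
b, log_a = np.polyfit(np.log(sizes), np.log(mean_cost), deg=1)
a = np.exp(log_a)
print(f"fitted growth model: cost = {a:.3g} * n^{b:.2f}")

# Extrapolate the expected cost for a larger instance
print("predicted cost at n=500:", a * 500 ** b)
```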
In (Cruz 1999), PĂ©rez and Cruz present a statistical method for building algorithm performance models as polynomial functions that relate performance to problem size. The method first generates a representative sample of the algorithms' performance; the performance functions are then determined by regression analysis and finally incorporated into an algorithm selection mechanism. The polynomial functions are used to predict the algorithm that best satisfies the user's requirements.
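The selection idea can be illustrated as follows; the instance sizes, runtimes and polynomial degree are illustrative assumptions, not values from (Cruz 1999):

```python
# Sketch: fit one polynomial performance function per algorithm, then pick the
# algorithm with the best predicted performance for a new problem size.
import numpy as np

sizes = np.array([100, 200, 400, 800, 1600])
observed_time = {                       # hypothetical runtimes in seconds
    "algorithm_A": np.array([0.4, 1.1, 3.9, 14.8, 60.2]),
    "algorithm_B": np.array([1.0, 1.9, 4.1, 9.0, 20.5]),
}

# Polynomial performance functions obtained by regression (degree 2 here)
models = {name: np.polyfit(sizes, t, deg=2) for name, t in observed_time.items()}

def select_algorithm(new_size):
    # Predict runtime with each performance function and return the fastest
    predictions = {name: np.polyval(p, new_size) for name, p in models.items()}
    return min(predictions, key=predictions.get)

print(select_algorithm(3200))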
The performance of the local search algorithms Novelty and SAPS on instances of the SAT problem was analyzed in (Hutter 2006). The authors used linear regression with linear and quadratic basis functions to build prediction models. First, they built a prediction model from problem features and algorithm performance to predict the algorithm's run time. Second, they built another prediction model from problem features, algorithm parameter settings and algorithm performance. This model is used to automatically adjust the algorithm's parameters on a per-instance basis in order to optimize its performance.
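A rough sketch of such a model, using a quadratic basis-function expansion with ordinary linear regression on synthetic features, parameter settings and runtimes; the feature set and data are assumptions, not those of (Hutter 2006):

```python
# Sketch: runtime prediction with linear and quadratic basis functions, then
# per-instance parameter tuning by scanning candidate settings.
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_runs = 200

# Two instance features plus one algorithm parameter per run -- all synthetic
X = rng.uniform(size=(n_runs, 3))
log_runtime = 2.0 * X[:, 0] + 0.5 * X[:, 1] ** 2 - X[:, 2] + rng.normal(0, 0.1, n_runs)

# Linear regression on a linear + quadratic basis-function expansion of the inputs
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(X, log_runtime)

# Per-instance tuning: for fixed instance features, keep the candidate parameter
# value with the lowest predicted (log) runtime.
instance_features = np.array([0.3, 0.7])
candidates = np.linspace(0.0, 1.0, 11)
grid = np.column_stack([np.tile(instance_features, (len(candidates), 1)), candidates])
best = candidates[np.argmin(model.predict(grid))]
print("best predicted parameter setting:", best)
```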
4.3 Functions of probability distribution
Frost finds that the performance of algorithms for solving CSP instances can be approximated by two standard families of continuous probability distributions (Frost 1997). Solvable instances can be modeled by the Weibull distribution and unsolvable instances by the lognormal distribution. He uses four parameters to generate instances: the number of constraints, the number of prohibited value pairs per constraint, the probability that a constraint exists between any pair of variables (each constraint being statistically independent of the others), and the probability that a value in the domain of one variable of a constraint is incompatible with a value in the domain of the other variable of the constraint.
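As an illustration, the two distribution families can be fitted to cost samples with standard statistical tools; the samples below are synthetic placeholders, not Frost's data:

```python
# Sketch: fitting Weibull and lognormal distributions to search-cost samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
solvable_costs = rng.weibull(1.5, size=500) * 1000          # synthetic data
unsolvable_costs = rng.lognormal(mean=7.0, sigma=0.5, size=500)

# Weibull fit for solvable instances (location fixed at 0)
shape, loc, scale = stats.weibull_min.fit(solvable_costs, floc=0)
print("Weibull shape/scale:", shape, scale)

# Lognormal fit for unsolvable instances
s, loc, scale = stats.lognorm.fit(unsolvable_costs, floc=0)
print("lognormal sigma/median:", s, scale)
```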
Hoos and StĂĽtzle present work similar to Frost's. They find that the performance of algorithms that solve SAT instances can be characterized by an exponential distribution (Hoos 2000). The run-time distribution is determined by executing an algorithm k times on a set of instances of the same family, using a high cutoff time as the stopping criterion and recording, for each successful run, the execution time required to find a solution. The empirical run-time distribution is the cumulative distribution associated with these observations; it maps an execution time t (given by the user) to the probability of finding a solution within that time. A family is a set of instances that share the same values of the parameters considered critical for performance.
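A small sketch of constructing such an empirical run-time distribution from k runs, assuming synthetic run times and a fixed cutoff:

```python
# Sketch: empirical run-time distribution (RTD). Run the algorithm k times with
# a generous cutoff, keep the run times of successful runs, and read off the
# probability of finding a solution within a given time t.
import numpy as np

k = 100
cutoff = 60.0                                             # cutoff time in seconds
run_times = np.sort(np.random.default_rng(2).exponential(scale=5.0, size=k))
successful = run_times[run_times <= cutoff]               # unsuccessful runs are censored

def solve_probability(t):
    # Fraction of the k runs that found a solution within time t
    return np.searchsorted(successful, t, side="right") / k

print("estimated P(solved within 10 s):", solve_probability(10.0))
```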
An algorithm portfolio architecture was proposed in (Silverthorn 2010). This architecture employs three core components: a portfolio of algorithms; a generative model, which is fit to data on those algorithms' past performance and then used to predict their future performance; and a policy for action selection, which repeatedly chooses algorithms based on those predictions. Portfolio operation begins with offline training, in which a) training tasks are