4.2.3 Over-Fitting
In statistics, over-fitting occurs when a statistical model systematically captures the random error (or noise) together with the underlying analytical relationship. Over-fitting generally occurs when the model complexity is excessive, that is, whenever the model has too many degrees of freedom (i.e., the size of the vector w) in relation to the amount of data available. An over-fitted model has poor predictive capability, as it exaggerates minor fluctuations in the data.
There are several methods to reduce the risk of over-fitting; in the MULTICUBE project we employed techniques such as model selection, to identify the simplest models (in terms of the size of w) that guarantee a reasonable error on the training set, and the early stopping criterion. The early stopping criterion consists of splitting the training data into two sets: a training set and a validation set. The samples in the training set are used to train the model, which initially decreases the error on both the training data and the validation data. The training algorithm stops as soon as the error on the validation set starts to increase.
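The following Python fragment is a minimal sketch of the early stopping criterion described above, not the actual MULTICUBE implementation: the data, the linear model, the learning rate, and the split sizes are all illustrative assumptions.

import numpy as np

# Illustrative data: a linear model with many degrees of freedom (40
# coefficients) and only 60 training samples, so over-fitting is possible.
rng = np.random.default_rng(0)
n_features = 40
X = rng.normal(size=(120, n_features))
true_w = np.zeros(n_features)
true_w[:5] = rng.normal(size=5)                  # only 5 informative coefficients
y = X @ true_w + 0.5 * rng.normal(size=120)      # responses with random noise

# Split the available training data into a training set and a validation set
X_tr, y_tr = X[:60], y[:60]
X_va, y_va = X[60:], y[60:]

w = np.zeros(n_features)     # model coefficients (the vector w)
lr = 0.01                    # learning rate (arbitrary choice)
best_w = w.copy()
best_va_err = np.inf

for epoch in range(10000):
    # Gradient step that decreases the error on the training data
    grad = 2.0 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
    w -= lr * grad

    # Monitor the error on the validation data
    va_err = np.mean((X_va @ w - y_va) ** 2)
    if va_err < best_va_err:
        best_w, best_va_err = w.copy(), va_err
    else:
        break                # stop as soon as the validation error increases

print("stopped at epoch", epoch, "with validation MSE", best_va_err)

Because the model has far more coefficients than the small training set can reliably constrain, the validation error eventually rises and the loop halts before the training error has been driven to its minimum.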
4.3 How to Manage the Design Space of Embedded Systems
Problems emerging in the design of embedded computing systems present some characteristic features, such as the fact that all configuration parameters are discrete and possibly categorical, that deserve further discussion.
4.3.1 Discrete and Categorical Variables
SoC design problems are characterized by the fact that all configuration parameters are discrete and possibly categorical (a short illustrative example follows this list):
• Discrete variables quantify data with a finite number of values. There is a clear order relation between the different values.
• Categorical (or nominal) variables classify data into categories. These are qualitative variables, in which the order of the different values (categories) is totally irrelevant.
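To make the distinction concrete, the fragment below lists two hypothetical SoC parameters, one discrete and one categorical; the names and values are invented for illustration only.

# Hypothetical SoC design-space parameters:
cache_size_kb    = [16, 32, 64, 128]                # discrete: a clear order (16 < 32 < 64 < 128)
branch_predictor = ["static", "bimodal", "gshare"]  # categorical: no meaningful order

# An order-based comparison is meaningful only for the discrete parameter:
print(min(cache_size_kb), max(cache_size_kb))       # 16 128
# For the categorical parameter, any ordering (e.g., alphabetical) is
# purely incidental and carries no design meaning.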
By contrast, traditional RSM techniques usually deal with continuous design spaces. For this reason, RSM algorithms simply ignore the discrete and/or categorical nature of the variables and treat them as continuous ones.
In general this is not a pressing problem for discrete variables, given the order relation existing between their values. Even though the trained RSM could predict responses "in the middle" between two admissible values, this unrequested generalization is never exploited in practice, given the discrete nature of the variables at the evaluation points.
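The sketch below illustrates this point under invented assumptions: a linear response surface is fitted as if the variables were continuous, but it is only ever queried at the admissible discrete configurations, so whatever it would predict "in the middle" is never used. The parameter lists and the coefficients are made up for the example.

import numpy as np
from itertools import product

# Hypothetical discrete design parameters (same spirit as the list above)
cache_kb    = [16, 32, 64, 128]
issue_width = [1, 2, 4]

# Suppose a linear RSM  y ~ w0 + w1*cache + w2*issue  has already been
# trained on simulated configurations; the coefficients below are invented.
w = np.array([5.0, -0.01, -0.8])

def rsm_predict(cache, issue):
    """Evaluate the continuous response surface at one configuration."""
    return w @ np.array([1.0, cache, issue])

# The design space is explored only over the discrete evaluation points,
# so the model's continuous generalization between them is never exercised.
predictions = {(c, i): rsm_predict(c, i) for c, i in product(cache_kb, issue_width)}
best = min(predictions, key=predictions.get)
print("best predicted configuration (cache KB, issue width):", best)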