Table 3.7 Data set description

Name   Use         Origin     Size   Description
TR1    Training    GS1        9,702  Summer training set
TE1    Testing     GS1        9,702  Summer testing set
VA1a   Validation  Real data  1,440  Summer validation set, non-working day
VA1b   Validation  Real data  1,440  Summer validation set, working day
TR2    Training    GS2        9,702  Winter training set
TE2    Testing     GS2        9,702  Winter testing set
VA2a   Validation  Real data  1,440  Winter validation set, non-working day
VA2b   Validation  Real data  1,440  Winter validation set, working day
day where the PMV index value is under the comfort band. For the second data set (VA2b), the PMV index value is inside it.
Note that two more data sets could have been chosen for validation: one for summer, where the real PMV is under the comfort band, and another for winter, where the real PMV is above the comfort band. However, these cases are unusual at the location of the CDdI-CIESOL-ARFRISOL building, Almería, and no real data have been found with which to build such sets. A summary of the different data sets used for training, testing and validating the models is shown in Table 3.7.
Once trained, the goodness of fit of any of the approximations (neural network or polynomial model, analysed in the next section) is reported using the Root Mean Square (RMS) error, computed over the samples of a particular set $S$ and denoted as $e_S^{\mathrm{RMS}}(P)$, thus:

$$e_S^{\mathrm{RMS}}(P) = \sqrt{\frac{1}{\mathrm{card}(S)} \sum_{i=1}^{\mathrm{card}(S)} \bigl( y(i) - \hat{y}(i, P) \bigr)^{2}} \qquad (3.10)$$

where $y(i)$ stands for the correct value of the PMV for element $i \in S$ and $\hat{y}(i, P)$ is
the approximation given by the model P for that particular element.
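As a concrete illustration, Eq. 3.10 maps directly onto a few lines of MATLAB. The sketch below is ours, not from the source: the function name rmsError and the variable names y and yhat are assumptions, with y holding the correct PMV values and yhat the model approximations over some set S.

```matlab
% rmsError.m -- RMS error of a model over a data set S, as in Eq. 3.10.
%   y    : vector of correct PMV values, one entry per element of S
%   yhat : vector of model approximations, yhat(i) for model P
function e = rmsError(y, yhat)
    e = sqrt(mean((y - yhat).^2));  % mean over card(S) squared errors, then root
end
```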
The input variables used in the training process are those listed in Table 3.6. Training is performed using a variable-step gradient descent process, namely the MATLAB implementation of the Levenberg-Marquardt algorithm (Moré 1978): the trainlm function is run for at most 30 iterations over the TR set, the number of iterations being limited to avoid overtraining. As is usual with neural approximations, the components of the input vector are normalised by subtracting the mean value and dividing by the standard deviation of each variable. Once trained, the goodness of fit of a particular ANN, $N_{NN}$, is reported using the RMS error of Eq. 3.10.
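A minimal MATLAB sketch of this procedure, under the setup just described (trainlm, at most 30 iterations, zero-mean unit-variance normalisation), might look as follows. The variable names (X, T, Xtest, Ttest) and the hidden layer size of 10 are placeholders for illustration, not values taken from the source.

```matlab
% Normalise each input variable: subtract its mean and divide by its
% standard deviation (mapstd works row-wise: one variable per row).
[Xn, psX] = mapstd(X);                 % X: training inputs from Table 3.6

% Feedforward ANN trained with the Levenberg-Marquardt algorithm
% (trainlm), capped at 30 iterations to avoid overtraining.
net = feedforwardnet(10, 'trainlm');   % 10 hidden nodes is a placeholder
net.trainParam.epochs = 30;
[net, tr] = train(net, Xn, T);         % T: target PMV values (e.g. from TR1)

% Report the goodness of fit on the testing set using Eq. 3.10.
XtestN = mapstd('apply', Xtest, psX);  % reuse the training-set statistics
eRMS   = rmsError(Ttest, net(XtestN));
```

Note that the testing inputs are normalised with the statistics of the training set, so that training and evaluation see the same transformation.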
An ANN with insufficient nodes may be unable to reproduce the variations of PMV in the data set. On the other hand, an ANN with more nodes than needed may lead to overtraining and degrade its generalisation capabilities.
In order to select the most adequate network size, several ANNs are trained using data from the TR set, with different random initial parameters and with different values of the number of hidden nodes, as in the sketch below.
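Such a sweep could be outlined as follows; the range of candidate sizes (2 to 20 nodes) and the five random restarts per size are assumed values for illustration, and the variables Xn, T, XtestN and Ttest are the placeholders introduced above.

```matlab
bestErr = Inf;
for nHidden = 2:2:20          % candidate hidden-layer sizes (assumed range)
    for trial = 1:5           % several random initialisations per size
        % Each freshly created network receives new random initial
        % parameters when train() configures it.
        net = feedforwardnet(nHidden, 'trainlm');
        net.trainParam.epochs = 30;
        net = train(net, Xn, T);              % training set (TR)
        e = rmsError(Ttest, net(XtestN));     % RMS error on the testing set (TE)
        if e < bestErr
            bestErr  = e;
            bestNet  = net;
            bestSize = nHidden;
        end
    end
end
```

Selecting on the testing-set error, rather than the training-set error, guards against the overtraining effect discussed above.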