B.2.3 Experimental Design and Analysis
The goal of this work is to understand the effects of λ, γ, and a third parameter with respect to
learning convergence and performance; this work is not aimed at optimizing (i.e.,
tuning) parameter settings. This study was based on a single experimental design
with a two-stage analysis. The first analysis is aimed at assessing network convergence
over a large parameter space. A full factorial experiment (D_1) is run with the
following continuous level settings for each parameter: λ over [0.1, 0.9] incremented
by 0.1, γ over [0.95, 0.99] incremented by 0.01, and the third parameter = {0.7, 0.8, 0.9}.
This experiment therefore consists of 135 factor-level combinations (9 × 5 × 3), with 10 replications at
each factor-level combination. The outcome for this experiment is a binary variable
indicating (empirical) convergence; recall that convergence requires that the network
converge during both training and testing. A logistic regression (LR) model is then
created to estimate the probability of convergence based on λ, γ, and the third parameter in D_1.
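As a check on the run counts, the D_1 design described above can be enumerated directly. This is only a sketch: the name `p3` is a placeholder for the third parameter (its symbol is not legible in this section), and the variable names are illustrative rather than taken from the original study.

```python
from itertools import product

# Factor levels for D_1 as stated in the text.
# "p3" is a placeholder name for the third parameter (assumption).
lam_levels = [round(0.1 * i, 1) for i in range(1, 10)]       # 0.1, 0.2, ..., 0.9
gam_levels = [round(0.95 + 0.01 * i, 2) for i in range(5)]   # 0.95, 0.96, ..., 0.99
p3_levels = [0.7, 0.8, 0.9]
N_REPS = 10  # replications per factor-level combination

# Full factorial design: every level of every factor crossed with all others.
d1 = list(product(lam_levels, gam_levels, p3_levels))

# One run per (combination, replicate) pair.
runs = [(lam, gam, p3, rep) for (lam, gam, p3) in d1 for rep in range(N_REPS)]

print(len(d1), len(runs))  # 135 1350
```

The 1350 total runs match the denominator reported in the results for D_1.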
The second analysis aims to determine the effects of λ, γ, and the third parameter on performance
over a smaller parameter space in which the network frequently converges. The
smaller parameter space D_2 is a subset of and is extracted from D_1 (D_2 ⊂ D_1),
where D_2 consists of the following level settings: λ = {0.6, 0.7, 0.8},
γ = {0.97, 0.98, 0.99}, and the third parameter = {0.7, 0.8, 0.9}. These factor levels
were chosen after assessing network convergence over D_1. This design is a 3 × 3 × 3 full factorial design,
with 10 replications at each of the 27 factor-level combinations. Analysis of variance
(ANOVA) with Type II sums of squares is used to determine if λ, γ, and the third parameter (and
their interactions) have significant effects on the convergence speed (i.e., episode at
which training converged) and on the mean testing performance. Non-convergent
runs are qualified as undefined responses, as opposed to missing data, and these runs
are removed from the data for the analysis, resulting in unbalanced groups and the
need for Type II sums of squares.
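The relationship between D_2 and D_1, and the way that dropping non-convergent runs leaves unequal group sizes, can be sketched as follows. The helper names (`levels`, `group_sizes`), the placeholder position for the third parameter, and the toy run records are all assumptions made for illustration; the toy records are invented to show the mechanics and are not results from the study.

```python
from itertools import product
from collections import Counter

def levels(start, step, n, ndigits):
    """Evenly spaced factor levels, rounded to avoid float drift."""
    return [round(start + step * i, ndigits) for i in range(n)]

# D_1 and D_2 as sets of (lambda, gamma, third-parameter) combinations.
d1 = set(product(levels(0.1, 0.1, 9, 1), levels(0.95, 0.01, 5, 2), [0.7, 0.8, 0.9]))
d2 = set(product([0.6, 0.7, 0.8], [0.97, 0.98, 0.99], [0.7, 0.8, 0.9]))

is_subset = d2 <= d1  # every D_2 combination was already run as part of D_1

def group_sizes(runs):
    """Count the convergent runs remaining in each factor-level combination."""
    return Counter(r["combo"] for r in runs if r["converged"])

# Toy records, invented purely for illustration (not data from the study).
toy_runs = [
    {"combo": (0.6, 0.97, 0.7), "converged": True},
    {"combo": (0.6, 0.97, 0.7), "converged": True},
    {"combo": (0.8, 0.99, 0.9), "converged": True},
    {"combo": (0.8, 0.99, 0.9), "converged": False},  # dropped: response undefined
]

sizes = group_sizes(toy_runs)
print(len(d2), is_subset, sorted(sizes.values()))
```

Once the group sizes differ, the factor effects are no longer orthogonal, which is why the choice of sums-of-squares type matters for the ANOVA.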
B.3 Results
Experimental design D_1 resulted in 77.85% (1051/1350) of the runs converging during
training, and 48.59% (656/1350) converging based on both training and testing
convergence criteria. The proportion of times that unique factor-level combinations
converged ranged from 0/10 to 10/10, confirming that some regions of the parameter
space are clearly better than others. Figure B.1 shows the empirical probabilities
of convergence over D_1. An LR model was created to estimate network convergence
using linear, quadratic, and interaction terms (Table B.1), and nearly all terms have
statistically significant coefficients. The LR model was used because it provides
a compact functional form for predicting convergence in this application, though
other function approximators, such as neural networks, could be used to model the
convergence probability.
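The functional form of such a model can be sketched as below. This is an illustration under stated assumptions, not the fitted model: it assumes only two-way interactions (the text does not say whether a three-way term was included), uses `p3` as a placeholder for the third parameter, and does not reproduce the coefficients of Table B.1.

```python
import math

def lr_terms(lam, gam, p3):
    """Intercept, linear, quadratic, and two-way interaction terms (assumed form)."""
    return [
        1.0,                          # intercept
        lam, gam, p3,                 # linear terms
        lam ** 2, gam ** 2, p3 ** 2,  # quadratic terms
        lam * gam, lam * p3, gam * p3,  # two-way interactions
    ]

def p_converge(beta, lam, gam, p3):
    """Logistic model: probability of convergence for one parameter setting."""
    eta = sum(b * x for b, x in zip(beta, lr_terms(lam, gam, p3)))
    # Numerically stable form of the logistic function 1 / (1 + exp(-eta)).
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

# Placeholder coefficients (all zero), so the predicted probability is 0.5.
beta_placeholder = [0.0] * 10
print(p_converge(beta_placeholder, 0.7, 0.98, 0.8))  # 0.5
```

With fitted coefficients in place of the placeholder vector, the same function would return the estimated convergence probability at any point in the parameter space.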