Results and Discussion
The effect of such variable selection can clearly be seen in Figure 4. Within 200 epochs, the selected variable combination achieved a much smaller MSE than using all of the variables (0.067 vs. 0.115). This further demonstrates that not all of the potential input variables are equally informative: some may be correlated with one another (U/u*, B/H, β), noisy, or without any significant relationship (α) to the longitudinal dispersion coefficient.
With α = 0.04 and three neurons in the hidden layer, and after running the MLP several times, the minimum Root Mean Square Error (RMSE) for the training data is 34.85 and the coefficient of determination (R²) is 0.96, reached at Ne = 19887. This implies that the MLP model is satisfactorily trained. A plot of the measured and predicted longitudinal dispersion coefficients is given in Figure 5; it is evident that the data are evenly distributed around the y = x line.
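As a rough illustration of this training and evaluation step (the original work used MATLAB's Neural Network Toolbox, so the sketch below, written with scikit-learn and synthetic stand-in data, is only an assumption-laden approximation, with α taken here to be the learning rate):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data: columns are U, B and H; the target is the
# longitudinal dispersion coefficient (not the authors' data set).
rng = np.random.default_rng(0)
X = rng.uniform([0.1, 10.0, 0.5], [2.0, 200.0, 5.0], size=(50, 3))
y = 10.0 * X[:, 0] * X[:, 1] / X[:, 2] + rng.normal(0.0, 5.0, 50)

# Standardise the inputs (a common preprocessing step; the chapter does not
# state its scaling), then fit a 3-input / 3-hidden-neuron / 1-output MLP.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
mlp = MLPRegressor(hidden_layer_sizes=(3,),
                   learning_rate_init=0.04,   # alpha = 0.04, assumed to be the learning rate
                   solver="sgd", max_iter=20000, random_state=0)
mlp.fit(Xs, y)

pred = mlp.predict(Xs)
rmse = float(np.sqrt(np.mean((pred - y) ** 2)))
r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"training RMSE = {rmse:.2f}, R^2 = {r2:.2f}")
```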
Comparing the above results with those of Tayfur and Singh (2005) (Ne = 20000, R² = 0.90), GNMM performs better. Although MLPs are adopted in both applications, the difference lies in the fact that GNMM uses only a subset of the available variables rather than all of them, as in Tayfur and Singh (2005). For the test data, a comparison has also been made with several other models, as shown in Table 3, which likewise indicates that GNMM produces the best results and that the ANN models (GNMM and the MLP of Tayfur & Singh, 2005) generally perform better than the rest.
The final weights and biases of the MLP that minimizes the MSE are as follows: θ³ = −0.6031; θ²₁ = 1.4022, θ²₂ = −0.0143, θ²₃ = −4.1393; w³ = (−1.7705, 0.8517, −1.2564); w²₁ = (4.1222, 0.9600, −1.5078), w²₂ = (5.7385, −4.3290, 1.1943), w²₃ = (−0.7147, −6.7842, 0.3987), where the superscript denotes the layer and the subscript the hidden neuron. Substituting these values into equation (10) gives equation (11).
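Equation (10) itself is not shown above; assuming it is the standard three-layer MLP mapping with a sigmoid hidden-layer activation and a linear output neuron (an assumption), and writing K_x for the longitudinal dispersion coefficient and x = (U, B, H) for the (possibly normalised) inputs, the substituted expression would read roughly as:

$$
K_x = \theta^{3} + \sum_{j=1}^{3} w^{3}_{j}\, f\!\left(\theta^{2}_{j} + \sum_{i=1}^{3} w^{2}_{ji}\, x_i\right),
\qquad f(z) = \frac{1}{1 + e^{-z}},
$$

so that, for example, the first hidden neuron would compute f(1.4022 + 4.1222 U + 0.9600 B − 1.5078 H).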
GNMM is implemented in MATLAB (v7.2), mainly using the Genetic Algorithm and Direct Search Toolbox together with the Neural Network Toolbox. The GA is run five times to explore different combinations of input variables. The configuration of each GA run is shown in Table 2, along with the CPU speed and CPU time. It should be noted that Ne in Table 2 stands for the number of epochs per chromosome. Note also that in all cases pc = 0.8, pm = 0.01 and α = 0.01, as these values were found empirically to suit the problem.
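The original driver was built on MATLAB's GA toolbox; as a purely illustrative stand-in, the following Python sketch runs a bitstring GA five times with the crossover and mutation probabilities quoted above (population size, number of generations and the selection/crossover operators are assumptions, and evaluate_mse is the MLP-based fitness described in the next paragraph):

```python
import numpy as np

P_C, P_M = 0.8, 0.01                    # crossover and mutation probabilities (from the text)
N_VARS, POP_SIZE, N_GEN = 8, 30, 100    # eight candidate variables; the other values are guesses

def run_ga(evaluate_mse, rng):
    """One GA run over bitstring chromosomes (bit i = 1 means variable i is selected)."""
    pop = rng.integers(0, 2, size=(POP_SIZE, N_VARS))
    winners = []                                     # best chromosome of every generation
    for _ in range(N_GEN):
        fitness = np.array([evaluate_mse(c) for c in pop])
        winners.append(pop[fitness.argmin()].copy())
        # binary tournament selection of parents
        idx = rng.integers(0, POP_SIZE, size=(POP_SIZE, 2))
        parents = pop[np.where(fitness[idx[:, 0]] < fitness[idx[:, 1]], idx[:, 0], idx[:, 1])]
        # single-point crossover with probability P_C
        children = parents.copy()
        for i in range(0, POP_SIZE - 1, 2):
            if rng.random() < P_C:
                cut = rng.integers(1, N_VARS)
                children[i, cut:], children[i + 1, cut:] = (parents[i + 1, cut:].copy(),
                                                            parents[i, cut:].copy())
        # bit-flip mutation with probability P_M per gene
        flip = rng.random(size=children.shape) < P_M
        pop = np.where(flip, 1 - children, children)
    return np.array(winners)

# Five independent GA runs, as described in the text (a dummy fitness is used here).
rng = np.random.default_rng(0)
all_winners = [run_ga(lambda c: float(c.sum()), rng) for _ in range(5)]
```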
As mentioned previously, GAs serve as an input variable selection tool in GNMM. From this point of view, GNMM belongs to the broader concept of evolutionary ANNs (Yao, 1999): the GA evolves an optimal set of input variables and hence the structure of the ANN. A three-layer MLP is used to evaluate the fitness of each chromosome, with the numbers of input and output neurons equal to the numbers of input and target variables respectively (the target being the longitudinal dispersion coefficient). For simplicity, the number of neurons in the hidden layer is made the same as in the input layer, i.e. equal to the number of input variables. The fitness of a chromosome is assessed as the Mean Squared Error (MSE) obtained when this MLP is trained with the selected input variable combination and the corresponding longitudinal dispersion coefficients.
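A hedged sketch of such a fitness function (again with scikit-learn standing in for the MATLAB Neural Network Toolbox; X holds all candidate variables as columns, y the longitudinal dispersion coefficients, and the training settings, including the stand-in for Ne, are assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_fitness(X, y):
    """Build a GA fitness function: MSE of an MLP trained on the selected columns of X."""
    def evaluate_mse(chromosome):
        cols = np.flatnonzero(chromosome)          # indices of the selected variables
        if cols.size == 0:                         # an empty selection gets the worst fitness
            return np.inf
        # Hidden layer size equals the number of selected inputs, as described above;
        # learning rate 0.01 mirrors the alpha used during the GA runs, and max_iter
        # stands in for Ne (whose actual value is listed in Table 2).
        mlp = MLPRegressor(hidden_layer_sizes=(cols.size,), learning_rate_init=0.01,
                           solver="sgd", max_iter=500, random_state=0)
        mlp.fit(X[:, cols], y)
        pred = mlp.predict(X[:, cols])
        return float(np.mean((pred - y) ** 2))
    return evaluate_mse

# evaluate_mse = make_fitness(X, y) would then be passed to run_ga() in the earlier sketch.
```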
After running the GA five times, a clear distinction became evident between the variables. A variable's appearance percentage was defined as its accumulated appearance in the winning chromosome (the one minimizing the MSE) of each population within a GA run, divided by the total number of generations. The average appearance percentage of each variable over the five GA runs is illustrated in Figure 3. It can be seen that the most frequently appearing variables are U (99%), B (96%) and H (70%), followed by u* (28%) and U/u* (26%), whereas β, α and B/H all appear less than 2% of the time. Thus U, B and H are used in the final MLP training.
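Continuing the earlier hypothetical sketches, the appearance percentage could be computed from each run's per-generation winning chromosomes roughly as follows (all_winners is the output of the GA driver above; the 50% cut-off in the demo is an arbitrary illustration, not a threshold stated in the chapter):

```python
import numpy as np

def appearance_percentage(all_winners):
    """Average, over the GA runs, of the fraction of generations in which
    each variable appears in that generation's winning chromosome."""
    per_run = [winners.mean(axis=0) for winners in all_winners]   # one vector per run
    return 100.0 * np.mean(per_run, axis=0)

# Demo with random stand-in winners: five runs, 100 generations, 8 variables.
demo = [np.random.default_rng(r).integers(0, 2, size=(100, 8)) for r in range(5)]
pct = appearance_percentage(demo)
selected = np.flatnonzero(pct > 50.0)          # illustrative cut-off only
print(pct.round(1), selected)
```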