hydrology is Olden and Poff [54]. They applied jittering to tackle redundancy in hydrologic indices of long-term flow records from 420 sites across the continental USA. Jittering is also related to regularization methods such as weight decay and ridge regression (a minimal jittering sketch appears after this list).
3. Early stopping: In this approach, the training process is stopped just before adaptation to the noise starts. The optimal stopping time can be found using test data; in other words, the modeler requires three subsets of data (training, test, and verification). At much later stages of the modeling process, the prediction accuracy of the model may start worsening on the test set. This is the stage at which training should cease, to overcome the overfitting problem (see the early-stopping sketch after this list).
4. Weight decay: Weight decay reduces the effect of noise associated with the inputs.
5. Bayesian learning: The conventional statistical training approaches are replaced by Bayesian statistics. This approach involves modification of general objective functions, such as the mean sum of squared network errors (MSE or E_m):

F = E_m = \mathrm{MSE}_m = \frac{1}{N} \sum_{i=1}^{N} (e_i)^2    (2.1)

F = \beta E_m + \alpha E_d    (2.2)

The MSE is modified in order to improve generalization capability and avoid overfitting: the objective in (2.1) becomes the new value F in (2.2) through the addition of a new E_d term. The parameters \beta and \alpha in (2.2) are to be optimized by a Bayesian framework (a sketch of this regularized objective appears after this list). It is usually assumed that the weights and biases of the network are random variables following Gaussian distributions, which requires enormous computation.
6. Use of small networks: If the network has fewer parameters than there are objects in the training set, it cannot be overtrained unless the case being tackled is too complex.
7. Pruning: A method for reducing the size of a network just after the training process. This approach helps to detect redundant neurons, which cause delay in modeling. In pruning-based network modeling, the training process starts with a large, densely connected network; the trained network's performance is then examined to assess the relative importance of the network weights. After that, the pruning algorithm removes the least important weight/node from the network and performs the analysis on the new, pruned network. This procedure continues until the modeler is satisfied with the results (see the pruning sketch after this list).
8. Data enrichment: An approach for artificially enlarging the training set with artificial data. This process is not found to be effective in all cases.
9. Regularization: In this method, we add a penalty term to the optimization criterion for networks with large weights, as these are related to strong nonlinearity.
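A minimal jittering sketch, assuming a generic NumPy training loop: the jitter function below adds small Gaussian noise to the inputs on every epoch. The train_step call, the noise level sigma, and the data shapes are hypothetical placeholders, not part of the original text.

```python
import numpy as np

def jitter(X, sigma=0.05, rng=None):
    """Return a noisy copy of the inputs: each epoch the network sees
    the data plus small Gaussian noise, which discourages it from
    memorizing exact input values (an effect comparable to weight
    decay or ridge regression)."""
    rng = rng or np.random.default_rng()
    return X + rng.normal(0.0, sigma, size=X.shape)

rng = np.random.default_rng(42)
X_train = rng.uniform(size=(200, 10))      # hypothetical input matrix
for epoch in range(100):
    X_noisy = jitter(X_train, sigma=0.05, rng=rng)
    # train_step(model, X_noisy, y_train)  # hypothetical training call
```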
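For early stopping (item 3), a sketch of the stopping rule, assuming user-supplied train_step and evaluate callables (both hypothetical): training halts once the test-set error has stopped improving for a fixed number of epochs.

```python
import numpy as np

def train_with_early_stopping(train_step, evaluate, max_epochs=500, patience=10):
    """Stop training just before the model starts adapting to noise:
    `train_step` runs one epoch on the training set, `evaluate` returns
    the current test-set error, and training halts once that error has
    not improved for `patience` consecutive epochs."""
    best_err, best_epoch, stale = np.inf, 0, 0
    for epoch in range(max_epochs):
        train_step()
        err = evaluate()
        if err < best_err:
            best_err, best_epoch, stale = err, epoch, 0
            # in practice, also snapshot the network weights here
        else:
            stale += 1
            if stale >= patience:
                break          # test-set error has started worsening
    return best_epoch, best_err
```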
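For the modified objective of Eqs. (2.1)-(2.2) (items 5 and 9), a sketch of F = \beta E_m + \alpha E_d, where E_d is taken as the sum of squared network weights, the usual weight-decay penalty. This choice of E_d, and the example values, are assumptions, since the section does not define the term explicitly.

```python
import numpy as np

def objective(errors, weights, alpha=0.01, beta=1.0):
    """Regularized objective F = beta*E_m + alpha*E_d  (Eq. 2.2).
    E_m is the mean sum of squared network errors (Eq. 2.1); E_d is
    taken here as the sum of squared weights, penalizing the large
    weights associated with strong nonlinearity (an assumption)."""
    E_m = np.mean(errors ** 2)   # Eq. (2.1)
    E_d = np.sum(weights ** 2)   # penalty term on large weights
    return beta * E_m + alpha * E_d

e = np.array([0.1, -0.3, 0.2])         # hypothetical residuals e_i
w = np.array([0.5, -1.2, 0.8, 0.05])   # hypothetical network weights
print(objective(e, w))                 # F for these example values
```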
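For pruning (item 7), a minimal magnitude-based sketch: it assumes that weight magnitude is a usable proxy for importance, and the evaluate call marking the stopping check is hypothetical.

```python
import numpy as np

def prune_smallest(weights, fraction=0.1):
    """One pruning step: zero out the `fraction` of weights with the
    smallest magnitudes, treating these as the least important
    connections in the trained network."""
    w = weights.copy()
    n_prune = int(fraction * w.size)
    idx = np.argsort(np.abs(w.ravel()))[:n_prune]  # least important first
    w.ravel()[idx] = 0.0
    return w

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))      # weights of a trained, densely connected net
for step in range(3):            # prune, re-assess, repeat
    w = prune_smallest(w, fraction=0.1)
    # err = evaluate(w)          # hypothetical performance check
    # if err > tolerance: break  # stop when results degrade
```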
 