Geology Reference
In-Depth Information
capabilities through success stories published in a wide range of articles. Abbott and
Vojinovic [ 2 ] introduced a new role for hydroinformatics in its sociotechnical
environment, developing the concept introduced by Abbott [ 3 ]. However,
researchers working in hydroinformatics are still struggling to get full-scale acceptance
within the hydrological community, which is dominated by larger groups of
traditionalists who care less about data and more about physics. Some argue against
this section of hydrology, saying it adds no scientific knowledge or improved
understanding to the field of physical modeling of hydrology. However, many
studies have clearly shown the capabilities of hidden nodes of artificial neural networks
to communicate the real physics involved in the process [ 37 , 81 ]. The
capabilities of new concepts such as Genetic Programming are worth mentioning on
this occasion, having great potential to provide us with new hydrological knowledge
[ 24 ]. Some traditional hydrologists question the rules of thumb and assumptions
generally adopted during training and modeling, which is an obstacle to the wider
acceptance of this new stream. Although hydroinformatics and data-driven modeling
have been in use for more than two decades, they are still struggling to find full
acceptance within the hydrological community, which is dominated by large groups of
traditional hydrologists, because of inherent problems in these models (e.g., chances
of overfitting, redundancy of input, lack of modeling rigor, lack of transparency in
reproducing results, and uncertainty issues). Some studies [ 25 , 49 , 71 ] have suggested
better modeling frameworks and guidelines for data-based modeling. Some of the modeling
shortcomings and ambiguity in such data-based models are discussed below.
Elshorbagy et al. [ 24 ] argue that most data-based studies take a 'less-than-comprehensive
approach', focusing on (1) one or two data sets or application models [ 6 ] and
(2) a single random realization of the three subsets for modeling, which makes the
generalization ability of the resulting model questionable. Elshorbagy et al. [ 22 ], See and Openshaw
[ 64 ] and Abrahart et al. [ 6 ] have reminded the hydroinformatics research community
of the need to maintain scientific rigor in the application and use of data-driven
techniques in hydrology and environmental sciences. The fundamental means to
assess the capability of any novel approach or modeling technique is to evaluate it
against other modeling techniques or approaches under different modeling conditions
or data sets. Elshorbagy et al. [ 23 ] have noted that most comparative modeling studies in
the literature of data-based hydrological modeling are highly impaired by the
less-than-comprehensive approaches adopted. A single realization of the data set and a single
case study make it difficult to assess the actual capability of a novel concept such as
the Gamma Test. All new techniques should be evaluated against available basic
models (linear regression) and complex models (SVMs or wavelet SVMs).
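The evaluation practice described above can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the synthetic data, the split fractions, the function names (split_three, compare_over_realizations), and the use of a validation-selected polynomial as a stand-in for a complex model such as an SVM are all assumptions made for illustration. The sketch repeats the random split into three subsets many times and reports the mean and spread of the test error for a linear-regression baseline and for the more flexible model, rather than relying on a single realization.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def split_three(n, rng, frac_train=0.6, frac_val=0.2):
    """Shuffle indices 0..n-1 and cut them into training, validation, and test subsets.
    The 60/20/20 fractions are an illustrative assumption."""
    idx = rng.permutation(n)
    n_train = int(frac_train * n)
    n_val = int(frac_val * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def compare_over_realizations(x, y, n_realizations=30, degrees=(2, 3, 5, 7), seed=0):
    """Compare a linear baseline with a more flexible model over many random splits.
    Returns {model name: (mean test RMSE, std of test RMSE)}."""
    rng = np.random.default_rng(seed)
    scores = {"linear": [], "flexible": []}
    for _ in range(n_realizations):
        tr, va, te = split_three(len(x), rng)
        # Baseline: ordinary least-squares straight line fitted on the training subset.
        base = np.polyfit(x[tr], y[tr], 1)
        scores["linear"].append(rmse(y[te], np.polyval(base, x[te])))
        # Flexible model (hypothetical stand-in for an SVM or similar):
        # polynomial whose degree is chosen by validation error.
        fits = [np.polyfit(x[tr], y[tr], d) for d in degrees]
        best = min(fits, key=lambda c: rmse(y[va], np.polyval(c, x[va])))
        # The test subset is touched only once per realization, after model selection.
        scores["flexible"].append(rmse(y[te], np.polyval(best, x[te])))
    return {name: (float(np.mean(v)), float(np.std(v))) for name, v in scores.items()}

# Synthetic nonlinear data, used only so the sketch runs on its own.
data_rng = np.random.default_rng(42)
x = data_rng.uniform(-1.0, 1.0, 200)
y = np.sin(3.0 * x) + data_rng.normal(0.0, 0.1, 200)

summary = compare_over_realizations(x, y)
for name, (mean_rmse, std_rmse) in summary.items():
    print(f"{name}: test RMSE = {mean_rmse:.3f} +/- {std_rmse:.3f}")
```

Reporting the spread across realizations alongside the mean shows whether an apparent advantage of the new technique over the baseline is larger than the variation caused by the choice of split alone. The same loop would need to be repeated over several data sets or case studies to address the first limitation noted above.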
2.2 Why Overfitting and How to Avoid
Overfitting or overtraining is a statistical phenomenon associated with nonlinear
data-based models. It arises when a model is overly complex, with too many degrees of
freedom in relation to the amount of data available. The predictive models used in
 