Chapter 3
Model Data Selection and Data Pre-processing Approaches
Abstract: Data-based modeling relies on historical data without directly taking
into account the underlying physical processes in hydrology. As a result, real-world
modeling of hydrological processes commonly requires a complex input structure and
very lengthy training data to represent the inherently complex dynamic systems. When
a large amount of input data is available and all of it is used for modeling, technical
issues such as increased computational complexity and insufficient memory
have been observed. These problems are much more likely in hydrological modeling,
as these models possess high nonlinearity and a large number of parameters.
Therefore, there is a definite need to identify
proper techniques which adequately reduce the number of inputs and the required
training data length in nonlinear models. Removing redundant inputs from all
available input pools and deciding upon the optimum data length to make a reliable
prediction are the main purposes of these approaches. This chapter
describes the abilities of novel techniques such as the Gamma Test (GT), entropy theory
(ET), Principal Component Analysis (PCA), cluster analysis (CA), Akaike's
Information Criterion (AIC), and the Bayesian Information Criterion (BIC) in model data
selection. The novelty of this work is that many of these approaches are used for the
first time in hydrological modeling scenarios such as solar radiation estimation,
rainfall-runoff modeling, and evapotranspiration modeling. Towards the end of this
chapter, conventional data selection procedures such as the Cross-Correlation
Approach (CCA), Cross-Validation Approach (CVA), and Data Splitting Approach
(DSA) are explained in detail. These traditional approaches were used to check the
validity of the newly applied methods in the later case study chapters.
3.1 Implementation of Gamma Test
The Gamma Test (GT) is a nonlinear modeling analysis technique which helps to
quantify the extent to which numerical input-output data can be expressed as a
reliable smooth model. The distinct advantage of GT is its ability to calculate
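
To make the idea concrete, the following is a minimal sketch of the Gamma Test in its commonly described near-neighbour form, not necessarily the implementation used in this chapter. It assumes data of the form y = f(x) + r with f smooth and r a zero-mean noise term, and it estimates the variance of r as the intercept of a regression line. The function name gamma_test, the parameter p (number of near neighbours), and the synthetic sine data are illustrative assumptions.

```python
import numpy as np


def gamma_test(X, y, p=10):
    """Return (Gamma, A): intercept and slope of the Gamma Test regression line.

    Gamma estimates the variance of the noise on y; A reflects how steeply
    the underlying smooth function varies. Hypothetical helper for illustration.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    M = X.shape[0]

    # Pairwise squared Euclidean distances in input space (O(M^2) memory,
    # so this brute-force version is only suitable for modest M).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour

    # Indices of the p nearest neighbours of every point, nearest first.
    nn = np.argsort(d2, axis=1)[:, :p]
    rows = np.arange(M)[:, None]

    # delta(k): mean squared input distance to the k-th nearest neighbour.
    delta = d2[rows, nn].mean(axis=0)
    # gamma(k): half the mean squared output difference to that neighbour.
    gamma = 0.5 * ((y[nn] - y[:, None]) ** 2).mean(axis=0)

    # Fit gamma = Gamma + A * delta; polyfit returns (slope, intercept).
    A, Gamma = np.polyfit(delta, gamma, 1)
    return Gamma, A


# Synthetic check (assumed data): smooth signal plus noise of variance 0.01.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 1000)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.1, x.size)
Gamma, A = gamma_test(x, y, p=10)
print(f"Gamma = {Gamma:.4f}, A = {A:.2f}")
```

A Gamma value close to zero suggests that the chosen inputs can explain the output through a smooth model, while a large value indicates that much of the output variance cannot be modeled from those inputs. Repeating the calculation for different input subsets or data lengths is one way such a statistic can support input and data-length selection.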