Data pre-processing is necessary as it can have a significant
effect on neural network performance (Foody, McCulloch and
Yates, 1995; Foody and Arora, 1997; Mas and Flores, 2008).
There are two major tasks during this phase. First, it is necessary
to determine which image bands should be actually included as
the input data. If large image scenes or many image bands are
being considered, a data dimensionality reduction technique like
principal component analysis (e.g., Liu and Lathrop, 2002) should
be used to extract salient features prior to the actual classification.
This procedure can greatly reduce the computational burden.
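As an illustration of this step, principal component analysis can be applied to the band values before classification. The sketch below uses scikit-learn's PCA on random stand-in data; the scene size, band count, and number of retained components are illustrative assumptions, not values from the text:

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative multispectral scene: 100 x 100 pixels with 7 bands
# (Landsat-like); the values here are random stand-ins.
rng = np.random.default_rng(0)
pixels = rng.random((100 * 100, 7))  # one row per pixel, one column per band

# Reduce the 7 bands to 3 principal components prior to classification.
pca = PCA(n_components=3)
features = pca.fit_transform(pixels)

print(features.shape)  # (10000, 3)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```

The classifier then operates on three features per pixel instead of seven, reducing the computational burden while keeping most of the variance.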
The second task is to identify the training, test, and validation
datasets by using reference data in combination with some image
interpretation procedures. The training dataset is for network
training, the test dataset is used to assess the performance of the
network at the training stage for cross-validation purposes, and
the validation dataset is used to evaluate the performance of a
network against independent data. Both the test and validation
datasets should be much smaller in size when comparing to the
training set but each dataset should be representative of the same
population. If the available dataset is limited, the division of data
may be difficult, and some other methods, such as bootstrapping
(Kohavi, 1995) or the hold-out method (Masters, 1995), could
be attempted to maximize utilization of the available data.
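A minimal sketch of this three-way division, assuming scikit-learn's train_test_split and synthetic reference labels (the split fractions shown are illustrative, not prescribed by the text); stratifying keeps each subset representative of the same class population:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.random((1000, 3))            # e.g., per-pixel features
y = rng.integers(0, 4, size=1000)    # reference land-cover labels, 4 classes

# First split off the independent validation set (15%), then carve a
# test set (15% of the remainder) out of what is left; stratify so each
# subset reflects the same class proportions as the whole population.
X_rest, X_val, y_rest, y_val = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X_rest, y_rest, test_size=0.15, stratify=y_rest, random_state=0)

print(len(X_train), len(X_test), len(X_val))  # training set is the largest
```

The training set trains the network, the test set monitors it during training, and the validation set is held back entirely for the final accuracy assessment.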
Prior to training, it is important to define an appropriate neural
network architecture and training parameters. Begin with a
multilayer perceptron neural network and a back-propagation
learning algorithm as the benchmark for evaluating any other
network types and learning methods. Specify an appropriate number
of hidden layers and nodes unless a pruning algorithm or cascade
correlation is used; one hidden layer is a sensible starting point.
Choose either the logistic sigmoid or the hyperbolic tangent function
as the activation function. Also choose appropriate values
for learning parameters. As demonstrated in the two focused
studies, a number of trial-and-error experiments may be needed in
order to optimize the network architecture. The initial weights
should be randomly chosen. Use the training dataset for training,
and the test set for cross-validation, in order to determine
when to terminate the training process. To avoid overtraining, a
common strategy is to calculate the classification error on the test
dataset at each iteration and to stop training once that error
begins to rise. In our focused studies, the
conditions of stopping training were defined by the training goal
and other parameters, such as minimum gradient size. Once the
training is completed, save the weights and architecture of the
neural model, and classify the image to produce a land use/cover
map. The classification performance is assessed by using the
independent validation data through the error matrix method
described earlier.
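The workflow described here (one hidden layer, sigmoid activation, back-propagation, early stopping on a held-out set, and an error-matrix assessment) can be sketched with scikit-learn's MLPClassifier. The library choice, the network and learning-rate settings, and the synthetic stand-in data are assumptions of this sketch, not values from the text:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(2)
# Toy stand-in data: 3 input features, 3 land-cover classes that
# depend on the first feature only.
X = rng.random((600, 3))
y = (X[:, 0] * 3).astype(int)  # classes 0..2

X_train, y_train = X[:500], y[:500]
X_val, y_val = X[500:], y[500:]   # independent validation data

# Multilayer perceptron: one hidden layer, logistic sigmoid activation,
# back-propagation via stochastic gradient descent; early_stopping holds
# out part of the training data and stops once its error stops improving.
net = MLPClassifier(hidden_layer_sizes=(10,), activation='logistic',
                    solver='sgd', learning_rate_init=0.1,
                    early_stopping=True, n_iter_no_change=20,
                    max_iter=2000, random_state=0)
net.fit(X_train, y_train)

# Assess the trained network against the independent validation data.
pred = net.predict(X_val)
print(confusion_matrix(y_val, pred))   # the error matrix
print(accuracy_score(y_val, pred))
```

The fitted `net` object holds the learned weights and architecture, and applying `net.predict` to every pixel of a scene yields the land use/cover map.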
Future research directions
There are a few areas where further research is needed. Firstly,
many arbitrary decisions are involved in the construction of a
neural network model, and there is therefore a need to develop
guidance that helps identify the circumstances under which
particular architectures should be adopted and how to optimize the
parameters that control them. For this purpose, more empirical
inter-model comparisons and rigorous assessments of neural network
performance with different inputs, architectures, and internal
parameters are needed. Secondly, data pre-processing is an area
where more guidance is needed. Many theoretical assumptions have
not been confirmed by empirical trials, and it is not clear how
different pre-processing methods affect the classification outcome.
Future investigation is needed to explore the impact of data
quality and of different methods of data division, data
standardization, and data reduction on land classification. Lastly,
continuing research is needed to develop effective strategies and
probing tools for mining the knowledge contained in the connection
weights of trained neural network models in image classification.
This can help open up the 'black box' of the neural network, which
can improve the success of neural network applications in image
classification.