7.1 Introduction
(Jain, Mao and Mohiuddin, 1996; Paola and Schowengerdt,
1997; Kavzoglu and Mather, 2003; Mas and Flores, 2008). These
inconsistencies justify further research on the algorithmic issues
in order to promote the routine use of neural networks in image
classification.
In this chapter, we review and assess a set of algorithmic
parameters affecting the performance of neural networks in image
classification. The chapter comprises several major components.
First, we introduce and review some fundamental aspects of
neural networks, including network architectures and knowledge
representation. Second, we discuss two focused studies we
recently conducted with an urban-suburban area as the test site
to assess the sensitivity of neural networks with respect to
various internal parameter settings and the performance of several
training algorithms in image classification by neural networks.
Third, based on the literature review and our own focused studies,
we propose a framework to guide the use of neural networks
in image classification, considering data acquisition and
preprocessing, network model design, training process, and validation
in a sequential mode. Finally, we identify several areas for future
research in order to improve the success of neural networks in
image classification.
A neural network is a massively parallel distributed processor
comprised of simple processing units, attempting to simulate
the powerful capabilities for knowledge acquisition, synthesis,
and problem solving of the human brain (Haykin, 1999). It
originated from the concept of the artificial neuron introduced by
McCulloch and Pitts in 1943. Over the past several decades,
neural networks have evolved from the preliminary development
of the artificial neuron, through the rediscovery and popularization
of the back-propagation training algorithm, to the implementation
of neural networks using dedicated hardware (Dawson and
Wilby, 2001). Because of their distributed structure and adaptive
learning process, neural networks are capable of handling
nonlinear, complex phenomena; they can also effectively process
incomplete, noisy, and ambiguous data (Bishop, 1995). These
advantages make neural networks an attractive pattern classifier
(Duda, Hart and Stork, 2001).
The use of neural networks in remote sensing began in the late
1980s (Atkinson and Tatnall, 1997; Kanellopoulos and Wilkinson,
1997). Over the past two decades, numerous studies have
demonstrated that neural networks can produce comparable or
improved classification accuracies relative to those of
conventional classifiers (e.g., Benediktsson, Swain and Ersoy,
1990; Bischof, Schneider and Pinz, 1992; Civco, 1993; Paola and
Schowengerdt, 1995a; Gopal and Woodcock, 1996; Serpico,
Bruzzone and Roli, 1996; Mannan, Roy and Ray, 1998; Ji,
2000; Seto and Liu, 2003; Del Frate et al., 2007; Petropoulos et al.,
2010). Nevertheless, the performance of neural networks is
contingent upon a wide range of algorithmic and non-algorithmic
parameters, such as input data dimensionality, training data,
network structure, and learning process (Paola and Schowengerdt,
1995b; Foody and Arora, 1997; Kavzoglu and Mather, 2003;
Mas and Flores, 2008). With the incorporation of neural
networks as a standard classifier in some popular image processing
software packages (see Mas and Flores, 2008), handling these
diverse parameters presents a challenge to beginners and even
some experienced users, as an inappropriate treatment can lead
to suboptimal or unacceptable classification performance (Zhou
and Yang, 2010).
Investigating the sensitivity of neural networks with respect to
various parameter settings has been the subject of an increasing
number of studies, since this knowledge is critical to the design of
efficient neural network models for improved performance (e.g.,
Korczak and Hammadimesmoudi, 1994; Yoshida and Omatu,
1994; Jarvis and Stuart, 1996; Kanellopoulos and Wilkinson,
1997; Ozkan and Erbek, 2002; Stathakis, 2009). These studies
have used a trial-and-error approach (e.g., Paola and
Schowengerdt, 1997; Kavzoglu and Mather, 2003) or more
advanced methods such as genetic algorithms and pruning
algorithms (e.g., Kavzoglu and Mather, 2003; Benediktsson and
Sveinsson, 2003). As a result, some practical guidelines have been
proposed to deal with various non-algorithmic issues, including
input data dimensionality and training sample quantity and
quality (e.g., Zhuang et al., 1994; Foody, McCulloch, and Yates,
1995; Kanellopoulos and Wilkinson, 1997; Mas and Flores, 2008).
Nevertheless, there is no consistent guidance to help configure
neural networks. Specific to multi-layer-perceptron (MLP)
neural networks, for example, there is no consensus on the
number of hidden layers, type of activation functions, or training
parameters that should be used to achieve optimal performance.
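The trial-and-error sensitivity studies cited above amount to sweeping over candidate settings and keeping the best-scoring configuration. The sketch below illustrates that idea; the grid values and the scoring function are hypothetical stand-ins (a real study would train the network and score it by validation accuracy or kappa), not recommendations from this chapter:

```python
from itertools import product

# Hypothetical search space -- the values are illustrative only.
hidden_layers = [1, 2]
hidden_neurons = [8, 16, 32]
learning_rates = [0.01, 0.1]

def evaluate(config):
    """Stand-in for training an MLP with this configuration and
    returning its validation accuracy (dummy formula for illustration)."""
    layers, neurons, lr = config
    return 1.0 / (1.0 + abs(neurons - 16) / 16 + abs(lr - 0.05))

# Exhaustive trial-and-error sweep over every combination; genetic or
# pruning algorithms would explore the same space more efficiently.
best = max(product(hidden_layers, hidden_neurons, learning_rates),
           key=evaluate)
print(best)
```

With the dummy score above, the sweep simply returns the configuration closest to 16 hidden neurons with the smaller learning rate.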
7.2 Fundamentals of
neural networks
This section discusses neural network architectures, with
emphasis on MLP networks, along with network training
methods.
7.2.1 Neural network types
There are two fundamentally different types of neural network
architectures: feed-forward networks and recurrent networks.
The former includes single-layer networks comprising an input
layer that projects onto an output layer as well as multilayer
networks having at least one hidden layer that allows the networks
to extract high-order statistics. A recurrent network distinguishes
itself from feed-forward networks by having at least one feedback
loop whose presence can greatly affect the training capability and
performance (Haykin, 1999).
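The strictly forward flow of a feed-forward network can be sketched in a few lines; the dimensions below (four input bands, three hidden neurons, two output classes) and the function names are hypothetical, and NumPy is assumed:

```python
import numpy as np

def sigmoid(x):
    """Logistic activation function applied at each neuron."""
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, w_hidden, w_out):
    """One forward pass through a single-hidden-layer network.

    Signals flow strictly input -> hidden -> output with no feedback
    loops (biases omitted for brevity); the hidden layer is what lets
    the network extract higher-order statistics from the input.
    """
    hidden = sigmoid(w_hidden @ x)
    return sigmoid(w_out @ hidden)

# Illustrative example: 4 input bands, 3 hidden neurons, 2 classes.
rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(3, 4))
w_out = rng.normal(size=(2, 3))
y = feed_forward(np.array([0.2, 0.5, 0.1, 0.9]), w_hidden, w_out)
print(y.shape)  # prints "(2,)"
```

A recurrent network would differ only in that some outputs feed back into earlier layers, so a single pass like this no longer fully describes its behavior.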
Considering neural network structures and training
paradigms, we can find a large number of different types of
neural networks, and some of the most commonly used ones are
listed in Table 7.1. Each type has advantages and disadvantages
depending upon specific applications. Detailed discussions about
these neural network types are given elsewhere (e.g., Bishop,
1995; Jain, Mao and Mohiuddin, 1996; Rojas, 1996; Haykin,
1999; Principe, Euliano and Lefebvre, 2000). Here, we focus
on MLP feed-forward networks because of their robustness
and overwhelming popularity.
The MLP neural networks are relatively easy to understand
and implement. As the workhorse of neural networks, they have
been increasingly used in remote sensing (cf. Mas and Flores,
2008). They comprise distributed neurons and weighted links
(Fig 7.1). Arranged in an input-hidden-output layered structure,
eachneuron contains a simple processing function (i.e., activation