7.1 Introduction
(Jain, Mao and Mohiuddin, 1996; Paola and Schowengerdt,
1997; Kavzoglu and Mather, 2003; Mas and Flores, 2008). These
inconsistencies justify further research on the algorithmic issues
in order to promote the routine use of neural networks in image
classification.
In this chapter, we review and assess a set of algorithmic
parameters affecting the performance of neural networks in image
classification. The chapter comprises several major components.
First, we introduce and review some fundamental aspects of
neural networks, including network architectures and knowledge
representation. Second, we discuss two focused studies we
recently conducted with an urban-suburban area as the test site
to assess the sensitivity of neural networks with respect to
various internal parameter settings and the performance of several
training algorithms in image classification by neural networks.
Third, based on the literature review and our own focused studies,
we propose a framework to guide the use of neural networks
in image classification, considering data acquisition and
preprocessing, network model design, training process, and validation
in a sequential mode. Finally, we identify several areas for future
research in order to improve the success of neural networks in
image classification.
A neural network is a massively parallel distributed processor
comprised of simple processing units, attempting to simulate
the powerful capabilities for knowledge acquisition, synthesis,
and problem solving of the human brain (Haykin, 1999). It
originated from the concept of the artificial neuron introduced by
McCulloch and Pitts in 1943. Over the past several decades,
neural networks have evolved from the preliminary development
of the artificial neuron, through the rediscovery and popularization
of the back-propagation training algorithm, to the implementation
of neural networks using dedicated hardware (Dawson and
Wilby, 2001). Because of their distributed structure and adaptive
learning process, neural networks are capable of handling
nonlinear, complex phenomena; they can also effectively process
incomplete, noisy, and ambiguous data (Bishop, 1995). These
advantages make neural networks an attractive pattern classifier
(Duda, Hart and Stork, 2001).
The use of neural networks in remote sensing began in the late
1980s (Atkinson and Tatnall, 1997; Kanellopoulos and Wilkinson,
1997). Over the past two decades, numerous studies have
demonstrated that neural networks can produce comparable or
improved classification accuracies relative to those of
conventional classifiers (e.g., Benediktsson, Swain and Ersoy,
1990; Bischof, Schneider and Pinz, 1992; Civco, 1993; Paola and
Schowengerdt, 1995a; Gopal and Woodcock, 1996; Serpico,
Bruzzone and Roli, 1996; Mannan, Roy and Ray, 1998; Ji,
2000; Seto and Liu, 2003; Del Frate et al., 2007; Petropoulos et al.,
2010). Nevertheless, the performance of neural networks is
contingent upon a wide range of algorithmic and non-algorithmic
parameters, such as input data dimensionality, training data,
network structure, and learning process (Paola and Schowengerdt,
1995b; Foody and Arora, 1997; Kavzoglu and Mather, 2003;
Mas and Flores, 2008). With the incorporation of neural
networks as a standard classifier in some popular image processing
software packages (see Mas and Flores, 2008), handling these
diverse parameters presents a challenge to beginners and even
some experienced users, as an inappropriate treatment can lead
to suboptimal or unacceptable classification performance (Zhou
and Yang, 2010).
Investigating the sensitivity of neural networks with respect to
various parameter settings has been the subject of an increasing
number of studies, since this knowledge is critical to the design of
efficient neural network models for improved performance (e.g.,
Korczak and Hammadimesmoudi, 1994; Yoshida and Omatu,
1994; Jarvis and Stuart, 1996; Kanellopoulos and Wilkinson,
1997; Ozkan and Erbek, 2002; Stathakis, 2009). These studies
have used a trial-and-error approach (e.g., Paola and
Schowengerdt, 1997; Kavzoglu and Mather, 2003) or more
advanced methods such as genetic algorithms and pruning
algorithms (e.g., Kavzoglu and Mather, 2003; Benediktsson and
Sveinsson, 2003). As a result, some practical guidelines have been
proposed to deal with various non-algorithmic issues, including
input data dimensionality and training sample quantity and
quality (e.g., Zhuang et al., 1994; Foody, McCulloch, and Yates,
1995; Kanellopoulos and Wilkinson, 1997; Mas and Flores, 2008).
Nevertheless, there is no consistent guidance to help configure
neural networks. Specific to multi-layer-perceptron (MLP)
neural networks, for example, there is no consensus on the
number of hidden layers, type of activation functions, or training
parameters that should be used to achieve optimal performance.
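The trial-and-error sensitivity studies cited above amount to sweeping over candidate settings and keeping the best-scoring configuration. The sketch below illustrates that idea; the grid values and the scoring function are hypothetical stand-ins (a real study would train the network and score it by validation accuracy or kappa), not recommendations from this chapter:

```python
from itertools import product

# Hypothetical search space -- the values are illustrative only.
hidden_layers = [1, 2]
hidden_neurons = [8, 16, 32]
learning_rates = [0.01, 0.1]

def evaluate(config):
    """Stand-in for training an MLP with this configuration and
    returning its validation accuracy (dummy formula for illustration)."""
    layers, neurons, lr = config
    return 1.0 / (1.0 + abs(neurons - 16) / 16 + abs(lr - 0.05))

# Exhaustive trial-and-error sweep over every combination; genetic or
# pruning algorithms would explore the same space more efficiently.
best = max(product(hidden_layers, hidden_neurons, learning_rates),
           key=evaluate)
print(best)
```

With the dummy score above, the sweep simply returns the configuration closest to 16 hidden neurons with the smaller learning rate.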
7.2 Fundamentals of
neural networks
This section discusses neural network architectures, with
emphasis on MLP networks, along with network training
methods.
7.2.1 Neural network types
There are two fundamentally different types of neural network
architectures: feed-forward networks and recurrent networks.
The former includes single-layer networks comprising an input
layer that projects onto an output layer as well as multilayer
networks having at least one hidden layer that allows the networks
to extract high-order statistics. A recurrent network distinguishes
itself from feed-forward networks by having at least one feedback
loop whose presence can greatly affect the training capability and
performance (Haykin, 1999).
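The strictly forward flow of a feed-forward network can be sketched in a few lines; the dimensions below (four input bands, three hidden neurons, two output classes) and the function names are hypothetical, and NumPy is assumed:

```python
import numpy as np

def sigmoid(x):
    """Logistic activation function applied at each neuron."""
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, w_hidden, w_out):
    """One forward pass through a single-hidden-layer network.

    Signals flow strictly input -> hidden -> output with no feedback
    loops (biases omitted for brevity); the hidden layer is what lets
    the network extract higher-order statistics from the input.
    """
    hidden = sigmoid(w_hidden @ x)
    return sigmoid(w_out @ hidden)

# Illustrative example: 4 input bands, 3 hidden neurons, 2 classes.
rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(3, 4))
w_out = rng.normal(size=(2, 3))
y = feed_forward(np.array([0.2, 0.5, 0.1, 0.9]), w_hidden, w_out)
print(y.shape)  # prints "(2,)"
```

A recurrent network would differ only in that some outputs feed back into earlier layers, so a single pass like this no longer fully describes its behavior.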
Considering neural network structures and training
paradigms, we can find a large number of different types of
neural networks, and some of the most commonly used ones are
listed in Table 7.1. Each type has advantages and disadvantages
depending upon specific applications. Detailed discussions about
these neural network types are given elsewhere (e.g., Bishop,
1995; Jain, Mao and Mohiuddin, 1996; Rojas, 1996; Haykin,
1999; Principe, Euliano and Lefebvre, 2000). Here, we focus
on MLP feed-forward networks because of their robustness
and overwhelming popularity.
The MLP neural networks are relatively easy to understand
and implement. As the workhorse of neural networks, they have
been increasingly used in remote sensing (cf. Mas and Flores,
2008). They comprise distributed neurons and weighted links
(Fig 7.1). Arranged in an input-hidden-output layered structure,
eachneuron contains a simple processing function (i.e., activation