principle (Lam [Lam98]). Cooper [Coo90] showed that the general problem of infer-
ence in unconstrained belief networks is NP-hard. Limitations of belief networks, such
as their large computational complexity (Laskey and Mahoney [LM97]), have prompted
the exploration of hierarchical and composable Bayesian models (Pfeffer, Koller, Milch,
and Takusagawa [PKMT99] and Xiang, Olesen, and Jensen [XOJ00]). These follow an
object-oriented approach to knowledge representation. Fishelson and Geiger [FG02]
present a Bayesian network for genetic linkage analysis.
The perceptron is a simple neural network, proposed in 1958 by Rosenblatt [Ros58],
which became a landmark in early machine learning history. Its input units are ran-
domly connected to a single layer of output linear threshold units. In 1969, Minsky
and Papert [MP69] showed that perceptrons are incapable of learning concepts that
are linearly inseparable. This limitation, as well as limitations on hardware at the time,
dampened enthusiasm for research in computational neuronal modeling for nearly
20 years. Renewed interest was sparked following the presentation of the backpropaga-
tion algorithm in 1986 by Rumelhart, Hinton, and Williams [RHW86], as this algorithm
can learn concepts that are linearly inseparable.
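For readers who want a concrete picture, the following is a minimal sketch (not taken from any of the works cited above) of a Rosenblatt-style perceptron: a single linear threshold output unit trained with the classic error-correction rule. The function and parameter names (predict, fit, lr, epochs) are illustrative choices only.

import numpy as np

def predict(w, b, x):
    # Linear threshold unit: fire (output 1) if the weighted sum exceeds 0.
    return int(np.dot(w, x) + b > 0)

def fit(X, y, lr=0.1, epochs=100):
    # Classic perceptron rule: adjust the weights only when a tuple is misclassified.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict(w, b, xi)   # 0 if correct, +1 or -1 otherwise
            w += lr * err * xi
            b += lr * err
    return w, b

# AND is linearly separable, so the rule converges to a correct unit;
# XOR is not, which is the limitation shown by Minsky and Papert [MP69].
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
w, b = fit(X, np.array([0, 0, 0, 1]))
print([predict(w, b, x) for x in X])   # prints [0, 0, 0, 1]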
Since then, many variations of backpropagation have been proposed, involving, for
example, alternative error functions (Hanson and Burr [HB87]); dynamic adjustment
of the network topology (Mézard and Nadal [MN89]; Fahlman and Lebiere [FL90]; Le
Cun, Denker, and Solla [LDS90]; and Harp, Samad, and Guha [HSG90]); and dynamic
adjustment of the learning rate and momentum parameters (Jacobs [Jac88]). Other
variations are discussed in Chauvin and Rumelhart [CR95]. Books on neural networks
include Rumelhart and McClelland [RM86]; Hecht-Nielsen [HN90]; Hertz, Krogh, and
Palmer [HKP91]; Chauvin and Rumelhart [CR95]; Bishop [Bis95]; Ripley [Rip96]; and
Haykin [Hay99]. Many books on machine learning, such as Mitchell [Mit97] and Russell
and Norvig [RN95], also contain good explanations of the backpropagation algorithm.
Several techniques for extracting rules from trained neural networks have been
proposed [SN88, Gal93, TS93, Avn95, LSL95, CS96, LGT97]. The method
of rule extraction described in Section 9.2.4 is based on Lu, Setiono, and Liu [LSL95].
Critiques of techniques for rule extraction from neural networks can be found in Craven
and Shavlik [CS97]. Roy [Roy00] proposes that the theoretical foundations of neural
networks are flawed with respect to assumptions made regarding how connectionist
learning models the brain. An extensive survey of applications of neural networks in
industry, business, and science is provided in Widrow, Rumelhart, and Lehr [WRL94].
Support Vector Machines (SVMs) grew out of early work by Vapnik and
Chervonenkis on statistical learning theory [VC71]. The first paper on SVMs was
presented by Boser, Guyon, and Vapnik [BGV92]. More detailed accounts can be
found in books by Vapnik [Vap95, Vap98]. Good starting points include the tuto-
rial on SVMs by Burges [Bur98], as well as textbook coverage by Haykin [Hay08],
Kecman [Kec01], and Cristianini and Shawe-Taylor [CS-T00]. For methods for solving
optimization problems, see Fletcher [Fle87] and Nocedal and Wright [NW99]. These
references give additional details alluded to as “fancy math tricks” in our text, such
as transformation of the problem to a Lagrangian formulation and subsequent solving
using Karush-Kuhn-Tucker (KKT) conditions.
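To make that allusion slightly more concrete, the following is a standard sketch of the separable-case formulation (the notation w, b, x_i, y_i, and the multipliers \alpha_i is introduced here for illustration, not quoted from the references above). The maximum-margin problem is

\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to}\quad y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 \ \text{for every training tuple } (\mathbf{x}_i, y_i).

Introducing one Lagrange multiplier \alpha_i \ge 0 per constraint gives the Lagrangian

L_P = \tfrac{1}{2}\|\mathbf{w}\|^2 - \sum_i \alpha_i \left[ y_i(\mathbf{w}\cdot\mathbf{x}_i + b) - 1 \right],

and the KKT conditions include \mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i, \ \sum_i \alpha_i y_i = 0, and the complementary slackness condition \alpha_i \left[ y_i(\mathbf{w}\cdot\mathbf{x}_i + b) - 1 \right] = 0, so only the tuples with \alpha_i > 0 (the support vectors) determine the solution.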
 