Early History: Robots, Thought, Creativity, Learning and Translation - Robots Unlimited: Life in a Virtual Age


program was designed to learn a successful strategy for playing a coin-tossing game that had been suggested by Shannon. In this game one computer would choose heads or tails and another computer would try to guess which. The guessing computer attempts to detect a pattern in the choosing computer's choices, while the choosing computer attempts to detect a pattern in the guessing computer's guesses. The game is made more complicated by the fact that each of the computers will attempt to change its pattern sufficiently often to flummox its opponent.

Kirsch devised a learning mechanism that seemed to him like a model of the way in which an animal learns. His animal model was then used to play the coin-matching game against a choosing program. Evidently a learning mechanism must respond to some sort of stimulus, which, in the case of the coin-tossing game, was defined as follows. At each matching of the coin two numbers were generated, one corresponding to the chooser's move and one corresponding to the animal's guess at the opponent's move. Since there are only two possibilities, heads (H) and tails (T), binary numbers were used (1s and 0s).

Kirsch's “animal” learning program was designed to react to the past four pairs of moves. A pair of moves provides four possible combinations (HH, HT, TT, TH), and four pairs of moves provides 4 × 4 × 4 × 4 = 256 possible combinations (called stimuli).
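The count of 256 stimuli can be checked with a small sketch that packs the last four pairs of moves into a single number. The function name `encode_stimulus`, the ordering of the bits, and the choice of H as 1 and T as 0 are all assumptions made for illustration; the text says only that binary numbers were used.

```python
# Assumed convention: H -> 1, T -> 0 (the source does not say which is which).
BIT = {"H": 1, "T": 0}


def encode_stimulus(pairs):
    """Pack four (chooser_move, guesser_move) pairs into an index from 0 to 255."""
    if len(pairs) != 4:
        raise ValueError("expected exactly four pairs of moves")
    index = 0
    for chooser, guesser in pairs:
        # Each pair contributes two bits, so four pairs give 8 bits: 2**8 = 256 values.
        index = (index << 2) | (BIT[chooser] << 1) | BIT[guesser]
    return index
```

Under these assumptions, four pairs of tails map to stimulus 0 and four pairs of heads map to stimulus 255, with every other history falling in between.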

The learning mechanism consisted of the animal becoming conditioned to a certain move by the choosing opponent after each of the 256 possible stimuli. For each stimulus, the animal noted what move the opponent made next. Then, the next time that same stimulus occurred, i.e., the same sequence of four pairs of moves, the animal duplicated the move of the opponent that followed that same stimulus on the previous occasion it occurred. The more the choosing opponent repeated the same move after any given stimulus, the more the animal program became “conditioned” to that move.
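The "repeat what followed this stimulus last time" rule can be sketched as a simple lookup table. This is a minimal illustration, not Kirsch's program: the class name `LastMoveMemory`, its method names, and the default guess used for a stimulus that has never been seen are hypothetical, since the text does not say how unseen stimuli were handled.

```python
class LastMoveMemory:
    """Remember, for each stimulus, the opponent's move that followed it most recently."""

    def __init__(self, default_guess="H"):
        # default_guess is an assumption: the source does not describe the
        # behaviour for a stimulus that has not occurred before.
        self.default_guess = default_guess
        self.followed_by = {}

    def predict(self, stimulus):
        """Guess the move that followed this stimulus on the previous occasion."""
        return self.followed_by.get(stimulus, self.default_guess)

    def observe(self, stimulus, opponent_move):
        """Record the opponent's actual move after this stimulus."""
        self.followed_by[stimulus] = opponent_move
```

A stimulus here can be any hashable value, for example a tuple of the last four (chooser, guesser) pairs. The sketch captures only the most recent move; the strength of the conditioning is a separate quantity.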

For each of the 256 possible stimuli the animal program stored a number, called the conditioning number, which varied between −1 and +1. Whenever the animal scored a success in predicting the opponent's choice, the number in the animal program's storage location for the appropriate stimulus was increased.[15]
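Footnote 15 describes the size of the increase as a fixed fraction of the distance between the current conditioning number and +1. A minimal sketch of that arithmetic follows; the function name `reinforce` and the default fraction of 1/2 (the value used in the footnote's example) are illustrative choices, and what happened after an unsuccessful prediction is not covered by this passage, so it is not modelled.

```python
def reinforce(value, fraction=0.5):
    """Return the conditioning number after one successful prediction.

    The increase is `fraction` times the gap between the current value and +1,
    so the number approaches +1 without ever exceeding it.
    """
    if not -1.0 <= value <= 1.0:
        raise ValueError("conditioning number must lie between -1 and +1")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0 and at most 1")
    return value + fraction * (1.0 - value)
```

Starting from −1 with a fraction of 1/2, the gap to +1 is 2, so the first increase is 1; each further success halves the remaining gap.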

Thus, if the opponent always fol-

[15] The size of this increase was a fixed fraction of the difference between its previous value and +1. Thus, if this fraction were 1/2 and if the value associated with a particular stimulus was −1, then the next increase to the conditioning number would be 1/2 × 2, since the difference between −1 (its previous value) and +1 is 2. So the increase would be 1 and the new value would therefore be
