In July 2007, Ferrucci and some IBM colleagues flew down to the Sony
Pictures Studios in Culver City, California, to meet with Harry Friedman, producer
of Jeopardy! The result was a provisional go-ahead for a human-machine
match in late 2010 or early 2011. Friedman had also agreed that clues with
audio or visual clips would not be used. The IBM team now had a deadline to
work toward, and they began the process of “educating” Watson. They had
access to twenty years of Jeopardy! clues from a fan website called J!Archive.
From an analysis of twenty thousand clues, the team determined how often
particular categories turned up. They could also study individual games, and
they analyzed Jennings's seventy-four winning games to understand his strategy.
In their "War Room" at IBM Research in Hawthorne, New York, the
team plotted this information on a chart they called the "Jennings arc." Jennings
averaged more than 90 percent correct answers, and in one game he won the buzzer
on 75 percent of the clues. They calculated that, to beat Jennings,
Watson would need to match his precision and win the race to the buzzer at
least 50 percent of the time.
One of the early conclusions was that Watson did not need to know literature,
music, and TV in great depth to answer the Jeopardy! clues. Instead, it needed to
know the major facts about famous novels, brief biographies of major composers,
and the stars and plotlines of popular TV shows. However, because it could not
search the Web during the match, all of this information had to be loaded into
Watson's memory from sources such as Wikipedia, encyclopedias, dictionaries, and
newspaper articles, all in a form that the machine could understand.
The biggest obstacle for the researchers was teaching the machine to
“understand” what it was supposed to look for from the cryptic Jeopardy! clues,
which were often worded in a puzzling manner. The first algorithm to be
applied was a grammatical analysis identifying nouns, verbs, adjectives, and
pronouns. However, many possible key words could be relevant to finding the
answer, and Ferrucci and his team had to search through the many different
interpretations. Then, using a variety of machine-learning methods and
cross-checks, they assigned probabilities to a list of possible answers. All of
these searches and tests took vital time, and in a live game Watson had to come
up with an answer in just a few seconds. Toward the end of 2008,
Ferrucci recruited a five-person hardware team to devise a way to speed up the
processing more than a thousand-fold. How was this to be achieved? The answer
was to distribute the calculations over more than two thousand processors so
that Watson could explore all these paths simultaneously.
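
The pipeline described above, in which key words are extracted from a clue, candidate answers are cross-checked against stored sources, probabilities are assigned, and the work is spread across many processors, can be sketched in a few lines of Python. This is only a toy illustration under loose assumptions, not IBM's DeepQA code; the clue, the candidate list, the mock knowledge base, and the scoring rule are all invented here for clarity.

```python
# Toy sketch of the approach described above: extract key words from a clue,
# cross-check candidate answers against stored sources, score them in parallel,
# and rank the results. NOT IBM's DeepQA code; all data here is invented.
from concurrent.futures import ProcessPoolExecutor

CLUE = "This 'Father of Geometry' wrote the Elements around 300 B.C."
CANDIDATES = ["Euclid", "Archimedes", "Pythagoras", "Eratosthenes"]

# Tiny mock "knowledge base" standing in for the encyclopedias, dictionaries,
# and newspaper articles loaded into Watson's memory.
KNOWLEDGE = {
    "Euclid": "Greek mathematician, author of the Elements, father of geometry",
    "Archimedes": "Greek mathematician and engineer of Syracuse",
    "Pythagoras": "Greek philosopher known for the Pythagorean theorem",
    "Eratosthenes": "Greek scholar who measured the circumference of the Earth",
}

def key_words(text):
    """Crude stand-in for grammatical analysis: keep content-bearing words."""
    stop = {"this", "the", "wrote", "around", "of", "and"}
    return {w.strip("'.\",").lower() for w in text.split()} - stop

def score_candidate(candidate):
    """Cross-check one candidate against the knowledge base; return a raw score."""
    overlap = len(key_words(CLUE) & key_words(KNOWLEDGE[candidate]))
    return candidate, overlap

if __name__ == "__main__":
    # Spread the candidate evaluations over several worker processes, in the
    # spirit of distributing Watson's searches across thousands of processors.
    with ProcessPoolExecutor(max_workers=4) as pool:
        raw = dict(pool.map(score_candidate, CANDIDATES))

    # Turn raw scores into rough "probabilities" and print the ranked list.
    total = sum(raw.values()) or 1
    for name, score in sorted(raw.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:15s} {score / total:.2f}")
```

In this sketch the single word-overlap check plays the role of the many independent evidence checks Ferrucci's team combined, and the worker pool plays the role of the thousands of processors; the real system used far richer analyses and learned how to weight them.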
During the buildup to the contest, Watson moved on from training on sets
of Jeopardy! clues to practice matches with previous Jeopardy! winners. By May
2010, Watson was winning against human players 65 percent of the time. The team
used Watson's failures to improve and tune their algorithms and selection
criteria. They also had to insert a "profanity filter" to help Watson distinguish
between polite language and profanity. After numerous glitches and many
amusing mistakes, Watson had climbed up the Jennings arc until its performance
approached that of experienced Jeopardy! winners. However,
the televised match was to pit Watson against two of the very best Jeopardy!
champions, Jennings and Brad Rutter, who had beaten Jennings in the show's
“Ultimate Tournament of Champions” competition in 2005.