database conceivably encompassing, say, all of the knowledge on the World Wide
Web, then how are we to realistically evaluate the system's performance?
In light of these complications, we propose the following alterations of TCA 1. If
the word 'good' is replaced with 'useful,' then we connote an evaluation method that
is not based in the particular metrics as defined by the analogical theory itself (which
can be controversial), but rather based in an intuitive notion that can and should
be evaluated independently of the model. In other words, researchers might disagree
on how to measure an analogical match, but whether the resulting analogically
inferred hypothesis is useful can be evaluated without any knowledge of the analogical
matcher used. Of course, since what is 'useful' can be very domain-dependent, we
do not claim that this word is completely unambiguous. Later in this chapter, a better
replacement for the word 'useful' will be suggested. For now, the aim behind this
move is to divorce the metric which determines the quality of an analogical match's
results (which may be very domain-dependent) from the theory-specific metric that
the matcher is specifically designed to optimize. That (deceptively) small change
gives us TCA 2:
TCA 2: A computational system of analogy answers the TC if, given no more than a pre-existing
database and an unparsed input text, it is able to consistently produce useful analogies
across many domains.
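The separation that TCA 2 asks for can be illustrated with a minimal sketch, in which the analogical matcher is treated as a black box and usefulness is judged by a separate function that never sees the matcher's internal score. Everything named here (`Matcher`, `Judge`, `evaluate_usefulness`, and the toy stand-ins) is hypothetical and introduced only for illustration; it is not a method proposed in the text, and a real usefulness judgment would be far richer than a boolean test.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a matcher maps (database, unparsed text) to a hypothesis.
Matcher = Callable[[List[str], str], str]
# A judge rates a hypothesis for a domain without any access to the matcher.
Judge = Callable[[str, str], bool]


def evaluate_usefulness(
    matcher: Matcher,
    database: List[str],
    cases: List[Tuple[str, str]],  # (domain, unparsed input text)
    judge: Judge,
) -> Dict[str, float]:
    """Return, per domain, the fraction of hypotheses the judge deems useful."""
    useful: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for domain, text in cases:
        hypothesis = matcher(database, text)
        total[domain] = total.get(domain, 0) + 1
        # The judge sees only the domain and the hypothesis, never the
        # matcher's own theory-specific match score.
        if judge(domain, hypothesis):
            useful[domain] = useful.get(domain, 0) + 1
    return {d: useful.get(d, 0) / total[d] for d in total}


# Toy stand-ins, purely illustrative assumptions.
def toy_matcher(database: List[str], text: str) -> str:
    # Return the first database entry that shares a word with the input.
    words = set(text.lower().split())
    for entry in database:
        if words & set(entry.lower().split()):
            return entry
    return ""


def toy_judge(domain: str, hypothesis: str) -> bool:
    # Crude proxy for usefulness: the hypothesis mentions the domain.
    return domain in hypothesis.lower()


database = ["heat flows like water", "atoms orbit like planets"]
cases = [
    ("heat", "water runs downhill"),
    ("atoms", "planets circle the sun"),
    ("atoms", "nothing shared"),
]
scores = evaluate_usefulness(toy_matcher, database, cases, toy_judge)
```

Because the judge is passed in separately, two research groups could swap in entirely different matchers and still be scored against the same per-domain notion of usefulness.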
5.2.1.2 What Are Acceptable Databases?
In TCA 2, it is clear that the knowledge available in the database used is a limiting
factor in how informative the inferences produced by the analogical system can
be. The suggestion phrased by Gentner and Forbus [24] as “pre-existing databases”
requires more clarification. The implication (at least as we interpret it) is that the
dataset and the structures within cannot have been constructed for the purpose of
solving the particular toy examples that are of interest. Otherwise this introduces
bias and tailorability concerns, in spite of the best intentions of the designers. Two
issues immediately come to mind. First, what is the proper level of separation between
the database and the analogical system? Second, how large does the database have
to be?
The first question is at least partially answered by considering the level of representational
agreement between the database and the analogical system. For example,
if the database is a purely distributed one with no localist concepts whatsoever (which
is, we acknowledge, an unlikely possibility), and the analogical system is one that
uses only localist, explicitly structured data, then a significant level of work will be
needed to first extract the information from the database and put it into the form
that the analogical reasoner requires (this can be considered a re-representation step
[15, 30, 44]). The choice of database becomes important for this reason, and if no
database exists that does not require a high level of re-representation, then that suggests
a problem: consider that although proponents of localist, distributed, and hybrid representation
styles make claims all the time about the scalability of their assumptions
of knowledge representation, the researchers who have to design and work with large