easy-to-use customer self-help system. The easier it is to navigate the system
and to find and extract information, the greater the benefit to the company.
An information retrieval (IR) system serving as a customer self-help
(frequently asked questions, or FAQ) system is a key component of the customer
service product offered by RightNow Technologies, Inc., a leading provider of
on-demand software to help companies manage their customer interactions.
The RightNow self-help system and similar products from other vendors are
becoming ubiquitous on support sites throughout the Web. A diverse group of
manufacturers, retailers, and service providers, including the Social Security
Administration, Nikon, Dell, Qwest, and Florida State University, utilize FAQ
systems as integral components of their customer support offerings.
The FAQ repository can be searched for relevant documents using keyword
searches or natural language queries (via a standard IR system), product and
category associations, and other filtering techniques. In addition, the
RightNow FAQ system includes an unsupervised machine learning algorithm
that clusters the documents in the system, grouping documents containing
similar sets of terms and phrases, so that users can browse the document
collection in an organized fashion without first having to formulate an
exact question.
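The idea of grouping documents that share terms can be illustrated with a small sketch. The text does not describe RightNow's actual clustering algorithm, so this example substitutes a plain average-linkage agglomerative procedure over Jaccard similarity of term sets. The function names, the sample FAQ texts, and the `threshold` value are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two term sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_terms(docs, threshold=0.25):
    """Average-linkage agglomerative clustering of documents by shared terms.

    docs: dict mapping a document id to its text.
    threshold: assumed cutoff; merging stops when the best pair falls below it.
    Returns a sorted list of clusters, each a sorted list of document ids.
    """
    terms = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    clusters = [[doc_id] for doc_id in docs]

    def linkage(c1, c2):
        # Average pairwise similarity between members of the two clusters.
        total = sum(jaccard(terms[a], terms[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical FAQ texts, for illustration only.
docs = {
    "faq1": "reset account password",
    "faq2": "forgot account password",
    "faq3": "track order shipping",
    "faq4": "order shipping delay",
}
print(cluster_by_terms(docs))
# [['faq1', 'faq2'], ['faq3', 'faq4']]
```

A production system would normally weight terms (for example with TF-IDF) and handle phrases rather than compare raw word sets. The sketch only shows how shared vocabulary can drive the grouping.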
RightNow Technologies' research shows that clustering users' search phrases,
with a similar goal, is a significant enhancement. The query taxonomy
created by this process can help demonstrate to users the types of queries
that can be effectively answered by the FAQ system and can illustrate the
progression of detail from general concepts to more specific information. The
hierarchy can also help the system's user interface to adapt to typical use,
indicating not only the contents of the system but how users are attempting
to retrieve information. Members of the clusters can potentially serve as a
source of additional search terms to help focus user queries and retrieve smaller
sets of more relevant documents. Also, system administrators can examine
the topic or concept hierarchy to evaluate the quality and usefulness of the
documents contained in the system. If queries that are deemed too dissimilar
are clustered together, the repository probably lacks documents specific
enough to separate them, and additional documents more specific to the
questions being asked should be added. (This process of analyzing the
repository to find content shortcomings is known as gap analysis [1].)
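One way an administrator might screen query clusters for gap analysis is sketched below. It assumes that "too dissimilar" can be approximated by a low mean pairwise term overlap within a cluster. That measure, the `min_cohesion` cutoff, the function names, and the sample queries are illustrative assumptions, not part of the method described here.

```python
from itertools import combinations


def term_overlap(q1, q2):
    """Jaccard overlap between the word sets of two queries."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def flag_gap_clusters(clusters, min_cohesion=0.2):
    """Return (cluster name, cohesion) for clusters whose queries look unrelated.

    clusters: dict mapping a cluster name to its list of query strings.
    min_cohesion: assumed cutoff on the mean pairwise term overlap.
    """
    flagged = []
    for name, queries in clusters.items():
        pairs = list(combinations(queries, 2))
        if not pairs:
            continue  # a single-query cluster has no pairs to compare
        cohesion = sum(term_overlap(a, b) for a, b in pairs) / len(pairs)
        if cohesion < min_cohesion:
            flagged.append((name, round(cohesion, 3)))
    return flagged


# Hypothetical query clusters, for illustration only.
query_clusters = {
    "billing": ["update billing address", "change billing address"],
    "misc": ["cancel subscription", "export contact list", "reset password"],
}
print(flag_gap_clusters(query_clusters))
# [('misc', 0.0)]
```

A flagged cluster is only a candidate for review. Whether new documents are needed remains a judgment for the administrator.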
Note that this clustering of user queries is not intended to replace the
clustering of FAQs; the documents in the FAQ repository are still clustered
according to their content. The user query hierarchy should augment this
information, describing not what is contained within the repository but what
users are expecting to find there and how their questions are related through
the FAQ content and the search process. Also, this technique is not limited to FAQ
systems. Any information retrieval system in which user queries are used to
return a subset of a document repository could benefit from this technique.
This paper describes the HAC+P+FSR (for HAC clustering plus partitioning
plus feature selection and reduction) algorithm, which is an extension
to a previously designed algorithm, HAC+P. It also presents a similar