9.1.3 Limitations of Current Solutions
Despite the substantial accomplishments in both AF and TDT, significant
problems remain unsolved regarding how to optimize the utility of the system in
terms of the relevance and novelty of the documents returned for the user's attention,
and how to make user feedback most effective and least costly. The following
issues, specifically, might seriously limit the true utility of an AF or ND system
in real-world applications:
The user has a 'passive' role. That is, he or she reacts to the system only
if the system makes a 'yes' decision on a document, by confirming or
rejecting the system decision. A more active alternative would be to
allow the user to review a ranked list of system-selected candidates each
time, making human judgments more effective in discriminating hard
cases between true positives and false alarms for profile adaptation. To
support this, modeling the uncertainty of a ranked document being read
by the user becomes an issue (for which little research has been done in
AF and ND) because we can no longer assume a deterministic process
for user relevance feedback.
The unit for receiving user relevance judgments has been restricted
to a document in conventional AF and ND. However, a real user
may be willing to provide more informative, fine-grained feedback by
highlighting smaller pieces of text as relevant and/or novel. To
support such interaction, the system may provide passage ranking based
on relevance, where passage length may vary (documents, paragraphs,
sentences or windows of n consecutive words), depending on applications,
datasets and user preferences. Further, the system needs to learn from
labeled pieces of text of arbitrary span instead of just allowing labeled
documents. How to train, optimize and evaluate such a system is an
open challenge.
System-selected documents are often highly redundant. A major news
event, for example, would be reported by multiple sources repeatedly
for a while, making most of the information content in those articles
redundant with each other. A relevance-driven AF system would select
all these redundant documents for user feedback, wasting the user's
time while offering little gain. Clearly, novelty detection (ND) and anti-
redundancy ranking of documents or passages would help in principle.
However, how to leverage both relevance and novelty assessments for
unified utility optimization and for effective user interactions with the
system is a main challenge in information distillation.
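To make the last two issues concrete, the following is a minimal sketch of ranking variable-length passages by relevance while penalizing redundancy. It is not the method used by CAFE or by any system discussed in this chapter. It assumes a simple greedy selection in the spirit of maximal marginal relevance, with Jaccard word overlap standing in for both the relevance and the redundancy measure. The function names (word_windows, rank_passages) and the trade_off parameter are hypothetical and introduced only for illustration.

```python
from typing import List, Set


def word_windows(text: str, size: int, step: int) -> List[str]:
    """Split text into windows of `size` consecutive words, advancing by `step`."""
    words = text.split()
    if not words:
        return []
    if len(words) <= size:
        return [" ".join(words)]
    return [" ".join(words[i:i + size])
            for i in range(0, len(words) - size + 1, step)]


def tokens(text: str) -> Set[str]:
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two word sets (0.0 if either set is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_passages(query: str, passages: List[str], k: int,
                  trade_off: float = 0.5) -> List[str]:
    """Greedily pick up to k passages.

    Each step picks the passage that maximizes
    trade_off * relevance - (1 - trade_off) * redundancy,
    where redundancy is the highest similarity to any passage already picked.
    Both measures are illustrative assumptions, not the chapter's definitions.
    """
    query_terms = tokens(query)
    remaining = list(passages)
    selected: List[str] = []
    while remaining and len(selected) < k:
        def score(passage: str) -> float:
            terms = tokens(passage)
            relevance = jaccard(query_terms, terms)
            redundancy = max(
                (jaccard(terms, tokens(s)) for s in selected), default=0.0
            )
            return trade_off * relevance - (1 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a query such as "earthquake city rescue" and two identical reports of the same event plus one report on a different aspect, the second pick is the different report rather than the duplicate, because the duplicate's redundancy penalty outweighs its relevance. A purely relevance-driven ranking would instead return both copies first.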
In the rest of the chapter, we present our recent work in utility-based
information distillation, addressing the above limitations and challenges (27).
Specifically, with a new distillation system called CAFE (CMU Adaptive
Filtering Engine), we define a task-oriented distillation process, analyze