[Figure 9.1 (chart omitted): PNDCU at p = 0.0 and p = 0.1 for Indri [Base], Indri [PRF], CAFE [Base], CAFE [F], CAFE [F+N], CAFE [F+A], and CAFE [F+N+A].]
FIGURE 9.1: PNDCU scores of Indri and CAFE for two dampening factors (p), and various settings (PRF: Pseudo Relevance Feedback, F: Feedback, N: Novelty Detection, A: Anti-Redundant Ranking).
set containing 6 tasks each. Each system was allowed to use the training set
to tune its parameters for optimizing PNDCU (Equation 9.7), including the
ranked-list length for both Indri and our own system, and the novelty and
anti-redundancy thresholds for our system.
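As a purely illustrative picture of this tuning step, the sketch below grid-searches a ranked-list length and the two thresholds on the training tasks, keeping whichever combination maximizes average PNDCU. The candidate grids and the run_system and pndcu callables are hypothetical stand-ins, not the chapter's actual values or code.

    import itertools

    def tune_parameters(training_tasks, run_system, pndcu, p=0.1):
        """Grid-search parameters on the training tasks, maximizing mean PNDCU.

        run_system(task, list_length, novelty_thr, antired_thr) -> one system run
        pndcu(run, p) -> score per Equation 9.7
        Both callables are hypothetical stand-ins for the real system and metric.
        """
        list_lengths = [5, 10, 20, 50]      # illustrative candidate grids,
        novelty_thrs = [0.3, 0.5, 0.7]      # not the values used in the chapter
        antired_thrs = [0.3, 0.5, 0.7]

        best_score, best_params = float("-inf"), None
        for length, nov, anti in itertools.product(list_lengths, novelty_thrs, antired_thrs):
            runs = [run_system(task, length, nov, anti) for task in training_tasks]
            score = sum(pndcu(run, p) for run in runs) / len(runs)
            if score > best_score:
                best_score, best_params = score, (length, nov, anti)
        return best_params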
The PNDCU for each system run is calculated automatically. User feedback
was also simulated: relevance judgments for each system-produced passage (as
determined by the nugget-matching rules described in Section 9.4.1.2) were
used as user feedback in the adaptation of query profiles and user histories.
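A minimal sketch of this simulated-feedback loop, under our own assumptions about the interfaces involved: init_profile, retrieve, adapt_profile, and the matches_nuggets predicate are hypothetical stand-ins for the system's profile initialization, per-chunk retrieval, profile/history update, and the nugget-matching rules of Section 9.4.1.2, respectively.

    def simulate_feedback(system, task, matches_nuggets):
        """Replay one task chunk by chunk, feeding back simulated judgments.

        A returned passage is judged relevant iff it matches at least one of the
        task's answer nuggets; that judgment is handed back to the system just as
        a real user's feedback would be. All names here are hypothetical.
        """
        profile = system.init_profile(task.query)            # hypothetical API
        for chunk in task.chunks:                            # chunks arrive in time order
            passages = system.retrieve(profile, chunk)       # ranked passages for this chunk
            for passage in passages:
                relevant = matches_nuggets(passage, task.nuggets)           # simulated judgment
                profile = system.adapt_profile(profile, passage, relevant)  # feedback step
        return profile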
9.6.3 Results
In Figure 9.1, we show the PNDCU scores of the two systems under various
settings. These scores are averaged over all chunks of the six tasks in the
test set, and are calculated with two dampening factors (see Section 9.4.2.1):
p = 0 and p = 0.1, to simulate no tolerance and a small tolerance for redundancy,
respectively.
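To make the role of the dampening factor concrete, here is a small runnable sketch under one assumption of ours, not necessarily the exact form of Equation 9.7 (which is defined in Section 9.4.2.1 and not reproduced here): the k-th repeat of an already-seen nugget contributes p^k of that nugget's weight, so p = 0 gives no credit for repeats and p = 0.1 gives a small residual credit.

    from collections import defaultdict

    def dampened_utility(ranked_passages, nugget_weights, p):
        """Sum nugget credit over a ranked list, geometrically dampening repeats.

        ranked_passages: list of sets of nugget ids matched by each returned passage.
        nugget_weights:  dict mapping nugget id -> weight.
        p: dampening factor; the k-th repeat of a nugget earns weight * p**k.
        (Assumed form for illustration only, without PNDCU's normalization.)
        """
        seen = defaultdict(int)          # occurrences of each nugget so far
        total = 0.0
        for nuggets in ranked_passages:
            for n in nuggets:
                total += nugget_weights[n] * (p ** seen[n])
                seen[n] += 1
        return total

    # One nugget returned three times: p = 0 credits only the first occurrence,
    # p = 0.1 adds a small amount for each repeat.
    passages = [{"N1"}, {"N1"}, {"N1"}]
    weights = {"N1": 1.0}
    print(dampened_utility(passages, weights, p=0.0))   # 1.0
    print(dampened_utility(passages, weights, p=0.1))   # ~1.11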
Allowing user feedback in our system improves the utility substantially
when the user is willing to allow some redundancy (p = 0.1), whereas the
improvement is smaller when no redundancy is allowed (p = 0). This is not
surprising: when the user gives positive feedback on an item, the system
favors that item in its query model and tends to show it repeatedly in the
future. It is informative to evaluate such systems using our utility measure
(with p = 0), which accounts for novelty and thus gives a more realistic picture