Collaborative Retrieval Systems - Collaborative Technologies and Applications for Interactive Information Design


the successful Google technology, uses the vast link structure of the Web as a valuable source for ranking web pages.

The user population of current information retrieval systems has grown explosively and become increasingly varied. It is now much easier for anyone to access a huge amount of information from various sources. With fast processors and cheap storage, the sheer number of users is not a challenge for search systems. However, because so many searchers are non-experts, the poor quality of queries remains a major challenge for most retrieval systems. Despite this quality problem, the vast pool of accumulated queries provides candidate contexts which, when properly indexed, can be used to facilitate later searches.

The vague query problem is not new. Many studies have shown that most web queries are short (2 to 3 terms), and that most search sessions include little query modification and are generally 2 to 3 queries in length (Croft and Thompson, 1987; Spink, 1997). As summarized by Barouni-Ebrahimi and Ghorbani (2008), researchers work in three directions to improve query quality:

(1) Query Recommendation provides a list of old queries (or search sessions) ranked by their similarity to the newly submitted query (Raghavan and Sever, 1995; Kantor et al., 1999; Glance, 2001; Baeza-Yates, Hurtado, and Mendoza, 2004; Zhang and Nasraoui, 2006). Generally, an old query and the information associated with it (result pages, followed links, etc.) are used to calculate the similarity between the submitted query and the saved search sessions. In other words, the similarity is based on the text of the query and of the retrieved and/or clicked pages.

(2) Query Expansion adds suggested terms or phrases to a newly submitted query to address the short query problem. The suggested terms may come from the top-ranked documents of previous search sessions (Fitzpatrick and Dent, 1997), from previous users' modifications of their queries (Cui et al., 2002), or from a knowledge base (Liu and Chu, 2007).

(3) Query Completion means that, while a user is typing a query, the system automatically suggests frequent words to complete the last, incomplete word of the query. Google's suggestion service is an example of query completion. White and Marchionini (2007) demonstrate the importance of this function.

All three mechanisms need a similarity measure to link a new query with saved query sessions. Similarity measures fall into two groups: the traditional IR bag-of-words measure, which looks at term occurrence and frequency in the saved queries and the retrieved/clicked results; and the behavior-based measure, which uses the sequence of previous searches as an indicator of the importance of certain terms or queries. A query context, or Quest, as we propose, should go beyond the topicality of the old queries and documents.

The third part of a standard information retrieval model is the information collection. As discussed above, the part of the collection that was previously retrieved or clicked is an important source for representing a Quest. With the development of the free Web, and especially the recent growth of social networks, the formats and genres of information have been enriched considerably. Both businesses and end users have noticed the importance of this new type of information. As a result, the new challenge for IR systems is not only how to represent this new collection, but also whether, and how, this user-contributed information can be used to enrich the Quest representation.
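As a rough illustration of the bag-of-words similarity measure and of the query recommendation and query completion mechanisms described in this section, the following Python sketch may help. It is not the Quest method proposed here, nor any cited system: the function names (`bag_of_words`, `cosine`, `recommend`, `complete`), the whitespace tokenization, the use of cosine similarity, and the representation of a saved session as a (query, clicked text) pair are all assumptions made for illustration. The behavior-based measure is not covered.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Term-frequency vector of a text (assumed: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(new_query, sessions, top_k=3):
    """Query recommendation: rank saved queries by similarity to the new query.

    Each session is a hypothetical (old_query, clicked_text) pair; both texts
    contribute to the session's bag of words.
    """
    q = bag_of_words(new_query)
    scored = [(cosine(q, bag_of_words(old + " " + clicked)), old)
              for old, clicked in sessions]
    scored = [pair for pair in scored if pair[0] > 0]  # drop unrelated sessions
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [old for _, old in scored[:top_k]]


def complete(partial_query, sessions, top_k=3):
    """Query completion: frequent saved-query words that extend the last word."""
    words = partial_query.lower().split()
    if not words:
        return []
    prefix = words[-1]
    counts = Counter(w for old, _ in sessions
                     for w in old.lower().split() if w.startswith(prefix))
    return [w for w, _ in counts.most_common(top_k)]


# Made-up saved sessions, for illustration only.
saved_sessions = [
    ("jaguar car price", "new jaguar sedan price list dealers"),
    ("jaguar habitat", "the jaguar is a big cat living in rainforest habitat"),
    ("python tutorial", "learn python programming tutorial for beginners"),
]

print(recommend("jaguar cat", saved_sessions))
# → ['jaguar habitat', 'jaguar car price']
print(complete("jaguar ha", saved_sessions))
# → ['habitat']
```

In this toy data, the clicked text is what lets "jaguar cat" rank the habitat session above the car session, since the two saved queries alone share only the term "jaguar" with the new query. A purely text-based measure like this still captures only topicality.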
