the successful Google technology, uses the vast
link structure of the Web as a valuable source of
evidence for ranking web pages.
The user group of current information retrieval
systems has been exploding and becoming more
and more varied. It is now much easier for anyone
to access a huge amount of information from various
sources. Thanks to fast processors and cheap
storage, the sheer number of users is not in itself
a challenge for search systems. However, because
so many searchers are non-experts, the poor
quality of queries remains a major challenge
for most retrieval systems. Despite this quality
problem, the vast pool of accumulated queries
provides candidate contexts which, when properly
indexed, can be used to facilitate later searches.
The vague query problem is not new. Many
studies have shown that most web queries are
short (2 to 3 terms), and that most search
sessions include little query modification and
are generally 2-3 queries in length (Croft and
Thompson, 1987; Spink, 1997). As summarized
by Barouni-Ebrahimi and Ghorbani (2008),
researchers have worked in three directions to
improve query quality:
(1) Query recommendation provides a list
of old queries (or search sessions) that are
ranked by their similarity to the newly
submitted query (Raghavan and Sever, 1995;
Kantor et al., 1999; Glance, 2001;
Baeza-Yates, Hurtado, and Mendoza, 2004; Zhang
and Nasraoui, 2006). Generally, an old query
and the information associated with it (result
pages, followed links, etc.) are used to
calculate the similarity between the submitted
query and the saved search sessions. In other
words, the similarity is based on the text of
the query and the retrieved and/or clicked
pages.
(2) Query expansion adds suggested terms or
phrases to a newly submitted query to solve
the short query problem. The suggested terms
may come from the top-ranked documents of
previous search sessions (Fitzpatrick and
Dent, 1997), from previous users'
modifications of their queries (Cui et al.,
2002), or from a knowledge base (Liu and
Chu, 2007).
(3) Query completion means that, while a user
is typing a query, the system automatically
suggests frequent words that complete the
last, unfinished word of the query. Google's
suggestion service is an example of query
completion. White and Marchionini (2007)
demonstrate the importance of this function.
All three mechanisms need a similarity measure
to link a new query with saved query sessions.
Similarity measures can be divided into two
groups: the traditional IR bag-of-words measure,
which looks at term occurrence and frequency
in the saved queries and retrieved/clicked results;
and the behavior-based measure, which uses the
sequence of previous searches as an indicator of
the importance of particular terms or queries.
A query context, or Quest, as we proposed,
should go beyond the topicality of the old queries
and documents.
The third part of a standard information
retrieval model is the information collection. As we
have discussed above, part of the collection (the
documents previously retrieved or clicked) is an
important source for representing a Quest. With
the development of the free Web, especially the
recent development of social networks, the formats
and genres of information have been enriched
considerably. Both businesses and end users have
noticed the importance of this new type of
information. As a result, the new challenge for IR
systems is not only how to represent this new
collection, but also whether, and how, this
user-contributed information can be used to
enrich the Quest representation.
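To make the bag-of-words similarity behind query recommendation more concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of any of the systems cited above: the function names (`bow`, `cosine`, `rank_saved_queries`), the whitespace tokenization, the use of raw term frequencies with cosine similarity, and the sample sessions are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def bow(text):
    """Build a term-frequency vector (assumed: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_saved_queries(new_query, saved_sessions):
    """Rank saved sessions by similarity to the new query, best first.

    saved_sessions is a list of (old_query, clicked_text) pairs; this
    layout is an assumption of the sketch. Each session is represented
    by the text of its query plus the text of its clicked pages.
    """
    q = bow(new_query)
    scored = [
        (cosine(q, bow(old_query + " " + clicked_text)), old_query)
        for old_query, clicked_text in saved_sessions
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical saved sessions: (old query, text of clicked results).
saved = [
    ("jaguar speed", "the jaguar is a fast big cat"),
    ("jaguar car price", "jaguar cars dealer price list"),
    ("python tutorial", "learn python programming step by step"),
]
ranking = rank_saved_queries("jaguar price", saved)
for score, old_query in ranking:
    print(f"{score:.3f}  {old_query}")
```

Here the session whose query and clicked text share both terms with the new query ranks first, and the unrelated session scores zero. A behavior-based measure would need additional signals, such as the order of queries within a session, which this sketch does not model.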