1. We consider a user query as a specification of a topic that the user wants to know
and learn more about. Hence, the search result is basically a graphical structure of
that topic and associated topics that are found.
2. The user can interactively explore this topic graph using a simple and intuitive user
interface in order to either learn more about the content of a topic or to interactively
expand a topic with newly computed related topics.
3. Mobile touch devices, such as smartphones and tablet computers, are becoming
increasingly prominent and widespread. The user may therefore expect a touch-based
human-computer interaction that adapts to the device at hand.
In this paper, we present an approach to exploratory web search that addresses the
above-mentioned requirements in the following way.
In a first step, the topic graph is computed on the fly from a set of web snippets that
have been collected by a standard search engine using the initial user query. Rather than
considering each snippet in isolation, all snippets are collected into one document from
which the topic graph is computed. We consider each topic as an entity, and each edge
as a kind of (hidden) relationship between the topics it connects. The content of a
topic is the set of snippets it has been extracted from, together with the documents
retrievable via the snippets' web links.
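To make the data structure concrete, the following is a minimal sketch of how a topic graph could be derived from merged snippets. It is not the extraction method of the system described here: it assumes, purely for illustration, that topics are single lowercase words, that an edge is a co-occurrence within a small token window, and that the names `build_topic_graph`, `STOPWORDS`, `window` and `min_count` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-picked stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "was", "to", "for", "on", "by"}


def build_topic_graph(snippets, window=4, min_count=2):
    """Return (edges, content) for a list of snippet strings.

    edges   maps a sorted topic pair to its co-occurrence count
    content maps a topic to the indices of the snippets it was found in
    """
    # Merge all snippets into one token stream ("one document"),
    # remembering which snippet each token came from.
    stream = []
    for idx, snippet in enumerate(snippets):
        for token in re.findall(r"[A-Za-z]+", snippet.lower()):
            if token not in STOPWORDS:
                stream.append((token, idx))

    # Count pairs of distinct tokens that occur within `window` positions.
    pair_counts = defaultdict(int)
    for i, (left, _) in enumerate(stream):
        for right, _ in stream[i + 1 : i + 1 + window]:
            if left != right:
                pair_counts[tuple(sorted((left, right)))] += 1

    # Keep only pairs seen often enough; these are the graph's edges.
    edges = {pair: n for pair, n in pair_counts.items() if n >= min_count}

    # The content of a topic is the set of snippets it was extracted from.
    topics = {topic for pair in edges for topic in pair}
    content = defaultdict(set)
    for token, idx in stream:
        if token in topics:
            content[token].add(idx)
    return edges, dict(content)
```

Because the window runs over the merged stream, pairs may be counted across snippet boundaries; whether that is desirable depends on the extraction method actually used.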
The topic graph is then displayed either on a tablet computer (in our case an iPad)
as a touch-sensitive graph or on a smartphone (in our case an iPhone or an iPod touch)
as a stack of touchable text boxes. By simply selecting a node or a text box, the user
can either inspect the content of a topic (i.e., the snippets or web pages) or trigger
the expansion of the topic graph through an on-the-fly computation of new related
topics for the selected node. The user can thus request information about new topics on
the basis of previously extracted information by selecting a node from a newly extracted
topic graph.
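The two actions available on a selected node, inspecting its content and expanding the graph, can be sketched as follows. This is an illustrative model only: the class `TopicGraph`, its methods, and the `search` and `extract_related` callables are hypothetical stand-ins for the search engine call and the topic extraction step, neither of which is specified in this section.

```python
from dataclasses import dataclass, field


@dataclass
class TopicGraph:
    edges: set = field(default_factory=set)      # undirected edges as frozensets
    content: dict = field(default_factory=dict)  # topic -> list of snippets

    def inspect(self, topic):
        """Return the snippets stored for a topic (empty if unknown)."""
        return list(self.content.get(topic, []))

    def expand(self, topic, search, extract_related):
        """Compute related topics for `topic` on demand and add them.

        `search(topic)` is assumed to return a list of snippet strings and
        `extract_related(topic, snippets)` a list of related topic names.
        Returns the topics that were not in the graph before.
        """
        snippets = search(topic)
        related = extract_related(topic, snippets)
        new_topics = [t for t in related if t not in self.content]

        self.content.setdefault(topic, []).extend(snippets)
        for other in related:
            if other != topic:
                self.content.setdefault(other, [])
                self.edges.add(frozenset((topic, other)))
        return new_topics
```

Newly added topics start with empty content in this sketch; their snippets are only fetched when the user selects and expands them, which mirrors the on-demand behaviour described above.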
In such a dynamic open-domain information extraction situation, the user expects
real-time performance from the underlying technology. The requested information can-
not simply be pre-computed, but rather has to be determined in an unsupervised and
on-demand manner relative to the current user request. This is why we assume that the
relevant information can be extracted from a search engine's web snippets directly, and
that we can avoid the costly retrieval and processing time for huge amounts of docu-
ments. Of course, direct processing of web snippets also poses certain challenges for
the Natural Language Processing (NLP) components. Web snippets are usually small
text summaries which are automatically created from parts of the source documents and
are often only partly well-formed linguistically, cf. [9]. The NLP components must
therefore be highly robust and fast enough to process the web snippets in real time.
Since our approach should also be able to process web snippets in different
languages (our current application runs for English and German),
the NLP components should be easily adaptable to many languages. Finally, no
restrictions on the domain of the topic should be presupposed, i.e., the system should
be able to accept topic queries from arbitrary domains. In order to fulfill all these
requirements, we favor and explore the use of shallow and highly data-oriented NLP
components. Note that this is not a trivial or obvious design decision, since most of the
 