$(c_i, c_{i+1}, dist_{i(i+1)})$, $(c_i, c_{i+2}, dist_{i(i+2)})$, etc. This is done for each chunk list
computed for each web snippet. The distance $dist_{ij}$ of two chunks $c_i$ and $c_j$ is
computed directly from the chunk list, i.e., we do not count the positions of ignored words
lying between two chunks.
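The triple construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `chunk_pair_distances` and `count_triples` are hypothetical, and the sketch assumes that the distance is simply the index difference within the chunk list and that every chunk is paired with all chunks to its right.

```python
from collections import Counter

def chunk_pair_distances(chunks: list[str]) -> list[tuple[str, str, int]]:
    """Return (c_i, c_j, dist_ij) for every pair i < j of one chunk list."""
    triples = []
    for i, left in enumerate(chunks):
        for j in range(i + 1, len(chunks)):
            # Assumption: the distance is the index difference inside the
            # chunk list, so ignored words between chunks never count.
            triples.append((left, chunks[j], j - i))
    return triples

def count_triples(chunk_lists: list[list[str]]) -> Counter:
    """Accumulate triple frequencies over the chunk lists of all snippets."""
    counts: Counter = Counter()
    for chunks in chunk_lists:
        counts.update(chunk_pair_distances(chunks))
    return counts
```

Counting the triples over all snippets yields the raw frequencies of chunk pairs per distance, which is the kind of input the model described next relies on.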
Finally, we compute the chunk-pair-distance model $CPD_M$ using the frequencies
of each chunk, each chunk pair, and each chunk pair distance. $CPD_M$ is used for
constructing the topic graph in the final step. Formally, a topic graph $TG = (V, E, A)$
consists of a set $V$ of nodes, a set $E$ of edges, and a set $A$ of node actions. Each node
$v \in V$ represents a chunk and is labeled with the corresponding PoS-tagged word group.
Node actions are used to trigger additional processing, e.g., displaying the snippets,
expanding the graph, etc.
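One possible in-memory representation of such a graph is sketched below. The class names `Node` and `TopicGraph`, the `trigger` method, and the choice to model node actions as named callables are all assumptions made for illustration; the text only specifies the triple of nodes, edges, and node actions.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass(frozen=True)
class Node:
    label: str  # PoS-tagged word group of the chunk, e.g. "topic/NN graph/NN"

@dataclass
class TopicGraph:
    nodes: set[Node] = field(default_factory=set)               # V
    edges: set[tuple[Node, Node]] = field(default_factory=set)  # E
    # A: node actions, modeled here as named callables (an assumption)
    actions: dict[str, Callable[[Node], str]] = field(default_factory=dict)

    def add_edge(self, a: Node, b: Node) -> None:
        """Insert an edge and make sure both endpoints are in V."""
        self.nodes.update((a, b))
        self.edges.add((a, b))

    def trigger(self, name: str, node: Node) -> str:
        """Run the node action registered under `name` on `node`."""
        return self.actions[name](node)
```

A node action such as "display the snippets" would then be registered once under a name and triggered for whichever node the user selects.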
The nodes and edges are computed from the chunk-pair-distance elements. Since
the number of these elements is quite large (up to several thousand), the elements are
ranked according to a weighting scheme which takes into account the frequency
information of the chunks and their collocations. More precisely, the weight of a
chunk-pair-distance element $cpd = (c_i, c_j, D_{ij})$, with
$D_{ij} = \{(freq_1, dist_1), (freq_2, dist_2), \ldots, (freq_n, dist_n)\}$,
is computed based on point-wise mutual information (PMI, cf. [15])
as follows:
$$PMI(cpd) = \log_2\left(\frac{p(c_i, c_j)}{p(c_i) \cdot p(c_j)}\right) = \log_2(p(c_i, c_j)) - \log_2(p(c_i) \cdot p(c_j))$$

where relative frequency is used for approximating the probabilities $p(c_i)$ and $p(c_j)$.
For $\log_2(p(c_i, c_j))$ we took the (unsigned) polynomials of the corresponding Taylor
series using $(freq_k, dist_k)$ in the k-th Taylor polynomial and adding them up:

$$PMI(cpd) = \left(\sum_{k=1}^{n} \frac{(x_k)^k}{k}\right) - \log_2(p(c_i) \cdot p(c_j)), \quad \text{where } x_k = \frac{freq_k}{\sum_{k=1}^{n} freq_k}$$
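As a rough illustration, the weight can be computed as follows. This is a sketch under stated assumptions rather than the authors' code: the function name `cpd_weight` is hypothetical, the index k is assumed to follow the order in which the pairs are listed in $D_{ij}$, and $p(c_i)$ and $p(c_j)$ are assumed to be passed in as precomputed relative frequencies.

```python
import math

def cpd_weight(d_ij: list[tuple[int, int]], p_ci: float, p_cj: float) -> float:
    """Weight of one chunk-pair-distance element.

    d_ij holds the (freq_k, dist_k) pairs; p_ci and p_cj are the relative
    frequencies of the two chunks.
    """
    total = sum(freq for freq, _dist in d_ij)
    # Sum of the unsigned Taylor terms (x_k)^k / k with x_k = freq_k / total.
    # Assumption: k is the 1-based position of the pair within d_ij.
    series = sum(
        (freq / total) ** k / k
        for k, (freq, _dist) in enumerate(d_ij, start=1)
    )
    return series - math.log2(p_ci * p_cj)
```

For instance, with $D_{ij} = \{(3, 1), (1, 2)\}$, $p(c_i) = 0.5$, and $p(c_j) = 0.25$, the series part is $0.75 + 0.25^2 / 2 = 0.78125$ and the second term is $-\log_2(0.125) = 3$, giving a weight of 3.78125.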
The visualized part of the topic graph is then computed from a subset of $CPD_M$ using
the $m$ highest ranked chunk-pair-distance elements for fixed $c_i$. In other words, we
restrict the complexity of a topic graph by restricting the number of edges connected to
a node.
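The pruning step can be sketched as a per-node top-m selection. The function name `top_m_edges` and the dictionary-based input format are hypothetical; the sketch assumes the weights of all chunk pairs have already been computed.

```python
import heapq
from collections import defaultdict

def top_m_edges(
    weights: dict[tuple[str, str], float], m: int
) -> dict[str, list[tuple[str, float]]]:
    """Keep, for every chunk c_i, only its m highest weighted partners c_j."""
    by_source: dict[str, list[tuple[str, float]]] = defaultdict(list)
    for (ci, cj), weight in weights.items():
        by_source[ci].append((cj, weight))
    # heapq.nlargest returns each node's partners in descending weight order.
    return {
        ci: heapq.nlargest(m, partners, key=lambda pair: pair[1])
        for ci, partners in by_source.items()
    }
```

Bounding the number of outgoing edges per node this way keeps the displayed graph small regardless of how many chunk-pair-distance elements the model contains.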
3 Touchable Interface for Mobile Devices
Today, it is a standard approach to optimize the presentation of a web page, depending
on the device it is displayed on, e.g., a standard or mobile web browser. Obviously, the
same should hold for graphical user interfaces, and in our case, for the user interfaces
designed for iPad and iPhone.
More concretely, using a different mode of presentation and interaction with a
topic graph, depending on the device at hand, is motivated by the following reasons: For
a smartphone the capabilities for displaying touchable text and graphics on one screen
are limited mainly due to its relatively small screen size. Our concept for presenting
 