At the data level, however, there are some special issues to deal with. Let
$c(D) = \{r_1, \ldots, r_m\}$, where each $r_i$ is a rowid ($1 \leq i \leq m$).
Assume that the user attached some content to a row $r \in c(D)$. In fact, the
user may have attached several pieces of content $u_1, \ldots, u_s$. However,
the user attached content $u_1$ to $r$ when it was part of the answer of another
command $c'$, and so it was part of the set $c'(D) \neq c(D)$, content $u_2$
when it was part of the answer to another command $c''$, and so on. Should all
this content still be displayed? Some of it? One could argue that such content
should be displayed in context, so the content associated with $r$ through
$c'(D)$ should be displayed only if $c(D)$ is somewhat related to $c'(D)$. For a
given rowid $r \in \pi_{\mathit{rowid}}(\mathit{Refs})$, we define the
annotation contexts of $r$ as follows:
$$AC(r) = \{\, t.\mathit{refid} \mid t.\mathit{rowid} = r \,\}$$
(that is, the set of reference ids containing $r$). For each annotation context
$ac \in AC(r)$, its extension is simply the set of all rowids attached to it:
$$\mathit{Ext}(ac) = \{\, t.\mathit{rowid} \mid t.\mathit{refid} = ac \,\}$$
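The two definitions can be illustrated with a minimal Python sketch. It assumes the Refs relation is available as a list of (refid, rowid) pairs; the sample data and the function names `annotation_contexts` and `extension` are hypothetical and chosen only for illustration.

```python
# Hypothetical contents of the Refs relation as (refid, rowid) pairs.
# The identifiers and values are made up for this example.
REFS = [
    ("ctx1", 1), ("ctx1", 2), ("ctx1", 3),
    ("ctx2", 2), ("ctx2", 4),
]


def annotation_contexts(refs, r):
    """AC(r): every refid that appears in a tuple whose rowid is r."""
    return {refid for refid, rowid in refs if rowid == r}


def extension(refs, ac):
    """Ext(ac): every rowid that appears in a tuple whose refid is ac."""
    return {rowid for refid, rowid in refs if refid == ac}


print(sorted(annotation_contexts(REFS, 2)))  # ['ctx1', 'ctx2']
print(sorted(extension(REFS, "ctx1")))       # [1, 2, 3]
```

A rowid that was never annotated simply yields an empty set of contexts.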
Then, for each row $r_j \in c(D)$, we display the user-created content attached
to $r_j$ in context $ac \in AC(r_j)$ if $\mathit{Ext}(ac)$ is sufficiently
related to $c(D)$. What "sufficiently related" means can be defined using
several semantic measures. An analytic measure can be defined along the lines of
typical metrics like Jaccard, since both $\mathit{Ext}(ac)$ and $c(D)$ are sets:
$$\mathit{dist}(ac, c) = \frac{|\mathit{Ext}(ac) \cap c(D)|}{|\mathit{Ext}(ac) \cup c(D)|} \geq \alpha$$
where $\alpha$ is a threshold. The advantage of this well-known metric is that
it closely matches intuition in extreme cases: it reaches its maximum value of 1
when $\mathit{Ext}(ac)$ and $c(D)$ coincide, and its minimum of 0 when
$\mathit{Ext}(ac) \cap c(D) = \emptyset$.
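As a small illustration of the measure, the following Python sketch computes the Jaccard value for two sets of rowids and compares it against a threshold. The function names are hypothetical, the comparison is assumed to be a lower bound (the value must be at least the threshold), and returning 0.0 for two empty sets is a choice made here, since the text does not cover that case.

```python
def jaccard(ext_ac, answer):
    """|Ext(ac) ∩ c(D)| / |Ext(ac) ∪ c(D)| for two sets of rowids."""
    union = ext_ac | answer
    if not union:
        # Assumption: two empty sets are treated as unrelated.
        return 0.0
    return len(ext_ac & answer) / len(union)


def is_sufficiently_related(ext_ac, answer, alpha):
    """True when the Jaccard value reaches the threshold alpha (assumed >=)."""
    return jaccard(ext_ac, answer) >= alpha


answer = {1, 2, 3, 4}                  # stands in for c(D)
print(jaccard({1, 2, 3, 4}, answer))   # 1.0 (identical sets)
print(jaccard({1, 2}, answer))         # 0.5 (partial overlap)
print(jaccard({7, 8}, answer))         # 0.0 (disjoint sets)
```

With a threshold of, say, 0.5, the first two contexts would be displayed and the third would not.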
Thus, for $c(D) = \{r_1, \ldots, r_m\}$, let
$AC(c) = \bigcap_{r \in c(D)} AC(r)$; this represents contexts that are common
to all tuples in $c(D)$. If $AC(c) \neq \emptyset$, then we can choose the
$ac \in AC(c)$ that attains $\min_{r \in c(D)} \mathit{dist}(ac, r)$, that is,
the context that is the closest to some tuple in $c(D)$. Other measures are also
possible: the context that minimizes overall distance
($\min \sum_{r \in c(D)} \mathit{dist}(ac, r)$) or average distance
($\min \mathrm{Avg}_{r \in c(D)} \mathit{dist}(ac, r)$). If $AC(c) = \emptyset$
(there is no context that is common to all tuples) we can choose, for each tuple
$r \in c(D)$, to display the context $ac$ whose extension $\mathit{Ext}(ac)$
minimizes the distance to $c(D)$: $\min_{ac \in AC(r)} \mathit{dist}(ac, c)$.
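The selection step can be sketched in Python under some loudly stated assumptions. The text does not define the measure between a context and a single tuple, so this sketch ranks candidate contexts by the set-level Jaccard value between the context's extension and the answer set in both cases, and it treats "closest" as "highest Jaccard value". The function name `choose_contexts` and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard value of two sets; 0.0 when both are empty (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def choose_contexts(refs, answer):
    """Map each rowid in the answer set to the context chosen for display.

    refs is a list of (refid, rowid) pairs; answer stands in for c(D).
    Assumption: "closest" means the highest set-level Jaccard value.
    """
    ext_of = {}                       # Ext(ac) for every context
    for refid, rowid in refs:
        ext_of.setdefault(refid, set()).add(rowid)
    ac_of = {r: {ac for ac, ext in ext_of.items() if r in ext}
             for r in answer}         # AC(r) for every row of the answer

    def score(ac):
        return jaccard(ext_of[ac], answer)

    # Contexts shared by every tuple of the answer.
    common = set.intersection(*ac_of.values()) if ac_of else set()
    if common:
        best = max(sorted(common), key=score)
        return {r: best for r in answer}
    # No shared context: pick the best context separately for each row.
    return {r: (max(sorted(acs), key=score) if acs else None)
            for r, acs in ac_of.items()}


shared = [("a", 1), ("a", 2), ("b", 1), ("b", 2), ("b", 3), ("b", 4)]
print(choose_contexts(shared, {1, 2}))    # {1: 'a', 2: 'a'}

mixed = [("a", 1), ("a", 2), ("b", 2), ("b", 3), ("b", 9), ("c", 3)]
print(choose_contexts(mixed, {1, 2, 3}))  # {1: 'a', 2: 'a', 3: 'b'}
```

Sorting the candidates before taking the maximum only makes tie-breaking deterministic; a row with no annotations is mapped to `None`.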
Once it is decided which user-created content to show, the same technique
outlined above (outerjoin based on rowids) can be used.
Note that carrying out the procedure just outlined can be quite costly: we need
to compute, for all $r \in c(D)$, $AC(r)$; then, for each $ac \in AC(r)$, we
have to obtain $\mathit{Ext}(ac)$, and finally we need to determine a metric
between $\mathit{Ext}(ac)$ and $c(D)$ (the one proposed above or an alternate
one). It is easy to see that in a worst-case