Database Reference
In-Depth Information
At the data level, however, there are some special issues to deal with. Let $c(D) = \{r_1, \ldots, r_m\}$, where each $r_i$ is a rowid ($1 \le i \le m$). Assume that the user attached some content to a row $r \in c(D)$. In fact, the user may have attached several pieces of content $u_1, \ldots, u_s$. However, the user attached content $u_1$ to $r$ when it was part of the answer of another command $c'$, and so it was part of the set $c'(D) \neq c(D)$, content $u_2$ when it was part of the answer to another command $c''$, ...
Should all this
content still be displayed? Some of it? One could argue that such content should
be displayed in context, so the content associated with $r$ through $c'(D)$ should be displayed only if $c(D)$ is somewhat related to $c'(D)$. For a given rowid $r \in \pi_{rowid}(Refs)$, we define the annotation contexts of $r$ as follows:
$$AC(r) = \{\, t.refid \mid t.rowid = r \,\}$$
(that is, the set of reference ids containing $r$). For each annotation context $ac \in AC(r)$, its extension is simply the set of all rowids attached to it:
$$Ext(ac) = \{\, t.rowid \mid t.refid = ac \,\}$$
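As a minimal sketch of these two definitions, assume the Refs relation is available in memory as a list of (refid, rowid) pairs. The sample data and the function names below are hypothetical and are not taken from the source.

```python
from typing import List, Set, Tuple

# Hypothetical contents of Refs: one (refid, rowid) pair per tuple t.
REFS: List[Tuple[str, int]] = [
    ("q1", 1), ("q1", 2), ("q1", 3),
    ("q2", 2), ("q2", 4),
]

def annotation_contexts(refs: List[Tuple[str, int]], r: int) -> Set[str]:
    """AC(r): every t.refid such that t.rowid = r."""
    return {refid for refid, rowid in refs if rowid == r}

def extension(refs: List[Tuple[str, int]], ac: str) -> Set[int]:
    """Ext(ac): every t.rowid such that t.refid = ac."""
    return {rowid for refid, rowid in refs if refid == ac}

print(sorted(annotation_contexts(REFS, 2)))  # ['q1', 'q2']
print(sorted(extension(REFS, "q1")))         # [1, 2, 3]
```

Both functions scan the whole list on every call, which keeps the sketch close to the set definitions; an index keyed on rowid and on refid would be the obvious refinement.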
Then, for each row $r_j \in c(D)$, we display the user-created content attached to $r_j$ in context $ac \in AC(r_j)$ if $Ext(ac)$ is sufficiently related to $c(D)$. What counts as "sufficiently related" can be defined using several semantic measures. An analytic measure can be defined along the lines of typical metrics like Jaccard, since both $Ext(ac)$ and $c(D)$ are sets:
$$dist(ac, c) = \frac{|Ext(ac) \cap c(D)|}{|Ext(ac) \cup c(D)|} \ge \alpha$$
where $\alpha$ is a threshold. The advantage of this well-known metric is that it closely matches intuition in extreme cases (i.e., it reaches its maximum value of 1 when $c(D) \subseteq Ext(ac)$ and $Ext(ac) \subseteq c(D)$, that is, when the two sets coincide, and its minimum of 0 when $Ext(ac) \cap c(D) = \emptyset$).
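The ratio and the threshold test can be sketched in a few lines. This is an illustration only: the function names, the parameter name `alpha` for the threshold, and the convention of returning 0 when both sets are empty are assumptions, not part of the source.

```python
from typing import Set

def jaccard(ext_ac: Set[int], answer: Set[int]) -> float:
    """|Ext(ac) ∩ c(D)| / |Ext(ac) ∪ c(D)| for two sets of rowids."""
    union = ext_ac | answer
    if not union:
        # Assumption: two empty sets are treated as unrelated.
        return 0.0
    return len(ext_ac & answer) / len(union)

def sufficiently_related(ext_ac: Set[int], answer: Set[int], alpha: float) -> bool:
    """True when the ratio reaches the threshold alpha."""
    return jaccard(ext_ac, answer) >= alpha

answer = {1, 2, 3, 4}                      # stands in for c(D)
print(jaccard({1, 2, 3}, answer))          # 0.75
print(jaccard({7, 8}, answer))             # 0.0
print(jaccard({1, 2, 3, 4}, answer))       # 1.0
print(sufficiently_related({1, 2, 3}, answer, alpha=0.5))  # True
```

The first call shows that strict containment yields a value below 1; the ratio equals 1 only when the two sets are identical.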
Thus, for $c(D) = \{r_1, \ldots, r_m\}$, let $AC(c) = \bigcap_{r \in c(D)} AC(r)$. If $AC(c) \neq \emptyset$, this represents contexts that are common to all tuples in $c(D)$. Then we can choose $ac \in AC(c)$ such that $\min_{r \in c(D)} dist(ac, r)$, that is, the context that is the closest to some tuple in $c(D)$. Other measures are also possible: the context that minimizes overall distance ($\min \sum_{r \in c(D)} dist(ac, r)$) or average distance ($\min \mathrm{Avg}_{r \in c(D)}\, dist(ac, r)$). If $AC(c) = \emptyset$ (there is no context that is common to all tuples) we can choose, for each tuple $r \in c(D)$, the display of the context $ac$ that minimizes the distance to $c(D)$, that is, the $ac \in AC(r)$ attaining $\min_{ac \in AC(r)} dist(ac, c)$. Since the Jaccard-style ratio grows with relatedness, minimizing presumably refers to a distance form of it, such as one minus that ratio.
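One possible reading of this selection rule is sketched below. Several assumptions are made for illustration and are not from the source: Refs is a list of (refid, rowid) pairs, the distance between a context and the answer is one minus the Jaccard ratio, ties are broken by refid, and the set-level distance is used in both branches because the row-level dist(ac, r) is not defined in this passage. All function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

Refs = List[Tuple[str, int]]  # hypothetical (refid, rowid) pairs

def contexts_of(refs: Refs, r: int) -> Set[str]:
    """AC(r): refids of all tuples whose rowid is r."""
    return {refid for refid, rowid in refs if rowid == r}

def extension(refs: Refs, ac: str) -> Set[int]:
    """Ext(ac): rowids of all tuples whose refid is ac."""
    return {rowid for refid, rowid in refs if refid == ac}

def distance(refs: Refs, ac: str, answer: Set[int]) -> float:
    """Assumed distance: 1 - |Ext(ac) ∩ c(D)| / |Ext(ac) ∪ c(D)|."""
    ext = extension(refs, ac)
    union = ext | answer
    ratio = len(ext & answer) / len(union) if union else 0.0
    return 1.0 - ratio

def choose_contexts(refs: Refs, answer: Set[int]) -> Dict[int, Optional[str]]:
    """Map each row of c(D) to the context to display (None if it has none)."""
    def closest(candidates: Iterable[str]) -> Optional[str]:
        # Smallest distance wins; refid breaks ties deterministically.
        return min(candidates,
                   key=lambda ac: (distance(refs, ac, answer), ac),
                   default=None)

    per_row = {r: contexts_of(refs, r) for r in answer}
    common = set.intersection(*per_row.values()) if per_row else set()
    if common:
        # Some context is shared by every row: use the closest one for all rows.
        best = closest(common)
        return {r: best for r in answer}
    # No shared context: pick, per row, the closest context in AC(r).
    return {r: closest(per_row[r]) for r in answer}

refs = [("q1", 1), ("q1", 2), ("q1", 3),
        ("q2", 1), ("q2", 2),
        ("q3", 3), ("q3", 9)]
print(choose_contexts(refs, {1, 2}))  # {1: 'q2', 2: 'q2'}
print(choose_contexts(refs, {2, 9}))  # {2: 'q2', 9: 'q3'}
```

In the first call both rows share q1 and q2, and q2 wins because its extension equals the answer. In the second call no context covers both rows, so each row falls back to its own closest context.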
Once it is decided which user-created content to show, the same technique
outlined above (outerjoin based on rowids) can be used.
Note that carrying out the procedure just outlined can be quite costly: we need to compute, for all $r \in c(D)$, $AC(r)$; then, for each $ac \in AC(r)$, we have to obtain $Ext(ac)$, and finally we need to determine a metric between $Ext(ac)$ and $c(D)$ (the one proposed above or an alternate one). It is easy to see that in a worst-case