This mapper produces instances of the branching node tuples in E by replacing the '*' in the 3rd place with a value in V. Among the key-value pairs obtained and emitted in this way are:
key = (Person4,Article1,Person1), value = (Q1, <*,"Title1">)
key = (Person2,Article2,Person3), value = (Q1, <*,"Title2">)
The input of the Mapper with key Q2 is:
E = {(<*,Article1,Person1>, <Journal1,*>), (<*,Article2,Person3>, <Journal1,*>), ...}
V = {(1,Person2), (1,Person4), ...}
Some of the instances that this mapper produces and emits are:
key = (Person4,Article1,Person1), value = (Q2, <Journal1,*>)
key = (Person2,Article2,Person3), value = (Q2, <Journal1,*>)
The input of the Mapper with key Q3 is:
E = {(<Person4,*,Person1>, <*,*>), (<Person2,*,Person3>, <*,*>)}
V = {(2,Article1), (2,Article2)}
Some key-value pairs produced and emitted (as above) by this mapper are:
key = (Person4,Article1,Person1), value = (Q3, <*,*>)
key = (Person2,Article2,Person3), value = (Q3, <*,*>)
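The mapper step above can be sketched in Python as follows. This is only an illustration, not the authors' Hadoop code: the function name `phase2_map`, the tuple encoding, and the choice to emit every combination of candidate values for the '*' slots are assumptions. The text only lists "some" of the emitted pairs, and only the entries of E and V shown explicitly for Q2 are used here (the elided "..." entries are left out).

```python
from itertools import product

WILDCARD = "*"

def phase2_map(subquery_id, E, V):
    """Yield (key, value) pairs for one subquery (hypothetical helper).

    E: iterable of (branching_tuple, non_branching_tuple) pairs.
    V: iterable of (position, value) pairs; positions are 1-based.
    """
    # Group the candidate values by the 1-based position they may fill.
    candidates = {}
    for position, value in V:
        candidates.setdefault(position, []).append(value)

    for branching, non_branching in E:
        # A bound slot keeps its value; a '*' slot tries every candidate in V.
        choices = [
            candidates.get(i, []) if v == WILDCARD else [v]
            for i, v in enumerate(branching, start=1)
        ]
        # Assumption: all combinations are emitted as keys.
        for key in product(*choices):
            yield key, (subquery_id, non_branching)

# Input of the mapper with key Q2, restricted to the listed entries.
E_q2 = [
    (("*", "Article1", "Person1"), ("Journal1", "*")),
    (("*", "Article2", "Person3"), ("Journal1", "*")),
]
V_q2 = [(1, "Person2"), (1, "Person4")]

pairs_q2 = list(phase2_map("Q2", E_q2, V_q2))
for key, value in pairs_q2:
    print("key =", key, "value =", value)
```

Under this assumption the mapper also emits keys such as (Person2,Article1,Person1); the reducers for those keys are the ones that do not receive an embedding for every subquery.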
Reducer of Phase 2. In each reducer, the embeddings (one for each subquery in (Q1,...,Qn)) are joined³ to construct the final answers of Q:
reducer2(key, values)
// key: a tuple of branching node values
// values: pairs of the form (Qi, partial embedding for non-branching nodes)
begin
  - for each join obtained by using one embedding for each subquery do
      - Emit the result produced by this join
end.
Example 10. (Continued from Example 9). The Reducer with key (Person4,Article1,Person1) receives the set {(Q1, <*,"Title1">), (Q2, <Journal1,*>), (Q3, <*,*>)}, joins these embeddings and returns the answer:
<Person4,Article1,Person1,Journal1,"Title1">
The reducer with key (Person2,Article2,Person3) receives the set {(Q1, <*,"Title2">), (Q2, <Journal1,*>), (Q3, <*,*>)}, which it joins, giving the answer:
<Person2,Article2,Person3,Journal1,"Title2">
Notice that no other reducer returns a solution (as they do not receive embeddings for all subqueries).
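The Phase 2 reducer can be sketched in Python along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `merge` and `phase2_reduce`, the explicit `subqueries` argument, and the slot-by-slot merge of partial embeddings (keeping whichever value is not '*') are all hypothetical. The compatibility check is purely defensive, since the joined embeddings are compatible by construction.

```python
from itertools import product

WILDCARD = "*"

def merge(partials):
    """Combine partial embeddings slot by slot, keeping the bound values."""
    merged = []
    for slot in zip(*partials):
        bound = {v for v in slot if v != WILDCARD}
        if len(bound) > 1:
            return None  # conflicting values; not expected by construction
        merged.append(bound.pop() if bound else WILDCARD)
    return tuple(merged)

def phase2_reduce(key, values, subqueries):
    """Yield one answer per join that uses one embedding from each subquery."""
    by_subquery = {q: [] for q in subqueries}
    for q, partial in values:
        by_subquery.setdefault(q, []).append(partial)
    # If some subquery contributed no embedding, the product is empty,
    # so this reducer emits nothing.
    for combo in product(*(by_subquery[q] for q in subqueries)):
        merged = merge(combo)
        if merged is not None:
            yield key + merged

# Input of the reducer with key (Person4,Article1,Person1) in Example 10.
answers = list(phase2_reduce(
    ("Person4", "Article1", "Person1"),
    [("Q1", ("*", "Title1")), ("Q2", ("Journal1", "*")), ("Q3", ("*", "*"))],
    ["Q1", "Q2", "Q3"],
))
print(answers)
```

A reducer whose key received embeddings from only some of the subqueries yields no answers in this sketch, which mirrors the remark that no other reducer returns a solution.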
5.5 Implementation of the Algorithm
An experimental implementation of our algorithm has been developed using Hadoop 1.0.4. For our experiments we have used a cluster of 14 nodes of the following characteristics: Intel Pentium(R) Dual-Core CPU E5700 3.00GHz with
³ Notice that the joined embeddings are, by construction, compatible.