Database Reference
In-Depth Information
Input:
Path productivity table
τ
p
[][1,...,k]forplanQ,
refresh rates R
S
1
,...,
j
∀
j states, available resources
μ
Output:
Assignment of
to plan Q
PathAssign
[][]
1:
PathAssign
[][]
←
GH-WMSCP(
τ
p
[ ][1,...,k],
R
S
1
,...,
J
)
2:
Δμ←
i
−>
1
PathAssign
[
i
][ ]
3:
PathAssign
[][]
←
GrePP(
μ−Δμ
,
τ
p
[][1,...,k])
4:
return
PathAssign
[][]
μ
4
3
3
D
2
3
2
A
B
C
1
2
E
1
1
A
B C D
C
D
A
B
(a) Staleness
(b) Linear
(c) Bushy
Fig. 10.
The JAQPOT Algorithm
Fig. 9.
Multi-join plans
=
k+m
2
)form
staleness susceptible states
and k input paths. The cover found
by GH-WMSCP will atmost differ from the optimal cover for WMSCP, denoted as
OPT(WMSCP), by a factor of ln(
m
) [18].
(m
×
Lemma 2.
Fo r
k input streams
in a join query Q, there are exactly
(k-2) staleness
susceptible states
irrespective of the query shape, be it linear, semi-bushy or bushy.
Lemma 2 relates the counts of the input paths (k) and the susceptible states (m) in a
query. For example, in both the plans, namely,
linear
(Fig. 9.b) and
bushy
(Fig. 9.c), k=4
and m=2. Therefore, substituting m with (k-2) in the expression for the time complexity
of GH-WMSCP, TC(GH-WMSCP) =
(k
2
).
4.4
The Integrated JAQPOT Algorithm
We now present our algorithm called
J
oin
A
daptation at
Q
uery plan-level using
P
ath-
productivity for
O
ptimizing
T
hroughput
, in short, JAQPOT (Fig. 10). JAQPOT first as-
signs a fraction
7
out of
available resources towards fulfilling the
freshness
require-
ments using the GH-WMSCP. Further, the greedy knapsack solver GrePP achieves an
optimal
query throughput using the remaining resources (
Δμ
μ
). JAQPOT returns the
join adaptation assignment in
PathAssign
[][],where
PathAssign
[
I
][
j
] denotes the
resources assigned to the j
th
half-way join of Path
I
. The overall time complexity of
our solution TC(JAQPOT) = TC(GH-WMSCP) + TC(GrePP) =
μ
-
Δμ
(k
2
+k
×
log(k))
(k
2
). Thus,
JAQPOT runs in quadratic time of k
irrespective of the plan shape.
Run-time Query Adaptation.
An initially optimal resource allocation by JAQPOT
may become sub-optimal due to the dynamic nature of the streams. Thus, we adopt a
simple yet effective strategy for runtime adaptation (details in [13]). We measured the
runtime overheads of JAQPOT and found that GH-WMSCP incurs the highest cost.
Performance of GH-WMSCP for different parameters, such as sizes of input and sets,
has been thoroughly studied in [18]. Thus, we instead focus our experimental study on
throughput and freshness.
7
We chose to satisfy the
freshness
predicates while
optimizing
throughput as this adaptation
is sufficient for real world applications. We found in our experimental study (Sec. 5) that in
practice realistic
freshness
predicates are indeed fulfillable using only a small share of the
resources.