Database Reference
In-Depth Information
While operator scheduling [6, 7] tends to allocate resources at the coarse granular-
ity of query operators, adaptation at a finer granularity within the operators is
required to produce optimal throughput.
The existing JDA technique [10] optimizes every join operator individually. In a
pipeline of join operators (Fig. 2), an uncoordinated attempt to optimize opera-
tors
j individually may jeopardize the real goal of optimizing the overall
query throughput. The join operators within a multi-join plan are interdependent ,
namely, an operator depends on the output of its upstream 1 operator(s) for input.
Consideration of operator interdependency is crucial for a successful plan-level
join direction adaptation.
i and
We identify result staleness 2 as a critical issue for CPU-limited processing of
multi-join queries. The biased resource allocation by JDA may potentially aggra-
vate this problem.
Proposed Approach. Unlike load shedding that discards data once the system is on
the verge of crashing from overload, we propose to preemptively allocate the available
CPU resources with the goal to achieve maximal throughput. We design, develop and
evaluate a synchronized join adaptation strategy at the plan level that tackles the result
staleness problem while maximizing the overall throughput of the query. We summa-
rize our contributions as follows:
1. We demonstrate that the state-of-the-art JDA [10] technique fails to achieve optimal
throughput for the CPU -limited processing of multi-join plans (Sec. 3).
2. We establish the path productivity metric as the plan-level throughput contribution
of each input stream (Sec. 4.1).
3. We formulate the query throughput maximization as a knapsack problem and pro-
pose a Greedy Path Productivity-based Adaptation (GrePP) to solve it (Sec.
4.2).
4. We identify result staleness as a key challenge under CPU-limited scenarios and
formulate the freshness satisfiability as a weighted set-cover problem (Sec. 4.3).
5. We integrated the above two strategies into the JAQPOT algorithm (Sec. 4.4). To
the best of our knowledge, this is the first solution that guarantees fulfillment of
result freshness predicates while achieving near optimal query throughput. We
further note that JAQPOT achieves this effective adaptation in quadratic time in
the number of input streams .
6. Our experimental study (Sec. 5) demonstrates the superiority of JAQPOT over the
state-of-the-art JDA solution in a large set of tested cases.
2
Preliminaries
2.1 Background
In this paper we focus on multi-join plans composed of sliding window binary join
operators. We assume standard semantics as in CQL [3]. We use the unit-time basis
1
Operators closer to the stream input are upstream and those closer to the query output are
downstream .
2
This challenge is not identified by prior work [4, 9, 16].
Search WWH ::




Custom Search