Achieving High Freshness and Optimal Throughput in CPU-Limited Execution of Multi-join Continuous Queries - Advances in Databases

Database Reference

In-Depth Information

•

While operator scheduling [6, 7] tends to allocate resources at the coarse granular-

ity of query operators, adaptation at a finer granularity within the operators is

required to produce optimal throughput.

•

The existing JDA technique [10] optimizes every join operator individually. In a

pipeline of join operators (Fig. 2), an uncoordinated attempt to optimize opera-

tors

j individually may jeopardize the real goal of optimizing the overall

query throughput. The join operators within a multi-join plan are interdependent ,

namely, an operator depends on the output of its upstream 1 operator(s) for input.

Consideration of operator interdependency is crucial for a successful plan-level

join direction adaptation.

i and

We identify result staleness 2 as a critical issue for CPU-limited processing of

multi-join queries. The biased resource allocation by JDA may potentially aggra-

vate this problem.

•

Proposed Approach. Unlike load shedding that discards data once the system is on

the verge of crashing from overload, we propose to preemptively allocate the available

CPU resources with the goal to achieve maximal throughput. We design, develop and

evaluate a synchronized join adaptation strategy at the plan level that tackles the result

staleness problem while maximizing the overall throughput of the query. We summa-

rize our contributions as follows:

1. We demonstrate that the state-of-the-art JDA [10] technique fails to achieve optimal

throughput for the CPU -limited processing of multi-join plans (Sec. 3).

2. We establish the path productivity metric as the plan-level throughput contribution

of each input stream (Sec. 4.1).

3. We formulate the query throughput maximization as a knapsack problem and pro-

pose a Greedy Path Productivity-based Adaptation (GrePP) to solve it (Sec.

4.2).

4. We identify result staleness as a key challenge under CPU-limited scenarios and

formulate the freshness satisfiability as a weighted set-cover problem (Sec. 4.3).

5. We integrated the above two strategies into the JAQPOT algorithm (Sec. 4.4). To

the best of our knowledge, this is the first solution that guarantees fulfillment of

result freshness predicates while achieving near optimal query throughput. We

further note that JAQPOT achieves this effective adaptation in quadratic time in

the number of input streams .

6. Our experimental study (Sec. 5) demonstrates the superiority of JAQPOT over the

state-of-the-art JDA solution in a large set of tested cases.

2

Preliminaries

2.1 Background

In this paper we focus on multi-join plans composed of sliding window binary join

operators. We assume standard semantics as in CQL [3]. We use the unit-time basis

1

Operators closer to the stream input are upstream and those closer to the query output are

downstream .

2

This challenge is not identified by prior work [4, 9, 16].

Advances in Databases

Search WWH ::

Custom Search

Home