Achieving High Freshness and Optimal Throughput in CPU-Limited Execution of Multi-join Continuous Queries - Advances in Databases

an exception of [4, 16] focus on optimizing a single join operator or a single MJoin

operator [9]. Tatbul et al. [16] first applied load shedding to streaming databases. As

indicated in [16], they do not address the additional issues related to processing win-

dowed joins over streams. Ayad et al. [4] propose static optimization and in the absence

of a feasible plan they pick a plan augmented with shedding operators placed on the

input streams to make it feasible. GrubJoin [9] targets the MJoin operator by leveraging

time correlation-awareness. It focuses on a single MJoin operator, whereas our work

tackles an orthogonal problem of operator interdependencies within a plan. Although

MJoins utilize less memory, they are typically computationally expensive [17] and are

less likely to be selected by the query optimizer in a CPU-limited scenario.

Closest to our work, join direction adaptation (JDA) [8, 10] explores the half-way

join productivity to selectively allocate computing resources to maximize the output

rate. They focus on a single join operator only. In this work, we establish that such

traditional JDA techniques become ineffective for multi-join queries. Further, all these

approaches typically address a single optimization function. None of these approaches

focus on leveraging the inter-operator dependency to adapt to run-time fluctuations, nor

do they consider result staleness. Whenever a query with interconnected join operators

is used, our solution leveraging operator interdependency can be applied in conjunction

with the existing approaches [9, 17].
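
To make the single-operator idea of half-way join productivity concrete, the following is a minimal sketch and not the algorithm of [8, 10] or of this paper. It assumes that a binary window join consists of two half-way joins (tuples of one stream probing the window of the other), that productivity is measured as expected output tuples per unit of CPU cost, and that a greedy policy serves the more productive direction first. The names HalfWayJoin, allocate and expected_output, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HalfWayJoin:
    """One probing direction of a binary window join (hypothetical model)."""
    name: str                 # e.g. "A->B": new A tuples probe B's window
    pending: int              # input tuples waiting to probe
    cost_per_probe: float     # assumed CPU units consumed by one probe
    matches_per_probe: float  # assumed average join results per probe

    @property
    def productivity(self) -> float:
        # Assumed definition: expected output tuples per unit of CPU cost.
        return self.matches_per_probe / self.cost_per_probe


def allocate(half_joins, cpu_budget):
    """Greedily spend the CPU budget on the most productive direction first.

    Returns a dict mapping each direction name to the number of probes granted.
    """
    plan = {}
    remaining = cpu_budget
    for hj in sorted(half_joins, key=lambda h: h.productivity, reverse=True):
        affordable = int(remaining // hj.cost_per_probe)
        probes = min(hj.pending, affordable)
        plan[hj.name] = probes
        remaining -= probes * hj.cost_per_probe
    return plan


def expected_output(half_joins, plan):
    """Expected number of join results produced under the given plan."""
    return sum(plan[h.name] * h.matches_per_probe for h in half_joins)


# Illustrative numbers only: B->A is four times as productive as A->B.
a_to_b = HalfWayJoin("A->B", pending=100, cost_per_probe=2.0, matches_per_probe=1.0)
b_to_a = HalfWayJoin("B->A", pending=100, cost_per_probe=1.0, matches_per_probe=2.0)
plan = allocate([a_to_b, b_to_a], cpu_budget=150.0)
```

With these illustrative numbers, the budget first covers all pending B->A probes and only the remainder goes to A->B. The sketch ranks the two directions of one operator in isolation, which is the localized view that the discussion above argues is insufficient for multi-join queries.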

While operator scheduling [6, 7] tends to allocate resources at the coarse granularity

of query operators, we focus on adaptation at a finer granularity of half-way

joins within an operator for optimizing throughput. Our work utilizes an adaptive query

processing [13] framework for adjusting the join direction of the query plan at run-time.

7 Conclusion

This paper addresses the CPU-limited execution of multi-join queries using join direc-

tion adaptation. We propose the path productivity metric that leverages the operator

interdependencies instead of localized operator-centric optimization. We identify result

staleness as a pressing issue under CPU limitations, one that throughput-optimizing

techniques further aggravate. Our key contribution is the integrated JAQPOT algorithm

that tackles the result staleness problem while producing optimal query throughput. We

validate our analytical findings using experimental studies with both synthetic and real

data.

Acknowledgements. We are grateful to Song Wang, Luping Ding and other DSRG

members for their efforts in building the CAPE system. We thank Prof. Murali Mani

and the anonymous reviewers for their insightful comments.

References

1. Weatherboards dataset from Intel Berkeley Research Lab,

2. Abadi, D.J., Carney, D., et al.: Aurora: a new model and architecture for data stream

management. VLDB Journal 12(2) (2003)
