an exception of [4, 16] focus on optimizing a single join operator or a single MJoin
operator [9]. Tatbul et al. [16] first applied load shedding to streaming databases. As
indicated in [16], they do not address the additional issues related to processing
windowed joins over streams. Ayad et al. [4] propose static optimization; in the absence
of a feasible plan, they pick a plan augmented with shedding operators placed on the
input streams to make it feasible. GrubJoin [9] targets the MJoin operator by leveraging
time correlation-awareness. It focuses on a single MJoin operator, whereas our work
tackles the orthogonal problem of operator interdependencies within a plan. Although
MJoins utilize less memory, they are typically computationally expensive [17] and are
less likely to be selected by the query optimizer in a CPU-limited scenario.
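As a rough illustration of shedding operators placed on an input stream, the following sketch drops tuples at random so that the expected load fits the available CPU. It is not the method of [4] or [16]; the names keep_fraction and shed, the linear cost model (load = arrival rate times per-tuple cost), and all numeric values are assumptions made for this example.

```python
import random


def keep_fraction(arrival_rate, cost_per_tuple, cpu_capacity):
    """Fraction of tuples to keep so the expected load fits the capacity.

    Assumed linear cost model: load = arrival_rate * cost_per_tuple.
    """
    load = arrival_rate * cost_per_tuple
    if load <= cpu_capacity:
        return 1.0  # the plan is already feasible, nothing is shed
    return cpu_capacity / load


def shed(stream, fraction, rng):
    """Random-drop shedding operator: pass each tuple with probability `fraction`."""
    for item in stream:
        if rng.random() < fraction:
            yield item


# Hypothetical numbers: 2000 tuples/s at 0.001 CPU-seconds each, one CPU available.
p = keep_fraction(arrival_rate=2000.0, cost_per_tuple=0.001, cpu_capacity=1.0)
kept = list(shed(range(10000), p, random.Random(0)))
print(p, len(kept))
```

With these hypothetical numbers the offered load is twice the capacity, so roughly half of the input tuples survive the shedding operator.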
Closest to our work, join direction adaptation (JDA) [8, 10] exploits half-way join
productivity to selectively allocate computing resources so as to maximize the output
rate. These works focus on a single join operator only. In this work, we establish that
such traditional JDA techniques become ineffective for multi-join queries. Further, all
these approaches typically address a single optimization function. None of them
leverages inter-operator dependencies to adapt to run-time fluctuations, nor do they
consider result staleness. Whenever a query with interconnected join operators is used,
our solution, which leverages operator interdependency, can be applied in conjunction
with the existing approaches [9, 17].
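To make the idea of allocating resources by half-way join productivity concrete, the sketch below handles a single binary join with two half-way joins (A probes B's window, and B probes A's window). It is an illustrative approximation, not the algorithm of [8, 10]: the HalfWayJoin class, the definition of productivity as output tuples per CPU unit, the greedy allocation, and all numbers are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class HalfWayJoin:
    name: str                 # e.g. "A->B": tuples of A probe the window of B
    pending: int              # tuples waiting to probe
    cost_per_probe: float     # CPU units consumed by one probing tuple
    matches_per_probe: float  # observed average output per probing tuple

    @property
    def productivity(self) -> float:
        # Assumed definition: output tuples produced per CPU unit spent.
        return self.matches_per_probe / self.cost_per_probe


def allocate(halves, cpu_budget):
    """Greedy heuristic: serve the most productive half-way join first."""
    plan = {}
    for h in sorted(halves, key=lambda h: h.productivity, reverse=True):
        affordable = int(cpu_budget // h.cost_per_probe)
        n = min(h.pending, affordable)
        plan[h.name] = n
        cpu_budget -= n * h.cost_per_probe
    return plan


# Hypothetical statistics for one join operator under a budget of 150 CPU units.
halves = [
    HalfWayJoin("A->B", pending=100, cost_per_probe=2.0, matches_per_probe=4.0),
    HalfWayJoin("B->A", pending=100, cost_per_probe=1.0, matches_per_probe=0.5),
]
plan = allocate(halves, cpu_budget=150.0)
print(plan)
```

In this example the A->B direction yields 2.0 output tuples per CPU unit against 0.5 for B->A, so the whole budget goes to A->B. Because the decision uses only statistics local to one operator, it ignores how the output of this join feeds downstream joins, which is the limitation for multi-join queries discussed above.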
While operator scheduling [6, 7] tends to allocate resources at the coarse granularity
of query operators, we focus on adaptation at the finer granularity of half-way joins
within an operator to optimize throughput. Our work utilizes an adaptive query
processing [13] framework for adjusting the join direction of the query plan at run-time.
7 Conclusion
This paper addresses the CPU-limited execution of multi-join queries using join
direction adaptation. We propose the path productivity metric, which leverages operator
interdependencies instead of localized operator-centric optimization. We identify result
staleness as a pressing issue under CPU limitations, one that throughput-optimizing
techniques further aggravate. Our key contribution is the integrated JAQPOT algorithm,
which tackles the result staleness problem while producing optimal query throughput. We
validate our analytical findings using experimental studies with both synthetic and real
data.
Acknowledgements.
We are grateful to Song Wang, Luping Ding and other DSRG
members for their efforts in building the CAPE system. We thank Prof. Murali Mani
and the anonymous reviewers for their insightful comments.
References
1. Weatherboards dataset from Intel Berkeley Research Lab,
2. Abadi, D.J., Carney, D., et al.: Aurora: a new model and architecture for data stream
management. VLDB 12(2) (2003)