Database Reference
In-Depth Information
the
probe
. All input tuples undergo the
insert
and
purge
operations. In Fig. 4.a A
B
have input rates
= 300 tuples/sec as the
available resources. Therefore, a subset
3
of 300 tuples out of the 1200 (= 500 + 700)
tuples from either of the input streams is used for the
probe
operation. However, all
1200 arriving tuples undergo
insert
and
purge
operations every time unit. Thus, the
λ
a
= 500 and
λ
b
= 700 tuples/sec. Assume
μ
μ
λ
a
resources (here 300 tuples/sec) must be divided among the
probe allowances
(here
λ
b
) for throughput maximization.
and
2.2
Problem Definition
Now, we define our two target problems, namely, achieving
optimal throughput
and
tackling
result staleness
in CPU-limited execution of
multi-join
.
CPU-limited Execution of Multi-Join Plans.
In the 2-join plan of Fig. 4.b
4
composed
of
2
,the
throughput optimization
problem is quite different from the single
operator case. Now, the goal is to maximize the
throughput
r
root
of the root operator
(here
1
and
2
).
λ
a
×σ
|+λ
b
×σ
λ
ab
×σ
|+λ
c
×σ
r
1
=
×|
S
B
×|
S
A
|
;
r
2
=
×|
|,
λ
a
+λ
b
+λ
ab
+λ
c
≤μ.
S
C
×|
S
AB
B
A
C
AB
(3)
Equation 3 depicts the CPU-limited case in a
multi-join
plan, where
μ
needs to be
λ
a
,
λ
b
,
λ
ab
λ
c
. In general,
divided among four half-way joins, namely,
and
μ
gets split
at two levels. First, among the
n
join operators (say
μ
1
,
μ
2
,
...
,
μ
j
,
...
,
μ
n
). Then, for
each join
μ
j
.
Freshness of Multi-Join Results.
When the resources become limited, the produced
query results may no longer be perfectly
fresh
, as in Query Q1 (Fig. 1). The result fresh-
ness is further compromised by throughput optimizing resource allocation to highly
productive half-way joins. Consequently, little or no resources are left for the less pro-
ductive components of the plan. Therefore, under a
throughput optimizing
scheme, in-
sufficient scheduling of certain operators may lead to their
starvation
. Moreover, when
a
starved
upstream operator does not produce sufficient intermediate results, the de-
pendent join state in the downstream operator tends to become
stale
. The join results
produced using such stale states are also stale, thus further deteriorating the
result fresh-
ness
.InFig.4.b,if(c
j
,
μ
j
is divided among each of its respective half-way joins
μ
j
and
S
AB
) is most productive, the assignment of complete
μ
to
λ
c
would
starve
1
, leading to the
staleness
of the state S
AB
and eventually also to that of
the final query results.
Definition 1.
The
freshness
predicate, namely,
I
for a stream I, requires that joined
tuples produced beyond time T must not contain stream I tuples with arrival times older
than
F
|
F
I
|
.
Under CPU-limited execution, the user can supply a
freshness
predicate
T-
I
for each
stream I (Def. 1). The type of the
freshness
and the window predicate must be the same,
F
3
A fine-grained
time correlation-awareness
[9] can be used for subset selection along with JDA.
4
For simplicity
σ
i
denotes overall selectivity of
i
; in reality each half-way join has an associ-
ated selectivity as in Fig. 4.a.