advance the window when it slides. This operator outputs a stream of tuples
of the form (TS = ts, A = a, B_1 = u_1, ..., B_m = u_m) ++ (F(W)) such that W is
a window of tuples from the input stream with values of A between a and
a + s - 1 and values for B_1, ..., B_m of u_1, ..., u_m, respectively, and ts is
the smallest of the timestamps associated with tuples in W. The notation “++”
denotes the concatenation of two tuples. Thus, it is assumed that the function F
returns a tuple of aggregate computations and that this tuple is concatenated
to a tuple consisting of fields that identify the window over which the
computation took place (B_1, ..., B_m, and A).
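The output form described above can be sketched in Python. This is a minimal illustration, not Aurora's implementation: the function name `windowed_aggregate`, the dictionary representation of tuples, and the use of non-overlapping windows aligned to multiples of s are all assumptions made for the example.

```python
from collections import defaultdict

def windowed_aggregate(stream, a_key, group_keys, size, agg):
    """Emit one tuple (TS, A, B_1..B_m) ++ F(W) per window and group.

    Assumption: windows do not overlap and start at multiples of `size`,
    so each window covers values of A in [a, a + size - 1].
    """
    windows = defaultdict(list)
    for tup in stream:
        a = (tup[a_key] // size) * size            # start of the window holding this tuple
        group = tuple(tup[k] for k in group_keys)  # values u_1..u_m of B_1..B_m
        windows[(a, group)].append(tup)

    out = []
    for (a, group), w in sorted(windows.items()):
        # Identifying fields: smallest timestamp in W, window start, group values.
        header = {"TS": min(t["TS"] for t in w), a_key: a}
        header.update(zip(group_keys, group))
        out.append({**header, **agg(w)})           # "++": concatenate header with F(W)
    return out
```

Here `agg` plays the role of F and is expected to return a dictionary of aggregate fields, for example `lambda w: {"sum_v": sum(t["v"] for t in w)}`.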
Join: This is a binary operator that takes the form Join(P, Size s, Left
Assuming O_1, Right Assuming O_2)(S_1, S_2) such that P is a predicate over
pairs of tuples from input streams S_1 and S_2, s is an integer, and O_1 and O_2
are specifications of assumed orderings of S_1 and S_2, respectively. For every
in-order tuple t in S_1 and u in S_2, the concatenation of t and u (t ++ u) is
output if |t.A - u.B| ≤ s and P holds of t and u, where A and B are the
attributes of S_1 and S_2 on which O_1 and O_2 order the streams. The Join
operator need not sort its inputs to process disordered streams but can instead
delay pruning tuples to account for slack. The Join operator also permits one or
both of its inputs to be static tables. A static table is treated as a special
case of a window on a stream, namely one that is infinite in size.
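The join condition can be illustrated with a short Python sketch. It is a simplification under stated assumptions: both inputs are finite lists of dictionaries with distinct field names, ordering and slack handling are ignored, and the name `band_join` and its parameters are hypothetical.

```python
def band_join(left, right, s, pred, a_key="A", b_key="B"):
    """Nested-loop sketch of the Join condition on finite inputs.

    Outputs t ++ u whenever |t.A - u.B| <= s and the predicate P holds.
    Assumption: left and right tuples have distinct field names, so
    merging the dictionaries acts as tuple concatenation.
    """
    out = []
    for t in left:
        for u in right:
            if abs(t[a_key] - u[b_key]) <= s and pred(t, u):
                out.append({**t, **u})  # "++": concatenate t and u
    return out
```

A real stream join cannot scan unbounded inputs this way; it would keep only the tuples that can still match within the size bound s, which is where delayed pruning for slack comes in.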
Resample: This is an asymmetric, semi-join-like synchronization operator
that can be used to align pairs of streams. It takes the form
Resample(F, Size s, Left Assuming O_1, Right Assuming O_2)(S_1, S_2) such that
F is a window function over S_1, s is an integer, A is an attribute over S_1, and
O_1 and O_2 are specifications of orderings assumed of S_1 and S_2, respectively.
For every tuple t from S_1, the tuple (B_1 : u.B_1, ..., B_m : u.B_m, A : t.A) ++ F(W(t))
is output such that W(t) = {u ∈ S_2 | u is in order wrt O_2 in S_2 ∧ |t.A - u.B| ≤ s}.
Thus, for every tuple in S_1, an interpolated value is generated from S_2 using
the interpolation function, F, over a window of tuples of size 2s (spanning
t.A - s to t.A + s).
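The window W(t) and the role of F can be sketched in Python as follows. This is a simplified illustration: the inputs are finite lists of dictionaries, the ordering check against O_2 is omitted, the B_1, ..., B_m fields of the output are left out, and the names `resample` and `mean_v` are hypothetical. Averaging stands in for whatever interpolation function F is actually supplied.

```python
def resample(s1, s2, s, f, a_key="A", b_key="B"):
    """For each t in S_1, apply F to W(t) = {u in S_2 : |t.A - u.B| <= s}."""
    out = []
    for t in s1:
        window = [u for u in s2 if abs(t[a_key] - u[b_key]) <= s]
        out.append({a_key: t[a_key], **f(window)})  # (A : t.A) ++ F(W(t))
    return out

def mean_v(window):
    """Example F: average the field "v" over the window (None if it is empty)."""
    if not window:
        return {"v": None}
    return {"v": sum(u["v"] for u in window) / len(window)}
```

Note the asymmetry: exactly one output tuple is produced per tuple of S_1, while S_2 only supplies the values to interpolate from.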
Figure 12.2 illustrates the runtime architecture of Aurora. The basic purpose
of an Aurora runtime network is to process data flows through a potentially
large workflow diagram. Inputs from data sources and outputs from boxes are fed
to the router, which forwards them either to external applications or to the
storage manager to be placed on the proper queue. The storage manager is
responsible for maintaining the box queues and managing the buffer.
Conceptually, the scheduler picks a box for execution, ascertains what
processing is required, and passes a pointer to the box description to the
multithreaded box processor. The box processor executes the appropriate
operation and then forwards the output tuples to the router. The scheduler then
ascertains the next processing step, and the cycle is repeated. The QoS monitor
continually monitors system performance and activates the load shedder when it
detects an overload situation and poor system performance. The load shedder
then sheds load until the performance of the system reaches an acceptable
level. The catalog contains information regarding the network topology, inputs,
outputs, QoS information, and relevant statistics (e.g., selectivity, average
box processing costs), which are used by essentially all components.
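The scheduling cycle described above can be sketched as a small single-threaded Python loop. This is only an illustration of the control flow, not Aurora's design: the class name `Runtime`, the longest-queue scheduling policy, and the queue-length threshold used to trigger load shedding are all assumptions made for the example.

```python
from collections import deque

class Runtime:
    def __init__(self, boxes, routes, max_queued=1000):
        self.boxes = boxes      # box name -> function(tuple) -> list of output tuples
        self.routes = routes    # box name -> downstream box name, or None for an application
        self.queues = {name: deque() for name in boxes}  # box queues (storage manager)
        self.delivered = []     # tuples forwarded to external applications
        self.max_queued = max_queued  # hypothetical overload threshold

    def route(self, tup, dest):
        """Router: forward to an application (None) or to a box queue."""
        if dest is None:
            self.delivered.append(tup)
        else:
            self.queues[dest].append(tup)

    def shed_load(self):
        """Load shedder (assumed policy): drop oldest tuples from the longest queue."""
        while sum(len(q) for q in self.queues.values()) > self.max_queued:
            max(self.queues.values(), key=len).popleft()

    def step(self):
        """One cycle: check for overload, pick a box, execute it, route outputs."""
        self.shed_load()
        name = max(self.queues, key=lambda n: len(self.queues[n]))  # scheduler (assumed policy)
        if not self.queues[name]:
            return False        # nothing left to process
        tup = self.queues[name].popleft()
        for out in self.boxes[name](tup):   # box processor executes the operation
            self.route(out, self.routes[name])
        return True

    def run(self):
        while self.step():
            pass
```

For example, with `boxes = {"filter": lambda t: [t] if t["v"] > 1 else [], "double": lambda t: [{"v": t["v"] * 2}]}` and `routes = {"filter": "double", "double": None}`, tuples placed on the "filter" queue via `route` flow through both boxes and end up in `delivered` after `run()`.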