Database Reference
In-Depth Information
Table 1. Notations used in cost estimation of X-HYBRIDJOIN
Parameter name
Symbol
Total allocated memory ( bytes )
M
Service rate ( processed tuples / sec )
μ
Input size (=number of matching tuples in previous iteration)
w
Stream tuple size ( bytes )
v S
Size of each swappable and non-swappable part ( bytes )(=sizeof
1 disk partition)
v P
Size of disk tuple ( bytes )
v R
d T = v v R
Size of each swappable and non-swappable part ( tuples )
Memory weight for the hash table
α
Memory weight for the queue
1- α
Cost to read one disk partition into the disk buffer ( nanosecs )
c I/O ( v P )
Cost to lookup one tuple in the hash table ( nanosecs )
c H
Cost to generate the output for one tuple ( nanosecs )
c O
Cost to remove one tuple from the hash table and the queue ( nanosecs )
c E
Cost to read one stream tuple into the stream buffer ( nanosecs )
c S
Cost to append one tuple into hash table and the queue ( nanosecs )
c A
Total cost for one loop iteration of X-HYBRIDJOIN ( secs )
c loop
2 v P ).
The total memory used by X-HYBRIDJOIN can be determined by aggregat-
ing all of the above.
α )( M
Memory for the queue = (1
M =2 v P + α ( M
2 v P )+(1
α )( M
2 v P )
(1)
Currently we do not include the memory reserved by the stream buffer because
of its small size (0.05 MB has been sucient in all our experiments).
Processing cost. In this section, we calculate the processing cost for the pro-
posed X-HYBRIDJOIN. We denote the cost for one loop iteration of the algo-
rithm as c loop and express it as the sum of the costs for the individual operations.
We first calculate the processing cost for each component separately.
Cost to read swappable or non-swappable parts of the disk buffer= c I/O ( v P ).
Cost to look-up swappable and non-swappable parts of the disk buffer in the
hash table = d T .c H + d T .c H =2 d T .c H (in the case of HYBRIDJOIN it is d T .c H
only).
Cost to generate the output for w matching tuples = w.c O .
Cost to remove w tuples from the hash table and the queue = w.c E .
Cost to read w tuples from stream S into the stream buffer = w.c S .
Cost to append w tuples in the hash table and the queue = w.c A .
As the non-swappable part of the disk buffer is read only once before the execu-
tion starts we exclude it. By aggregating the terms, the total cost for one loop
iteration is:
c loop ( secs )=10 9 [ c I/O ( v P )+2 d T .c H + w ( c O + c E + c S + c A )]
(2)
Search WWH ::




Custom Search