Database Reference
In-Depth Information
the network, so as to discover possible join partners. This incurs a huge
communication cost, which is disastrous in an energy-critical setting. An
alternative is to let all nodes send their tuples to the base station, where
the JOIN operations are performed. Although reduced, the transmission
volume of such an approach is still considerable. Besides, the base station
may become a bottleneck for massive data processing [36]. Yang et al.
[18] propose Two-Phase Self Join (TPSJ), which processes the above join in
two phases. TPSJ is energy-efficient and is applicable to queries having
the following three properties:
1 The join involves two copies of the same relation.
2 The tuples joined are within a specific time window.
3 There is a selection predicate in the "WHERE" clause.
TPSJ decomposes the original query into two sub-queries, which are
executed sequentially. As an example, the previous query Q6 is
decomposed into:
Q7:
SELECT P.pressure, P.time INTO R_1
FROM Pressure AS P
WHERE P.pressure > threshold ( )
Q8:
SELECT P.pressure, P.time
FROM R_1, Pressure AS P
WHERE P.pressure > R_1.pressure
AND window(R_1.time, P.time, h)
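The two sub-queries can be simulated in a few lines of Python. This is only a minimal sketch under stated assumptions, not the implementation of Yang et al.: the `Reading` record, the function names, the sample data, and the reading of window() as "timestamps at most h apart" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    node: int        # id of the sensor that produced the tuple (hypothetical field)
    pressure: float
    time: int


def in_window(t1, t2, h):
    # Assumed semantics of window(): the two timestamps are at most h apart.
    return abs(t1 - t2) <= h


def phase_one(readings_by_node, threshold):
    # Q7: every node applies the selection locally and sends only the
    # surviving tuples to the base station, where they form R1.
    r1 = []
    for readings in readings_by_node.values():
        r1.extend(r for r in readings if r.pressure > threshold)
    return r1


def phase_two(readings_by_node, r1, h):
    # Q8: R1 is sent back into the network and each node joins its
    # local tuples against it.
    results = []
    for readings in readings_by_node.values():
        for p in readings:
            for r in r1:
                if p.pressure > r.pressure and in_window(r.time, p.time, h):
                    results.append((p.pressure, p.time))
    return results


# Made-up sample data: three sensors, five tuples in total.
nodes = {
    1: [Reading(1, 10.0, 0), Reading(1, 55.0, 5)],
    2: [Reading(2, 60.0, 6), Reading(2, 12.0, 7)],
    3: [Reading(3, 70.0, 30)],
}

r1 = phase_one(nodes, threshold=50.0)
results = phase_two(nodes, r1, h=3)
print(len(r1), results)
```

With this sample data only three of the five tuples pass the selection and reach the base station in the first phase. The only join result is the reading of 60.0 at time 6, which exceeds the reading of 55.0 taken one time unit earlier.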
First, the base station issues Q7 to the sensors. After Q7 is executed,
a table R1, which contains all tuples satisfying the selection predicate, is
obtained at the base station. Then the base station issues Q8, and R1 is
injected into the network along with it. Since R1 contains all
join candidates, the correctness and completeness of the join results
are ensured. The benefit of TPSJ is that it avoids the unnecessary
transmission of tuples that do not join. The drawback is that
the table R1 has to be transmitted twice for one JOIN operation (first
to the root and then back into the network). However, since the
size of R1 is expected to be small, the overall reduction in transmission
is substantial. Mihaylov et al. [44] summarize three classes of join
strategies: the grouped join, the through-the-base join, and the pair-wise
join. In the grouped join, the tuples to be joined are sent to a specific node
using a distributed or geographic hash table. In the through-the-base join,
the tuples from one join party are routed to the join partner through the
base station. In the pair-wise join, the algorithm first establishes a path
between two join partners and then selects a node along this path to
perform the join operation. Furthermore, the authors build cost models