Database Reference
In-Depth Information
(1) Selection : Apply a condition C to each tuple in the relation and produce as output only
those tuples that satisfy C . The result of this selection is denoted σ C ( R ).
(2) Projection : For some subset S of the attributes of the relation, produce from each tuple
only the components for the attributes in S . The result of this projection is denoted
π S ( R ).
(3) Union, Intersection , and Difference : These well-known set operations apply to the sets
of tuples in two relations that have the same schema. There are also bag (multiset) ver-
sions of the operations in SQL, with somewhat unintuitive definitions, but we shall not
go into the bag versions of these operations here.
(4) Natural Join : Given two relations, compare each pair of tuples, one from each relation.
If the tuples agree on all the attributes that are common to the two schemas, then pro-
duce a tuple that has components for each of the attributes in either schema and agrees
with the two tuples on each attribute. If the tuples disagree on one or more shared at-
tributes, then produce nothing from this pair of tuples. The natural join of relations R
and S is denoted R S . While we shall discuss executing only the natural join with
MapReduce, all equijoins (joins where the tuple-agreement condition involves equal-
ity of attributes from the two relations that do not necessarily have the same name) can
be executed in the same manner. We shall give an illustration in Example 2.4 .
(5) Grouping and Aggregation : 4 Given a relation R , partition its tuples according to their
values in one set of attributes G , called the grouping attributes . Then, for each group,
aggregate the values in certain other attributes. The normally permitted aggregations
are SUM, COUNT, AVG, MIN, and MAX, with the obvious meanings. Note that MIN
and MAX require that the aggregated attributes have a type that can be compared, e.g.,
numbers or strings, while SUM and AVG require that the type allow arithmetic opera-
tions. We denote a grouping-and-aggregation operation on a relation R by γ X ( R ), where
X is a list of elements that are either
(a) A grouping attribute, or
(b) An expression θ ( A ), where θ is one of the five aggregation operations such as SUM,
and A is an attribute not among the grouping attributes.
The result of this operation is one tuple for each group. That tuple has a component for
each of the grouping attributes, with the value common to tuples of that group. It also
has a component for each aggregation, with the aggregated value for that group. We
shall see an illustration in Example 2.5 .
EXAMPLE 2.4 Let us try to find the paths of length two in the Web, using the relation Links
of Fig. 2.5 . That is, we want to find the triples of URL's ( u, v, w ) such that there is a link
from u to v and a link from v to w . We essentially want to take the natural join of Links with
Search WWH ::




Custom Search