Similarly, even though the map output types and the reduce input types must match, this is
not enforced by the Java compiler.
The type parameters are named differently from the abstract types (KEYIN versus K1, and so on), but the form is the same.
If a combiner function is used, then it has the same form as the reduce function (and is an implementation of Reducer), except its output types are the intermediate key and value types (K2 and V2), so they can feed the reduce function:
map: (K1, V1) → list(K2, V2)
combiner: (K2, list(V2)) → list(K2, V2)
reduce: (K2, list(V2)) → list(K3, V3)
Often the combiner and reduce functions are the same, in which case K3 is the same as K2, and V3 is the same as V2.
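The type flow above can be traced with a small in-memory sketch. It is plain Java, not Hadoop's API: the class TypeFlowSketch and its methods map, sum, group, and run are hypothetical names chosen for illustration, and the word-count job is an assumed example. Here K1 = Long, V1 = String, and K2 = K3 = String, V2 = V3 = Integer, so one function serves as both combiner and reducer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the type signatures; this is not Hadoop's API.
class TypeFlowSketch {

    // map: (K1, V1) -> list(K2, V2), here (Long, String) -> list(String, Integer)
    static List<Map.Entry<String, Integer>> map(Long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Used as combiner AND reducer: (K2, list(V2)) -> list(K2, V2), so K3 = K2 and V3 = V2.
    static List<Map.Entry<String, Integer>> sum(String key, List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        Map.Entry<String, Integer> result = new SimpleEntry<>(key, total);
        return Collections.singletonList(result);
    }

    // Groups pairs by key; a stand-in for the framework's sort and shuffle.
    static Map<String, List<Integer>> group(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Each string in 'splits' plays the role of one map task's input.
    static Map<String, Integer> run(List<String> splits) {
        List<Map.Entry<String, Integer>> combined = new ArrayList<>();
        long offset = 0;
        for (String split : splits) {
            List<Map.Entry<String, Integer>> mapped = map(offset, split);
            offset += split.length();
            // Combiner step: runs on one map task's output only.
            for (Map.Entry<String, List<Integer>> e : group(mapped).entrySet()) {
                combined.addAll(sum(e.getKey(), e.getValue()));
            }
        }
        // Reduce step: sees the combined output of all map tasks.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : group(combined).entrySet()) {
            for (Map.Entry<String, Integer> r : sum(e.getKey(), e.getValue())) {
                result.put(r.getKey(), r.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> splits = new ArrayList<>();
        splits.add("a b a");
        splits.add("b a");
        System.out.println(run(splits)); // prints {a=3, b=2}
    }
}
```

Summation can be reused as the combiner because addition is associative and commutative, so partial sums per map task give the same final totals. A function without that property (a mean, for example) would not be safe to reuse this way.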
The partition function operates on the intermediate key and value types (K2 and V2) and returns the partition index. In practice, the partition is determined solely by the key (the value is ignored):
partition: (K2, V2) → integer
Or in Java:
public abstract class Partitioner<KEY, VALUE> {
  public abstract int getPartition(KEY key, VALUE value, int numPartitions);
}
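As an illustration of a partition function that depends only on the key, here is a minimal hash-based sketch. So that it compiles without Hadoop on the classpath, it declares a local stand-in, PartitionerSketch, that mirrors the signature shown above. The names PartitionerSketch, HashPartitionerSketch, and PartitionDemo are hypothetical, and the sample keys are made up.

```java
// Local stand-in mirroring the Partitioner signature, so the sketch needs no Hadoop dependency.
abstract class PartitionerSketch<KEY, VALUE> {
    public abstract int getPartition(KEY key, VALUE value, int numPartitions);
}

// Hash-based partitioning: the index is derived from the key alone.
class HashPartitionerSketch<KEY, VALUE> extends PartitionerSketch<KEY, VALUE> {
    @Override
    public int getPartition(KEY key, VALUE value, int numPartitions) {
        // Clear the sign bit so a negative hashCode cannot yield a negative index;
        // the value argument is deliberately unused.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

class PartitionDemo {
    public static void main(String[] args) {
        PartitionerSketch<String, Integer> partitioner = new HashPartitionerSketch<>();
        int numPartitions = 3; // assumed number of reduce tasks
        for (String key : new String[] {"1949", "1950", "1951"}) {
            int index = partitioner.getPartition(key, 0, numPartitions);
            System.out.println(key + " -> partition " + index);
        }
    }
}
```

Because the index depends only on the key, every record with a given key lands in the same partition regardless of its value, which is what allows a single reduce call to see all values for that key.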