    public static class FirstComparator extends WritableComparator {

        private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

        public FirstComparator() {
            super(TextPair.class);
        }

        @Override
        public int compare(byte[] b1, int s1, int l1,
                           byte[] b2, int s2, int l2) {
            try {
                int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
                int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);
                return TEXT_COMPARATOR.compare(b1, s1, firstL1, b2, s2, firstL2);
            } catch (IOException e) {
                throw new IllegalArgumentException(e);
            }
        }

        @Override
        public int compare(WritableComparable a, WritableComparable b) {
            if (a instanceof TextPair && b instanceof TextPair) {
                return ((TextPair) a).first.compareTo(((TextPair) b).first);
            }
            return super.compare(a, b);
        }
    }
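The byte-level compare() above avoids deserializing the keys: it decodes only the variable-length prefix of each record to find where the first field ends, then compares the raw byte ranges. The same idea can be sketched in a standalone class with no Hadoop dependencies; note that RawCompareDemo, its single-byte length prefixes, and its helper methods are hypothetical simplifications for illustration, not Hadoop APIs.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch: each record is serialized as <length><bytes> for the
// first string followed by <length><bytes> for the second. Comparing two
// records on their first field means comparing only that byte range,
// without deserializing either record.
public class RawCompareDemo {

    // Serialize a pair of strings with single-byte length prefixes.
    // Assumes each string encodes to fewer than 128 bytes of UTF-8.
    static byte[] serialize(String first, String second) {
        byte[] f = first.getBytes(StandardCharsets.UTF_8);
        byte[] s = second.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[2 + f.length + s.length];
        out[0] = (byte) f.length;
        System.arraycopy(f, 0, out, 1, f.length);
        out[1 + f.length] = (byte) s.length;
        System.arraycopy(s, 0, out, 2 + f.length, s.length);
        return out;
    }

    // Compare only the first field, byte by byte.
    static int compareFirst(byte[] b1, byte[] b2) {
        int l1 = b1[0], l2 = b2[0];       // length prefixes
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int d = (b1[1 + i] & 0xff) - (b2[1 + i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return l1 - l2;                   // shorter prefix sorts first
    }

    public static void main(String[] args) {
        byte[] a = serialize("apple", "zzz");
        byte[] b = serialize("banana", "aaa");
        System.out.println(compareFirst(a, b) < 0); // prints true
    }
}
```

The real FirstComparator does the same thing, except that Writable serialization uses multi-byte VInts for the lengths, which is why it needs WritableUtils.decodeVIntSize() and readVInt() to locate the end of the first field.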
Serialization Frameworks
Although most MapReduce programs use Writable key and value types, this isn't mandated by the MapReduce API. In fact, any type can be used; the only requirement is a mechanism that translates to and from a binary representation of each type.

To support this, Hadoop has an API for pluggable serialization frameworks. A serialization framework is represented by an implementation of Serialization (in the org.apache.hadoop.io.serializer package). WritableSerialization, for example, is the implementation of Serialization for Writable types.
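The frameworks a job can use are registered through the io.serializations configuration property, a comma-separated list of Serialization class names whose default includes WritableSerialization. As a sketch, adding Hadoop's JavaSerialization to that list would let jobs use types that implement java.io.Serializable:

```xml
<!-- Configuration fragment: register serialization frameworks.
     WritableSerialization is in the default list; JavaSerialization
     is shown here as an illustrative addition. -->
<property>
  <name>io.serializations</name>
  <value>org.apache.hadoop.io.serializer.WritableSerialization,org.apache.hadoop.io.serializer.JavaSerialization</value>
</property>
```

Because standard Java serialization is compact neither in space nor in comparison cost, Writable types remain the usual choice; the pluggable API matters most when integrating frameworks with their own wire formats.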