WARNING
In some situations, the byte array returned by the getBytes() method may be longer than the length returned by getLength():
    Text t = new Text("hadoop");
    t.set(new Text("pig"));
    assertThat(t.getLength(), is(3));
    assertThat("Byte length not shortened", t.getBytes().length, is(6));
This shows why it is imperative that you always call getLength() when calling getBytes(), so you know how much of the byte array is valid data.
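The pitfall can be illustrated without Hadoop on the classpath. The following is a minimal sketch using only the JDK: the hypothetical `backing` array stands in for what getBytes() might return after "hadoop" has been overwritten with "pig", and `length` stands in for the value of getLength(). The class and method names are invented for illustration, and the buffer contents are an assumption about how a reused buffer would look, not a description of Hadoop's internals.

```java
import java.nio.charset.StandardCharsets;

class ValidBytesDemo {
    // Decodes only the first `length` bytes of `backing` as UTF-8,
    // ignoring any stale bytes that follow.
    static String decodeValid(byte[] backing, int length) {
        return new String(backing, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed state: a 6-byte buffer that held "hadoop", reused for "pig".
        byte[] backing = "pigoop".getBytes(StandardCharsets.UTF_8);
        int length = 3; // stands in for getLength()

        // Respecting the length yields the intended value.
        System.out.println(decodeValid(backing, length));
        // Ignoring the length drags in the stale tail of the buffer.
        System.out.println(decodeValid(backing, backing.length));
    }
}
```

Decoding the whole array instead of the first `length` bytes is the mistake the warning describes: the result silently includes leftover data from the previous value.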
Resorting to String
Text doesn't have as rich an API for manipulating strings as java.lang.String, so in many cases, you need to convert the Text object to a String. This is done in the usual way, using the toString() method:
    assertThat(new Text("hadoop").toString(), is("hadoop"));
BytesWritable
BytesWritable is a wrapper for an array of binary data. Its serialized format is a 4-byte integer field that specifies the number of bytes to follow, followed by the bytes themselves. For example, the byte array of length 2 with values 3 and 5 is serialized as a 4-byte integer (00000002) followed by the two bytes from the array (03 and 05):
    BytesWritable b = new BytesWritable(new byte[] { 3, 5 });
    byte[] bytes = serialize(b);
    assertThat(StringUtils.byteToHexString(bytes), is("000000020305"));
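To make the length-prefixed layout concrete, here is a minimal sketch that reproduces the same byte sequence with plain JDK streams. It is not Hadoop's implementation: the class `LengthPrefixedDemo` and its `serialize` and `toHex` helpers are hypothetical names introduced for illustration, and they only mimic the format described above (a 4-byte big-endian length followed by the payload).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LengthPrefixedDemo {
    // Writes a 4-byte big-endian length, then the payload bytes.
    static byte[] serialize(byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeInt(payload.length);
            data.write(payload);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Renders each byte as two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(serialize(new byte[] {3, 5}))); // 000000020305
    }
}
```

The first eight hex digits are the length field and the remaining four are the two payload bytes, matching the value asserted in the example above.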
BytesWritable is mutable, and its value may be changed by calling its set() method. As with Text, the size of the byte array returned from the getBytes() method for BytesWritable (the capacity) may not reflect the actual size of the data stored in the BytesWritable. You can determine the size of the BytesWritable by calling getLength(). To demonstrate:
    b.setCapacity(11);
    assertThat(b.getLength(), is(2));
    assertThat(b.getBytes().length, is(11));
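When downstream code needs an array that contains exactly the stored data, one option is to copy the valid prefix out of the backing array. The sketch below uses only the JDK and simulates the state shown above: `backing` stands in for the 11-byte array from getBytes() and `length` for the value of getLength(). The class `TrimToLengthDemo` and its `trim` helper are hypothetical and are not part of the Hadoop API.

```java
import java.util.Arrays;

class TrimToLengthDemo {
    // Returns a new array holding exactly the first `length` bytes of `backing`.
    static byte[] trim(byte[] backing, int length) {
        return Arrays.copyOf(backing, length);
    }

    public static void main(String[] args) {
        // Simulated state after setCapacity(11): 11-byte buffer, 2 valid bytes.
        byte[] backing = new byte[11];
        backing[0] = 3;
        backing[1] = 5;
        int length = 2; // stands in for getLength()

        System.out.println(Arrays.toString(trim(backing, length))); // [3, 5]
    }
}
```

The copy costs an allocation, so code on a hot path may prefer to pass the backing array together with its length instead.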