Output Formats
Hadoop has output data formats that correspond to the input formats covered in the previous section.
Figure 8-4. OutputFormat class hierarchy
Text Output
The default output format, TextOutputFormat, writes records as lines of text. Its keys and
values may be of any type, since TextOutputFormat converts them to strings by calling
toString() on them. The key and value in each pair are separated by a tab character, although
the separator may be changed using the mapreduce.output.textoutputformat.separator property.
The counterpart to TextOutputFormat for reading in this case is KeyValueTextInputFormat, since
it breaks lines into key-value pairs based on a configurable separator (see
KeyValueTextInputFormat).
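The line format described above can be illustrated with a minimal sketch in plain Java. This is not Hadoop's own code: the class TextLineSketch and the method formatRecord are hypothetical names introduced only for illustration, and the sketch assumes that a record is simply the key's toString() result, the separator, and the value's toString() result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the line formatting described in the text;
// it does not use or reproduce any Hadoop classes.
final class TextLineSketch {

    // The default separator described in the text: a single tab character.
    static final String DEFAULT_SEPARATOR = "\t";

    // Builds one output line. Key and value may be of any type because
    // both are converted to text with toString().
    static String formatRecord(Object key, Object value, String separator) {
        return key.toString() + separator + value.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("hadoop", 3);
        counts.put("hive", 1);
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            // Tab-separated line, then the same record with a comma separator.
            System.out.println(formatRecord(e.getKey(), e.getValue(), DEFAULT_SEPARATOR));
            System.out.println(formatRecord(e.getKey(), e.getValue(), ","));
        }
    }
}
```

In an actual job, the separator would instead be changed by setting the property on the job's Configuration, for example conf.set("mapreduce.output.textoutputformat.separator", ","), where conf is assumed to be the Configuration instance used to create the job.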
You can suppress the key or the value from the output (or both, making this output format
equivalent to NullOutputFormat, which emits nothing) using a NullWritable