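import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class WholeFileInputFormat
    extends FileInputFormat<NullWritable, BytesWritable> {
  // Never split a file: each file must reach a single mapper intact.
  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    return false;
  }

  // Hand back a reader that emits the whole file as one record.
  @Override
  public RecordReader<NullWritable, BytesWritable> createRecordReader(
      InputSplit split, TaskAttemptContext context) throws IOException,
      InterruptedException {
    WholeFileRecordReader reader = new WholeFileRecordReader();
    reader.initialize(split, context);
    return reader;
  }
}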
WholeFileInputFormat defines a format in which the keys are unused (represented
by NullWritable) and the values are the file contents (represented by
BytesWritable instances). It overrides two methods. First, it specifies that input
files should never be split, by overriding isSplitable() to return false. Second,
it implements createRecordReader() to return a custom implementation of
RecordReader, which appears in Example 8-3.
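The one-record-per-file contract can be tried out without Hadoop on the classpath. The sketch below is not the book's WholeFileRecordReader. It is a plain-Java analogue under assumed names (WholeFileReaderSketch and its methods are hypothetical) that uses java.nio.file in place of FileSplit and a byte array in place of BytesWritable. It only mirrors the calling protocol: nextKeyValue() returns true once, after which the value holds the entire file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical, Hadoop-free analogue of a whole-file record reader.
// It models only the nextKeyValue()/getCurrentValue() protocol.
class WholeFileReaderSketch {
    private final Path file;            // stands in for the unsplit input file
    private byte[] value = new byte[0]; // stands in for the BytesWritable value
    private boolean processed = false;  // true once the single record was read

    WholeFileReaderSketch(Path file) {
        this.file = file;
    }

    // Returns true exactly once, after loading the entire file as the value.
    boolean nextKeyValue() throws IOException {
        if (processed) {
            return false;
        }
        value = Files.readAllBytes(file);
        processed = true;
        return true;
    }

    // The key is unused, so only the value accessor is modelled here.
    byte[] getCurrentValue() {
        return value;
    }

    // Progress is all-or-nothing because there is only one record.
    float getProgress() {
        return processed ? 1.0f : 0.0f;
    }
}
```

A caller would loop on nextKeyValue() as it would with any record reader; here the loop body runs once per file. Reading the whole file into memory is the price of this design, so it suits small files only.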
Example 8-3. The RecordReader used by WholeFileInputFormat for reading a whole file
as a record
class WholeFileRecordReader extends RecordReader<NullWritable,
    BytesWritable> {
  private FileSplit fileSplit;
  private Configuration conf;
  private BytesWritable value = new BytesWritable();
  private boolean processed = false;

  @Override
  public void initialize(InputSplit split, TaskAttemptContext context)
      throws IOException, InterruptedException {
    this.fileSplit = (FileSplit) split;
    this.conf = context.getConfiguration();
  }

  @Override
  public boolean nextKeyValue() throws IOException,