FileInputFormat
FileInputFormat is the base class for all implementations of InputFormat that use files as their data source (see Figure 8-2). It provides two things: a place to define which files are included as the input to a job, and an implementation for generating splits for the input files. The job of dividing splits into records is performed by subclasses.
Figure 8-2. InputFormat class hierarchy
FileInputFormat input paths
The input to a job is specified as a collection of paths, which offers great flexibility in constraining the input. FileInputFormat offers four static convenience methods for setting a Job's input paths:
public static void addInputPath(Job job, Path path)
public static void addInputPaths(Job job, String commaSeparatedPaths)
public static void setInputPaths(Job job, Path... inputPaths)
public static void setInputPaths(Job job, String commaSeparatedPaths)
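
As a rough illustration of how these four methods might be used together, the sketch below builds a Job and sets its input paths in two ways. It assumes the newer MapReduce API (org.apache.hadoop.mapreduce) is on the classpath. The class name InputPathsExample, the helper methods addPaths and replacePaths, and all directory names are hypothetical and chosen only for illustration. The add methods append to the job's list of inputs, whereas the set methods replace whatever was set before.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class InputPathsExample {

  // Appends three (hypothetical) directories to the job's input list:
  // one via a Path object, two via a comma-separated string.
  static void addPaths(Job job) throws IOException {
    FileInputFormat.addInputPath(job, new Path("/data/logs/2023"));
    FileInputFormat.addInputPaths(job, "/data/logs/2024-01,/data/logs/2024-02");
  }

  // Replaces any previously configured inputs with exactly these two paths.
  static void replacePaths(Job job) throws IOException {
    FileInputFormat.setInputPaths(job, new Path("/data/a"), new Path("/data/b"));
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "input-paths-example");

    addPaths(job);
    // getInputPaths returns the paths currently configured on the job.
    for (Path p : FileInputFormat.getInputPaths(job)) {
      System.out.println("after add: " + p);
    }

    replacePaths(job);
    for (Path p : FileInputFormat.getInputPaths(job)) {
      System.out.println("after set: " + p);
    }
  }
}
```

The paths do not need to exist for this sketch to run, since it only records them in the job's configuration. The printed paths are qualified against the default filesystem, so their exact form depends on the local Hadoop configuration.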