Table 8-4. Input path and filter properties

| Property name | Type | Default value | Description |
| --- | --- | --- | --- |
| mapreduce.input.fileinputformat.inputdir | Comma-separated paths | None | The input files for a job. Paths that contain commas should have those commas escaped by a backslash character. For example, the glob {a,b} would be escaped as {a\,b}. |
| mapreduce.input.pathFilter.class | PathFilter classname | None | The filter to apply to the input files for a job. |

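The comma-escaping rule for the input path property can be illustrated with a small, self-contained sketch. The class and method names here (InputDirValue, escapeCommas, join) are hypothetical and are not part of Hadoop; the sketch handles only commas and ignores any other escaping a real job setup might need. In an actual job, the paths would normally be added through FileInputFormat's own helper methods rather than by building the string by hand.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not a Hadoop class): builds a value suitable for a
// comma-separated path property, escaping commas that belong to a path.
final class InputDirValue {

    // Escape every comma inside a single path with a backslash,
    // so that "{a,b}" becomes "{a\,b}".
    static String escapeCommas(String path) {
        return path.replace(",", "\\,");
    }

    // Escape each path, then join the paths with unescaped commas,
    // which act as the separators between paths.
    static String join(List<String> paths) {
        return paths.stream()
                .map(InputDirValue::escapeCommas)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Prints: /data/2024,/logs/{a\,b}
        System.out.println(join(List.of("/data/2024", "/logs/{a,b}")));
    }
}
```

The separator commas stay unescaped, so a reader of the property can tell the two paths apart from the comma inside the glob.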
FileInputFormat input splits
Given a set of files, how does FileInputFormat turn them into splits?
FileInputFormat splits only large files, where “large” means larger than an HDFS block.
The split size is normally the size of an HDFS block, which is appropriate for most
applications; however, it is possible to control this value by setting various Hadoop
properties, as shown in Table 8-5.
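
As a rough illustration of how a minimum and maximum bound can override the block size, the sketch below mirrors the formula used by Hadoop's FileInputFormat, max(minimumSize, min(maximumSize, blockSize)). The class name SplitSizeSketch is hypothetical, and the assumption that the bounds come from the properties mapreduce.input.fileinputformat.split.minsize and mapreduce.input.fileinputformat.split.maxsize is drawn from Hadoop itself rather than from this excerpt, so treat it as a sketch rather than the definitive behavior of any particular release.

```java
// Hypothetical standalone sketch (not a Hadoop class) of the split-size rule.
final class SplitSizeSketch {

    // Clamp the block size between the configured minimum and maximum.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        final long blockSize = 128 * mb;

        // Wide-open bounds: the split size equals the block size.
        System.out.println(computeSplitSize(blockSize, 1L, Long.MAX_VALUE)); // 134217728

        // A maximum below the block size forces smaller splits.
        System.out.println(computeSplitSize(blockSize, 1L, 64 * mb)); // 67108864

        // A minimum above the block size forces larger splits.
        System.out.println(computeSplitSize(blockSize, 256 * mb, Long.MAX_VALUE)); // 268435456
    }
}
```

With the bounds left wide open, the result is simply the block size, which matches the normal case described above.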