[64] In some applications, it's common for some of the input to already be sorted, or at least partially sorted. For example, the weather dataset is ordered by time, which may introduce certain biases, making the RandomSampler a safer choice.
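To illustrate the bias mentioned in this note, here is a minimal sketch in plain Java. It does not use Hadoop's sampler classes; the class name SamplingSketch and its methods are hypothetical, and the keep-each-record-with-a-fixed-probability strategy is an assumption chosen for simplicity. It compares a sample taken from the start of sorted input with one drawn at random.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Hadoop's sampler implementation.
class SamplingSketch {

    // Takes the first n records. On sorted input these all come from the low end.
    static List<Integer> firstN(List<Integer> input, int n) {
        return new ArrayList<>(input.subList(0, Math.min(n, input.size())));
    }

    // Keeps each record independently with probability freq (assumed strategy).
    static List<Integer> randomSample(List<Integer> input, double freq, long seed) {
        Random random = new Random(seed);
        List<Integer> sample = new ArrayList<>();
        for (Integer record : input) {
            if (random.nextDouble() < freq) {
                sample.add(record);
            }
        }
        return sample;
    }

    // Distance between the smallest and largest sampled key (0 for an empty sample).
    static int spread(List<Integer> sample) {
        if (sample.isEmpty()) {
            return 0;
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int key : sample) {
            min = Math.min(min, key);
            max = Math.max(max, key);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Keys 0..9999 in sorted order stand in for time-ordered input.
        List<Integer> sorted = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            sorted.add(i);
        }
        System.out.println("first-100 spread: " + spread(firstN(sorted, 100)));
        System.out.println("random spread:    " + spread(randomSample(sorted, 0.01, 42L)));
    }
}
```

The first sample covers only the lowest keys, so partition boundaries derived from it would put almost all records in the last partition. The random sample covers keys from across the whole range.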
[65] For simplicity, these custom comparators as shown are not optimized; see "Implementing a RawComparator for speed" for the steps we would need to take to make them faster.
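As a rough idea of what such an optimization involves, the following sketch compares two serialized integers directly from their bytes, without building objects first. It is written in plain Java with no Hadoop dependency. The class name RawIntCompareSketch is hypothetical, the compare method only mirrors the shape of a raw byte comparison (byte array, start offset, length for each side), and the 4-byte big-endian encoding is an assumption.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; it does not implement Hadoop's RawComparator interface.
class RawIntCompareSketch {

    // Reads a 4-byte big-endian int starting at the given offset.
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 24)
             | ((bytes[start + 1] & 0xff) << 16)
             | ((bytes[start + 2] & 0xff) << 8)
             | (bytes[start + 3] & 0xff);
    }

    // Compares two serialized ints in place. The length arguments are unused here
    // because the assumed encoding is fixed width.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    // Serializes an int as 4 big-endian bytes (ByteBuffer's default byte order).
    static byte[] serialize(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        byte[] a = serialize(3);
        byte[] b = serialize(-7);
        // Prints a positive number because 3 > -7.
        System.out.println(compare(a, 0, a.length, b, 0, b.length));
    }
}
```

The saving comes from skipping deserialization into objects for every comparison, which matters because a sort compares keys many times.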
doop.filecache.DistributedCache.
[67] In Hadoop 1, localized files were not always symlinked, so it was sometimes necessary to retrieve localized file paths using methods on JobContext. This limitation was removed in Hadoop 2.