Database Reference
In-Depth Information
In April 2008, Hadoop won the general-purpose terabyte sort benchmark (as discussed in
A Brief History of Apache Hadoop ) , and one of the optimizations used was keeping the
intermediate data in memory on the reduce side.
More generally, Hadoop uses a buffer size of 4 KB by default, which is low, so you should
increase this across the cluster (by setting io.file.buffer.size ; see also Other Ha-
doop Properties ) .
Search WWH ::




Custom Search