Snappy - previously known as Zippy
In Chapter 2, HDFS and MapReduce, we discussed the MapReduce flow in detail. Recall that
the map phase generates intermediate output files, which are then transferred to the
reducers for the reduce phase. These intermediate files can be compressed. Compression
reduces the amount of data written to disk and sent to the reducers, so the files can be
written and read faster.
Snappy is a compression/decompression library developed by Google that can be used to
compress these output files. Snappy favors compression and decompression speed over
compression ratio, which in turn improves the speed of the overall operation.
The following two properties need to be set in the mapred-site.xml file to enable Snappy
compression of the map output during MapReduce operations:
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
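The property names above are the older mapred.* names. Hadoop 2.x and later still accept them as deprecated aliases and use mapreduce.* names instead. As a sketch, assuming a cluster running one of those newer releases (the text does not say which version is in use), the equivalent settings in mapred-site.xml would look like this:

```xml
<!-- Assumption: Hadoop 2.x or later, where the mapred.* names are deprecated aliases -->
<property>
  <!-- Newer name for mapred.compress.map.output -->
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <!-- Newer name for mapred.map.output.compression.codec -->
  <name>mapreduce.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```

Either pair of names selects the same codec class, so only one pair needs to be present in the file.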