Snappy - previously known as Zippy
Recall that the map phase generates intermediate output files, which are then transferred to
the reducers for the reduce phase. These intermediate files can be compressed, which reduces
the amount of data written to disk and sent over the network, so the files are written and
read faster. Snappy is a compression/decompression library developed by Google that can be
used to compress these output files. Snappy favors compression and decompression speed over
compression ratio, which in turn improves the speed of the overall operation.
The two properties shown in the following code need to be set in the mapred-site.xml file
to enable Snappy compression during MapReduce operations:
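<!-- Compress the intermediate output of map tasks -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<!-- Use Snappy as the codec for the compressed map output -->
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>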
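The two property names above are the older ones. As a minimal sketch, assuming a Hadoop 2 or later cluster, those names are treated as deprecated aliases there, and the current equivalents are mapreduce.map.output.compress and mapreduce.map.output.compress.codec. The fragment also shows the configuration root element that encloses the properties in mapred-site.xml. The choice of release is an assumption, so check the mapred-default.xml reference for the version in use before relying on these names.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml: sketch assuming Hadoop 2.x or later property names -->
<configuration>
  <!-- Compress the intermediate output of map tasks -->
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <!-- Use Snappy as the codec for the compressed map output -->
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>
```

Only the map output is affected by these settings. The final job output written by the reducers is controlled by separate properties.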