-combiner ch02-mr-intro/src/main/ruby/max_temperature_reduce.rb \
-reducer ch02-mr-intro/src/main/ruby/max_temperature_reduce.rb
Note also the use of -files, which ships the scripts to the cluster when we run
Streaming programs there.
Python
Streaming supports any programming language that can read from standard input and
write to standard output, so for readers more familiar with Python, here's the same
example again. [24] The map script is in Example 2-9, and the reduce script is in
Example 2-10.
Example 2-9. Map function for maximum temperature in Python
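#!/usr/bin/env python

import re
import sys

# Read weather records from standard input, one per line
for line in sys.stdin:
    val = line.strip()
    # Fixed-width fields: year, air temperature, quality code
    (year, temp, q) = (val[15:19], val[87:92], val[92:93])
    # Skip missing readings (+9999) and unacceptable quality codes
    if temp != "+9999" and re.match("[01459]", q):
        print("%s\t%s" % (year, temp))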
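To see what the fixed-width slices in the map script pick out, the following sketch applies the same offsets and the same filter to a synthetic record. The helper names make_record and parse_record, and the zero-padded record itself, are made up for illustration; a real input line carries many more fields than the three used here.

```python
import re


def make_record(year, temp, quality):
    """Build a synthetic 93-character line (hypothetical helper).

    Only the three fields the map script reads are filled in;
    every other column is padded with "0".
    """
    chars = ["0"] * 93
    chars[15:19] = year      # columns 15-18: year
    chars[87:92] = temp      # columns 87-91: signed temperature
    chars[92:93] = quality   # column 92: quality code
    return "".join(chars)


def parse_record(line):
    """Return (year, temp) for a usable reading, or None otherwise."""
    val = line.strip()
    year, temp, q = val[15:19], val[87:92], val[92:93]
    # Same filter as the map script: drop missing values and bad quality codes
    if temp != "+9999" and re.match("[01459]", q):
        return year, temp
    return None


if __name__ == "__main__":
    print(parse_record(make_record("1950", "+0022", "1")))  # usable reading
    print(parse_record(make_record("1950", "+9999", "1")))  # missing value
    print(parse_record(make_record("1950", "-0011", "2")))  # rejected quality code
```

The first call returns the pair ('1950', '+0022'); the other two return None, so the map script would emit nothing for those lines.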
Example 2-10. Reduce function for maximum temperature in Python
#!/usr/bin/env python

import sys

# Start below any possible temperature so the first value always wins
(last_key, max_val) = (None, -sys.maxsize)
for line in sys.stdin:
    (key, val) = line.strip().split("\t")
    if last_key and last_key != key:
        # Key changed: emit the maximum for the previous key
        print("%s\t%s" % (last_key, max_val))
        (last_key, max_val) = (key, int(val))
    else:
        (last_key, max_val) = (key, max(max_val, int(val)))

# Emit the maximum for the final key
if last_key:
    print("%s\t%s" % (last_key, max_val))
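Because the reduce script only compares each key with the one before it, it relies on its input arriving sorted by key. A minimal sketch of the same logic as a plain function, run on hand-made pairs, makes that dependency visible. The function name max_by_key and the sample values are illustrative assumptions, not part of the scripts above.

```python
def max_by_key(lines):
    """Yield (key, maximum value) for tab-separated lines sorted by key."""
    last_key, max_val = None, None
    for line in lines:
        key, val = line.strip().split("\t")
        if last_key is not None and last_key != key:
            # Key changed: the previous key's group is complete
            yield last_key, max_val
            last_key, max_val = key, int(val)
        else:
            last_key = key
            max_val = int(val) if max_val is None else max(max_val, int(val))
    # Flush the final group, if there was any input at all
    if last_key is not None:
        yield last_key, max_val


if __name__ == "__main__":
    # Made-up map output, deliberately out of order
    pairs = ["1950\t+0022", "1949\t+0111", "1950\t-0011",
             "1949\t+0078", "1950\t+0000"]
    # sorted() stands in for the sort between the map and reduce phases
    for year, temp in max_by_key(sorted(pairs)):
        print("%s\t%s" % (year, temp))
```

With the sorted input this prints one line per year (1949 with 111, then 1950 with 22). Passing the unsorted list instead would emit a separate result each time the key changes, which is why the sort step cannot be skipped when testing locally.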
We can test the programs and run the job in the same way we did in Ruby. For example, to
run a test: