Database Reference
In-Depth Information
Tutorial Links
There's an excellent overview of the technology as well as a tutorial available on this web
page .
Example Code
We'll use streaming to compute the average ranking for Dune . Let's start with our small
dataset:
Kevin,Dune,10
Marshall,Dune,1
Kevin,Casablanca,5
Bob,Blazing Saddles,9
The mapper function could be:
#! /usr/bin/python
import sys
for line in sys.stdin:
line = line.strip()
keys = line.split(',')
print( "%s\t%s" % (keys[1], keys[2]) )
The reducer function could be:
#!/usr/bin/python
import sys
count = 0
rating_sum = 0
for input_line in sys.stdin:
input_line = input_line.strip()
title, rating = input_line.split("\t", 1)
rating = float(rating)
if title == 'Dune':
count += 1
rating_sum += rating
Search WWH ::




Custom Search