# Returns a message about number being checked as prime or not
def find_primes(number):
    # Echo the number being checked (printed on the engine that runs it)
    print(number)
    return '%d is prime? %r' % (number, prime_check(number))
# Add our functions to the namespace of our running engines
dview.push({'find_primes': find_primes})
dview.push({'prime_check': prime_check})
# Generate some random large integers
np.random.seed(seed=12345)
possible_primes = np.random.random_integers(1000000, 20000000, 10).tolist()
# Run the functions on our cluster
results = dview.map(find_primes, possible_primes)
# Print the results to stdout
for result in results.get():
    print(result)
# time ipython prime_finder.py
# Result:
# 17645405 is prime? False
# ...
# 1667625154 is prime? False
# time output:
# real 0m1.711s
On my multicore laptop, the parallelized version using six engines took
just over 1.7 seconds, a significant speed improvement. If you have access to a
cluster of multicore machines, you could possibly speed up this type of brute-force
application even more, with some additional configuration work. Remember that at
some point the problem becomes I/O-bound, and network latency may cause
performance issues.
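The listing above depends on a running IPython cluster and on names (dview, prime_check) defined earlier. As a rough, self-contained sketch of the same map-over-workers pattern, the following uses the standard library's multiprocessing.Pool in place of IPython engines. It assumes Python 3.8 or later, and the trial-division prime_check shown here is only an assumed stand-in, not necessarily the version used earlier in the chapter. The run_parallel helper and the sample candidates are likewise hypothetical.

```python
import math
import time
from multiprocessing import Pool


def prime_check(number):
    """Trial division; an assumed stand-in for the chapter's prime_check."""
    if number < 2:
        return False
    if number % 2 == 0:
        return number == 2
    # Only odd divisors up to the integer square root need testing
    for divisor in range(3, math.isqrt(number) + 1, 2):
        if number % divisor == 0:
            return False
    return True


def find_primes(number):
    # Same message format as the listing above
    return '%d is prime? %r' % (number, prime_check(number))


def run_parallel(numbers, workers=4):
    # Hypothetical helper: Pool.map splits the inputs across worker
    # processes and returns the results in input order
    with Pool(processes=workers) as pool:
        return pool.map(find_primes, numbers)


if __name__ == '__main__':
    candidates = [17645405, 15485863, 1000003]

    start = time.perf_counter()
    serial = [find_primes(n) for n in candidates]
    print('serial:   %.3fs' % (time.perf_counter() - start))

    start = time.perf_counter()
    parallel = run_parallel(candidates)
    print('parallel: %.3fs' % (time.perf_counter() - start))

    for line in parallel:
        print(line)
```

With inputs this small, process start-up cost may well outweigh any gain. The pattern is more likely to pay off when each check is expensive and there are many candidates, which is the situation the timing above describes.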
Summary
R's functional programming model and massive collection of libraries have made it the
de facto open-source science and statistics language. At the same time, Python has
come of age as a productive programming language for memory-intensive data
applications. The sheer number of Python developers and the ease of development give
Python a unique advantage over other methods of building CPU-bound data
applications. Python can often be the easiest way to solve a wide variety of data challenges
in the shortest amount of time. Building an application using a more general-purpose
 
 