1. Do all your file I/O in the main process, but don't have files open when you invoke
the multiprocessing features.
2. Multiple subprocesses can safely read from the same file, but each one should
open the file only after its process has been created.
3. Have each subprocess write to a different file, and merge them when finished.
Figure 9-2 shows workflow (1). The initial process is responsible for file I/O, and
communicates with the subprocesses through queues and other multiprocessing
constructs.
Figure 9-2. Multiprocessing-based approach to using HDF5
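As a concrete illustration of this pattern, here is a minimal sketch using h5py in
which only the main process ever touches the file; the file name data.hdf5, the
dataset names, and the trivial "double each block" computation are placeholders
chosen for the sketch, not anything from the original text:

import numpy as np
import h5py
from multiprocessing import Process, Queue

def worker(task_q, result_q):
    # Receive (index, array) tasks; send back (index, result) pairs.
    # The worker never touches the HDF5 file itself.
    while True:
        task = task_q.get()
        if task is None:                    # sentinel: no more work
            break
        idx, block = task
        result_q.put((idx, block * 2.0))    # stand-in computation

if __name__ == '__main__':
    task_q, result_q = Queue(), Queue()

    # Start the workers before any file is opened
    workers = [Process(target=worker, args=(task_q, result_q))
               for _ in range(2)]
    for w in workers:
        w.start()

    # All file I/O happens here, in the main process
    with h5py.File('data.hdf5', 'w') as f:
        dset = f.create_dataset('mydata', data=np.arange(100, dtype='f8'))
        nblocks, blocksize = 4, 25
        for i in range(nblocks):
            task_q.put((i, dset[i*blocksize:(i+1)*blocksize]))  # reads
        out = f.create_dataset('doubled', shape=dset.shape, dtype='f8')
        for _ in range(nblocks):
            idx, res = result_q.get()
            out[idx*blocksize:(idx+1)*blocksize] = res          # writes

    for w in workers:
        task_q.put(None)                    # shut the workers down
    for w in workers:
        w.join()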
One mechanism for “Pythonic” parallel computation is to use “process pools” that
distribute the work among worker processes. These are instances of
multiprocessing.Pool, which among other things have a parallel equivalent of the
built-in map():
>>> from multiprocessing import Pool
>>> p = Pool(2)  # Create a 2-process pool
>>> words_in = ['hello', 'some', 'words']
>>> words_out = p.map(str.upper, words_in)  # parallel equivalent of map()
>>> print(words_out)
['HELLO', 'SOME', 'WORDS']
Here's an example of using HDF5 with Pool . Suppose we had a file containing a 1D
dataset of coordinate pairs, and we wanted to compute their distance from the origin.
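The sketch below shows one way this might look; the file name coords.hdf5 and
the dataset names coords and distances are assumptions for the sketch, and,
following workflow (1), all file access stays in the main process:

import numpy as np
import h5py
from multiprocessing import Pool

def distance(arr):
    # Euclidean distance from the origin for one (x, y) pair
    return np.sqrt(np.sum(arr**2))

if __name__ == '__main__':
    # Read the data and close the file, so no file handles are open
    # when the pool is created
    with h5py.File('coords.hdf5', 'r') as f:
        data = f['coords'][...]   # assumed shape (N, 2): one pair per row

    p = Pool(4)                   # create a 4-process pool
    result = np.array(p.map(distance, data))  # one task per pair
    p.close()
    p.join()

    # Write the results back, again from the main process only
    with h5py.File('coords.hdf5', 'a') as f:
        f['distances'] = result

To experiment with this, a suitable input file can be created beforehand with,
for example, f['coords'] = np.random.random((1000, 2)).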
 