How it works…
There are a couple of things we should talk about here. Primarily, we need to look at chunking the inputs for pmap, but we should also discuss Monte Carlo methods.
Estimating with Monte Carlo simulations
Monte Carlo simulations work by throwing random data at a problem that is fundamentally deterministic, but for which a more straightforward solution is practically infeasible. Calculating pi is one example of this. By randomly filling in points in a unit square, the ratio of the points that fall within a quarter circle of radius 1 centered at (0, 0) will approximate π/4. The more random points we use, the better the approximation is.
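The estimate described above can be sketched as follows. This is a minimal illustration, not the recipe's own code; the function name and sample count are illustrative choices.

```clojure
;; Estimate pi by sampling n random points in the unit square and
;; counting how many fall inside the quarter circle of radius 1.
(defn estimate-pi [n]
  (let [in-circle (count
                    (filter (fn [_]
                              (let [x (rand), y (rand)]
                                ;; Is the point within distance 1 of (0, 0)?
                                (<= (+ (* x x) (* y y)) 1.0)))
                            (range n)))]
    ;; in-circle / n approximates pi/4, so multiply by 4.
    (* 4.0 (/ in-circle n))))

(estimate-pi 1000000)  ; roughly 3.14; accuracy improves with larger n
```

Note how slowly the accuracy improves: the error shrinks roughly with the square root of the sample count, which is one reason this is a poor way to compute pi in practice.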
Note that this is a good demonstration of Monte Carlo methods, but it's a terrible way to
calculate pi. It tends to be both slower and less accurate than other methods.
Although not good for this task, Monte Carlo methods are used for designing heat shields, simulating pollution, ray tracing, financial option pricing, evaluating business or financial products, and many other things.
For a more in-depth discussion, you can refer to Wikipedia, which has a good introduction to Monte Carlo methods, at http://en.wikipedia.org/wiki/Monte_Carlo_method.
Chunking data for pmap
The timing table in this section makes it clear that partitioning helps. The partitioned version took roughly the same amount of time as the serial version, while the naïve parallel version took almost three times longer.
The speedup between the naïve and chunked parallel versions comes from each thread being able to spend longer on each task. Spreading work over multiple threads carries a performance penalty: context switching (that is, switching between threads) costs time, as does coordinating between the threads. However, we expect to make that time back (and more) by doing more things at once.

If each individual task doesn't take long enough, though, the benefit won't outweigh the costs. Chunking the input, which effectively creates larger individual tasks for each thread, gets around this by giving each thread more to do, so that less time is spent context switching and coordinating relative to the overall time spent running.
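The chunking idea can be sketched with partition-all, which groups the input into larger tasks before handing them to pmap. This is a hedged illustration, not the recipe's own code; the function name and the chunk size of 4 are arbitrary choices for the example.

```clojure
;; Chunk the input so each pmap task does more work, amortizing the
;; cost of thread coordination over many items.
(defn chunked-pmap [f coll chunk-size]
  (->> coll
       (partition-all chunk-size)          ; group inputs into larger tasks
       (pmap (fn [chunk] (mapv f chunk)))  ; each thread maps f over a whole chunk
       (apply concat)))                    ; flatten back into one sequence

;; Usage: inc is deliberately cheap per item, which is exactly the
;; case where a naive (pmap f coll) loses to chunking.
(chunked-pmap inc (range 10) 4)  ; => (1 2 3 4 5 6 7 8 9 10)
```

Picking the chunk size is a trade-off: chunks that are too small reintroduce the coordination overhead, while chunks that are too large leave some threads idle near the end of the input.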