NoSQL and functional programming - Making Sense of NoSQL - page 216

Figure 10.8 Functional programming means that you can't guarantee the order in which items will be transformed or what items will finish first. (Figure labels: start order 3, 1, 2; execution time 5 sec, 2 sec, 3 sec.)
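To make the point of the figure concrete, here is a minimal Python sketch. It assumes a hypothetical pure function `transform` and a helper `run_in_order`, neither of which comes from the book. It shows that when each result depends only on its own input item, the order in which a scheduler starts the items doesn't change the final output.

```python
import random


def transform(item: int) -> int:
    """Pure transform (hypothetical): the result depends only on the input item."""
    return item * item


def run_in_order(items, order):
    """Apply transform to items in the given index order, storing results by index."""
    results = [None] * len(items)
    for i in order:
        results[i] = transform(items[i])
    return results


items = [5, 2, 3]
natural = run_in_order(items, [0, 1, 2])

# Stand-in for a scheduler that starts the items in some arbitrary order.
shuffled_order = [0, 1, 2]
random.shuffle(shuffled_order)
shuffled = run_in_order(items, shuffled_order)

# The output is the same regardless of the order of execution.
assert natural == shuffled
```

Because results are stored by input position rather than by completion order, the caller never needs to know which item started or finished first.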

This variation in transformation times and the location of the input data adds an additional burden on the scheduling system. In distributed systems, input data is replicated on multiple nodes in the cluster. To be efficient, you want the longest-running jobs to start first. To maximize your resources, the tools that place tasks on different nodes in a cluster must be able to gather processing information from multiple data sources. Schedulers that can determine how long a transform will take to run on different nodes and how busy each node is will be most efficient. This information is generally not provided by imperative systems. Even mature systems like HDFS and MapReduce continue to refine their ability to efficiently transform large datasets.
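The idea of starting the longest-running jobs first can be sketched with a simple greedy heuristic. The Python below is only an illustration, not how Hadoop or any real scheduler works. The function name `schedule_longest_first`, the task names, and the node names are made up, and the sketch assumes that run-time estimates are already known, that every node runs a task equally fast, and that data location can be ignored.

```python
import heapq


def schedule_longest_first(estimates, node_load):
    """Greedy sketch: longest estimated task first, onto the least-busy node.

    estimates: dict of task name -> estimated run time in seconds (assumed known)
    node_load: dict of node name -> seconds of work already queued on that node
    Returns (assignment, makespan).
    """
    # Min-heap keyed on current load, so the least-busy node is popped first.
    heap = [(load, node) for node, load in node_load.items()]
    heapq.heapify(heap)

    assignment = {}
    longest_first = sorted(estimates.items(), key=lambda kv: kv[1], reverse=True)
    for task, seconds in longest_first:
        load, node = heapq.heappop(heap)
        assignment[task] = node
        heapq.heappush(heap, (load + seconds, node))

    # The makespan is the load of the busiest node once everything is placed.
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical tasks using the 5, 2, and 3 second times from figure 10.8.
estimates = {"a": 5, "b": 2, "c": 3}
assignment, makespan = schedule_longest_first(estimates, {"node1": 0, "node2": 0})
```

With these inputs the 5-second task gets a node to itself and the two shorter tasks share the other node, so both nodes are busy for about the same time. A real scheduler would also have to weigh which nodes hold a replica of the input data.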

10.1.3 Comparing imperative and functional programming at scale

Now let's compare the capability of imperative and functional systems to support processing large amounts of shared data being accessed by many concurrent CPUs. A comparison of imperative versus functional pipelines is shown in figure 10.9.

You can see that when you prevent writes during a transformation, you get the benefit of no side effects. This means that you can restart a failed transformation and be certain that if it didn't finish, the external state of a system wasn't already updated. With imperative systems, you can't make this guarantee. Any external changes may need to be undone if there's a failure during a transformation. Keeping track of which operations have been done can add complexity that will slow large systems down. The

Figure 10.9 Imperative programming (left panel) and functional programming (right panel) use different rules when transforming data. To gain the benefits of referential transparency, output of a transform must be completely determined by the inputs to the transform. No other memory should be read or written during the transformation process. Instead of a pipe with holes on the left, you can visualize your transformation pipes as having solid steel sides that don't transfer any information. (Figure labels: Imperative programming: data in, data leakage, side effects, data out. Functional programming: data in, solid steel sides, data out.)
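The restart guarantee described above can be illustrated with a small Python sketch. Everything in it is hypothetical: `pure_transform`, `impure_transform`, the `shared` dictionary, and the `fail_at` parameter exist only to simulate a failure partway through a transformation. The imperative version updates shared state as it goes, so a failed run leaves a partial update behind and a naive retry counts some records twice. The functional version reads only its input, so it can simply be run again.

```python
def pure_transform(records):
    """Functional style: reads only its input and returns a new list."""
    return [r * 2 for r in records]


def impure_transform(records, shared_totals, fail_at=None):
    """Imperative style: updates shared state while the transform runs."""
    out = []
    for i, r in enumerate(records):
        if i == fail_at:
            raise RuntimeError("simulated node failure")
        shared_totals["sum"] += r  # side effect visible outside the function
        out.append(r * 2)
    return out


records = [1, 2, 3]
shared = {"sum": 0}

# First attempt fails before the third record; two updates have already leaked out.
try:
    impure_transform(records, shared, fail_at=2)
except RuntimeError:
    pass
leaked_after_failure = shared["sum"]

# A naive retry repeats those updates, so the shared total is now wrong.
impure_transform(records, shared)
total_after_retry = shared["sum"]

# The pure version can be rerun any number of times with the same result.
first = pure_transform(records)
second = pure_transform(records)
```

To make the imperative version safe to retry, you would need to track which records had already been applied or undo the partial update, which is the bookkeeping cost the text describes.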
