NoSQL and functional programming - Making Sense of NoSQL - page 214

Using imperative serial processing (for loop in JavaScript):

    for (i = 0; i < 5; i++) {
        n = n + 1;
    }

Using functional parallel processing (for loop in XQuery):

    let $seq := ('a','b','c'…)
    for $item in $seq
    return
        my-transform($item)

Figure 10.5 Iterations, or for loops, in imperative languages calculate one iteration of a loop and allow the next iteration to use the results of the prior one. The left panel shows an example using JavaScript with a mutable variable that's incremented. With some functional programming languages, iterations can be distributed on independent threads of execution, because the result of one iteration can't be used by another. An example of an XQuery for loop is shown in the right panel.

elements serially, with each loop starting only after the prior loop completes, functional programming can process each loop simultaneously and distribute the processing on multiple threads. An example of this is shown in figure 10.5.

As you can see, the imperative loop can't be processed in parallel because the state of its variables must be fully calculated before the next iteration begins. Some functional programming languages such as XQuery treat each iteration as a separate and fully independent unit of work, which an implementation is free to run on its own thread. But in some situations, parallel execution isn't desirable, so there are now proposals to add a sequential option to XQuery functions.
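The contrast can be sketched in JavaScript, under some assumptions. `myTransform` is a hypothetical stand-in for the `my-transform` function in figure 10.5, and the uppercase conversion is only an illustration. The sketch doesn't start any threads; it only shows that independent iterations give the same result in any evaluation order, which is the property a parallel runtime relies on.

```javascript
// Imperative: each iteration reads the value the previous one wrote,
// so the iterations have to run one after another.
let n = 0;
for (let i = 0; i < 5; i++) {
  n = n + 1;
}

// Functional: each item is transformed on its own, with no shared state.
// myTransform is a hypothetical pure function (an assumption for this sketch).
const myTransform = (item) => item.toUpperCase();
const seq = ['a', 'b', 'c'];
const inOrder = seq.map(myTransform);

// Evaluate the same transforms in reverse order and store each result by
// position. Because no iteration depends on another, the output is identical.
const reversedOrder = new Array(seq.length);
for (let i = seq.length - 1; i >= 0; i--) {
  reversedOrder[i] = myTransform(seq[i]);
}
```

The counter loop has no such freedom: running its iterations out of order, or at the same time, would require coordinating access to `n`.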

To understand the difference between an imperative and a functional program, it helps to have a good mental model. The model of a pipe without any holes as shown in figure 10.6 is a good representation.

The shift of focus from updating mutable variables to only using immutable variables within independent transforms is the heart of the paradigm shift that underpins many NoSQL systems. This shift is required so that you can achieve reliable and high-performance horizontal scaling in multiprocessor data centers.

Figure 10.6 The functional programming paradigm relies on creating a distinct output for each data input in an isolated transformation process. You can think of this as a data transformation pipe. When the transformation of input to output is done without modification of external memory, it's called a zero-side-effect pipeline. This means you can rerun the transform many times from any point without worrying about the impact on external systems. Additionally, if you prevent reads from external memory during the transformation, you have the added benefit of knowing the same input must generate the exact same output. Then you can hash the input and check a cache to see if the transform has already been done.

[Figure 10.6 diagram labels: Inputs, Memory, Output]
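The caching idea in the figure 10.6 caption can be sketched as follows. This is a minimal illustration, not a method taken from the book: `transform`, `cachedTransform`, and the sample document are hypothetical names, and the sketch assumes Node.js, whose built-in `crypto` module provides `createHash`. It also assumes inputs serialize consistently with `JSON.stringify` (same property order each time).

```javascript
const crypto = require('crypto');

// Hypothetical pure transform: it reads only its argument and writes
// nothing outside itself, so the same input always yields the same output.
function transform(doc) {
  return {
    title: doc.title.trim().toLowerCase(),
    words: doc.body.split(/\s+/).length,
  };
}

const cache = new Map();
let transformRuns = 0; // counts how often the real transform executes

function cachedTransform(doc) {
  // Hash the serialized input and use the digest as the cache key.
  const key = crypto
    .createHash('sha256')
    .update(JSON.stringify(doc))
    .digest('hex');
  if (!cache.has(key)) {
    transformRuns += 1;
    cache.set(key, transform(doc));
  }
  return cache.get(key);
}

const doc = { title: '  NoSQL ', body: 'functional programming scales out' };
const first = cachedTransform(doc);
const second = cachedTransform(doc); // served from the cache
```

The cache lookup is only safe because `transform` neither reads nor writes external memory. If it consulted a clock or a database, a cached result could silently go stale.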
