Databases Reference
In-Depth Information
[Figure 10.5, left panel: Using imperative serial processing (for loop in JavaScript)]
for (i = 0; i < 5; i++) {
  n = n + 1;
}
[Figure 10.5, right panel: Using functional parallel processing (for loop in XQuery)]
let $seq := ('a','b','c'…)
for $item in $seq
return
  my-transform($item)
Figure 10.5 In imperative languages, each iteration of a for loop completes before the
next begins, and a later iteration can use the results of an earlier one. The left panel
shows a JavaScript example with a mutable variable that's incremented. In some functional
programming languages, iterations can be distributed across independent threads of
execution, so the result of one iteration can't be used by another. The right panel shows
an example of an XQuery for loop.
elements serially, with each loop starting only after the prior loop completes,
functional programming can process each loop simultaneously and distribute the
processing on multiple threads. An example of this is shown in figure 10.5.
As you can see, the imperative loop can't be processed in parallel because the
state of its variables must be fully calculated before the next iteration begins. Some
functional programming languages such as XQuery treat each iteration as separate and
fully independent, so each can run on its own thread. But in some situations, parallel
execution isn't desirable, so there are now proposals to add a sequential option to
XQuery functions.
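The contrast can be sketched in JavaScript. This is a minimal illustration, not code from the figure: myTransform is a hypothetical stand-in for my-transform, and the reverse-order loop only simulates the freedom a parallel runtime would have rather than starting real threads.

```javascript
// Imperative: each iteration reads and writes the shared variable n,
// so iteration k cannot start until iteration k-1 has finished.
let n = 0;
for (let i = 0; i < 5; i++) {
  n = n + 1;
}

// Functional: myTransform (hypothetical) depends only on its argument
// and changes nothing outside itself.
const myTransform = (item) => item.toUpperCase();
const seq = ['a', 'b', 'c'];

// Evaluate the items in their natural order.
const inOrder = seq.map(myTransform);

// Evaluate the items in reverse order, storing each result by position.
// The output is identical, which is what makes it safe to hand the items
// to separate workers.
const reversedOrder = new Array(seq.length);
for (let i = seq.length - 1; i >= 0; i--) {
  reversedOrder[i] = myTransform(seq[i]);
}
```

Because no call to myTransform reads a value produced by another call, the order of evaluation has no effect on the result. The counter loop has no such freedom, since every step depends on the previous value of n.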
To understand the difference between an imperative and a functional program, it
helps to have a good mental model. The model of a pipe without any holes as shown
in figure 10.6 is a good representation.
The shift of focus from updating mutable variables to only using immutable
variables within independent transforms is the heart of the paradigm shift that
underpins many NoSQL systems. This shift is required so that you can achieve reliable
and high-performance horizontal scaling in multiprocessor data centers.
Figure 10.6 The functional programming paradigm relies on creating a
distinct output for each data input in an isolated transformation process.
You can think of this as a data transformation pipe. When the
transformation of input to output is done without modification of external
memory, it's called a zero-side-effect pipeline. This means you can rerun
the transform many times from any point without worrying about the
impact on external systems. Additionally, if you prevent reads from
external memory during the transformation, you have the added benefit of
knowing the same input must generate the exact same output. Then you
can hash the input and check a cache to see if the transform has already
been done.
[Figure 10.6 diagram labels: Inputs, Memory, Output]
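The caching idea in the figure 10.6 caption can be sketched in JavaScript as follows. This is a rough illustration under stated assumptions: transform, hashInput, and cachedTransform are hypothetical names, SHA-256 from Node's built-in crypto module is just one possible hash, and an in-memory Map stands in for whatever cache a real system would use.

```javascript
const crypto = require('node:crypto');

// Hypothetical pure transform: it reads only its argument and writes
// nothing outside itself, so the same input always gives the same output.
function transform(input) {
  return input.trim().toUpperCase();
}

// The cache lives in the wrapper, outside the transform itself.
const cache = new Map();
let cacheMisses = 0; // counts how often the transform actually ran

// Hash the input so the cache key has a fixed size.
function hashInput(input) {
  return crypto.createHash('sha256').update(input, 'utf8').digest('hex');
}

function cachedTransform(input) {
  const key = hashInput(input);
  if (cache.has(key)) {
    return cache.get(key); // already transformed, so skip the work
  }
  cacheMisses += 1;
  const output = transform(input);
  cache.set(key, output);
  return output;
}

const first = cachedTransform('  nosql ');
const second = cachedTransform('  nosql '); // served from the cache
```

Reusing the cached result is only safe because transform neither reads nor writes external memory. If it consulted a clock or a database, two calls with the same input could legitimately differ and the cached value could be wrong.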