NoSQL and functional programming - Making Sense of NoSQL

no-side-effect guarantee is critical in your ability to create reproducible transforms

that are easy to debug and easy to optimize. Not allowing external side-effect writes

during a transform keeps transforms running fast.

The second scalability benefit happens when you prohibit external reads during a

transform. This rule allows you to know with certainty that the outputs are completely

driven by the inputs of a transformation. If the time to create a hash of the input is

small relative to the time it takes to run the transform, you can check a cache to see if

the transform has already been run.

One of the central ideas of lambda calculus is that the result of a transform of any data can be used in place of the transform itself. This ability to substitute cached results, instead of having to rerun a long-running transform, is one way that functional programming systems can be more efficient than imperative systems.

A transform that can be rerun many times without further altering data is called an idempotent transform or an idempotent transaction. Idempotent transforms change the state of the world in consistent ways the first time they're run, but rerunning them many times won't corrupt your data. For example, if you have a filter that inserts missing required elements into an XML file, that filter should check that the elements don't already exist before adding them.
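
As a rough illustration of the XML filter example, the sketch below uses Python's standard `xml.etree.ElementTree` module. The element names in `REQUIRED` and the function name `add_missing_elements` are assumptions made for this example, not part of any particular schema.

```python
import xml.etree.ElementTree as ET

REQUIRED = ["title", "author"]  # hypothetical required element names


def add_missing_elements(xml_text: str) -> str:
    """Insert any required child elements that are not already present."""
    root = ET.fromstring(xml_text)
    for name in REQUIRED:
        # Check first, so a second run finds the element and adds nothing.
        if root.find(name) is None:
            ET.SubElement(root, name)
    return ET.tostring(root, encoding="unicode")


once = add_missing_elements("<book><title>NoSQL</title></book>")
twice = add_missing_elements(once)  # rerunning leaves the document unchanged
```

Without the `root.find(name) is None` check, every rerun would append another empty `author` element, which is the kind of data corruption an idempotent transform avoids.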

Idempotent transforms can also be used in transaction processing. Since rerunning an idempotent transform doesn't change external state any further, there's no need to create an undo process for a repeated run. Additionally, you can use transaction identifiers to guarantee idempotent transforms. If you're running a transaction that increments a bank account, you might record a transaction ID in the bank account transaction history. You can then create a rule that only runs the update if that transaction ID hasn't been run already. This guarantees that a transaction won't be applied more than once.
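
The transaction ID rule can be sketched as follows. The `Account` class, the `deposit` function, and the ID format are hypothetical; the sketch keeps everything in memory, whereas a real database would need the check and the update to happen atomically.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    # Transaction history: IDs of deposits that have already been applied.
    applied_ids: set = field(default_factory=set)


def deposit(account: Account, txn_id: str, amount: int) -> bool:
    """Apply the deposit once; return False if txn_id was already applied."""
    if txn_id in account.applied_ids:
        return False
    account.balance += amount
    account.applied_ids.add(txn_id)
    return True


acct = Account()
deposit(acct, "txn-001", 50)
deposit(acct, "txn-001", 50)  # a retry of the same transaction is ignored
```

After both calls the balance reflects a single deposit, so a client that is unsure whether its first request succeeded can simply send it again.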

Idempotent transactions allow you to use referential transparency. An expression is said to be referentially transparent if it can be replaced with its value without changing the behavior of the program. Any functional programming statement can have this property if the function call to the transform can be replaced with the output of the transform. Referential transparency allows both the programmer and the compiler to look for ways to optimize repeated calls to the same set of functions on the same set of data. But this optimization technique is only safe when your functions are free of side effects, which is what a functional programming paradigm enforces.
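
To make the definition concrete, the sketch below contrasts a function whose calls can be replaced by their value with one whose calls cannot. Both `area` and `next_id` are hypothetical examples, not taken from any library.

```python
def area(width: int, height: int) -> int:
    # Pure: the result depends only on the arguments.
    return width * height


_counter = 0


def next_id() -> int:
    # Not pure: each call reads and changes hidden state.
    global _counter
    _counter += 1
    return _counter


# area(3, 4) can be replaced by its value, 12, without changing the result.
with_calls = area(3, 4) + area(3, 4)
with_value = 12 + 12

# next_id() cannot: calling it twice differs from reusing one call's value.
two_calls = next_id() + next_id()  # 1 + 2
value = next_id()                  # 3
reused = value + value             # 3 + 3
```

A compiler or a programmer may safely collapse the two `area(3, 4)` calls into one, but applying the same rewrite to `next_id()` would change what the program computes.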

In the next section, we'll take a detailed look at how referential transparency

allows you to cache results from functional programs.

10.1.4 Using referential transparency to avoid recalculating transforms

Now that you know how functional programs promote idempotent transactions, let's look at how these results can be used to speed up your system. You can use these techniques in many systems, from web applications to NoSQL databases to the results of MapReduce transforms.
