no-side-effect guarantee is critical in your ability to create reproducible transforms
that are easy to debug and easy to optimize. Not allowing external side-effect writes
during a transform keeps transforms running fast.
The second scalability benefit happens when you prohibit external reads during a
transform. This rule allows you to know with certainty that the outputs are completely
driven by the inputs of a transformation. If the time to create a hash of the input is
small relative to the time it takes to run the transform, you can check a cache to see if
the transform has already been run.
One of the central ideas of lambda calculus is that the result of applying a transform
to data can be used in place of the call to the transform itself. This ability to
substitute cached results, instead of having to rerun a long-running transform, is
one way that functional programming systems can be more efficient than imperative
systems.
A transform that can be rerun many times without further altering data is called an
idempotent transform or an idempotent transaction. Idempotent transforms change the
state of the world in a consistent way the first time they're run, and rerunning them
leaves that state unchanged, so repeated runs won't corrupt your data. For example, if
you have a filter that will insert missing required elements into an XML file, that filter
should check to make sure the elements don't already exist before adding them.
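The XML filter from the example above might look like the following sketch, using Python's standard `xml.etree.ElementTree` module. The required element names in `REQUIRED` and the function name `add_missing_elements` are assumptions made for illustration; the source doesn't specify a schema.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("title", "author")  # hypothetical required element names

def add_missing_elements(xml_text):
    """Insert any missing required child elements; safe to run repeatedly."""
    root = ET.fromstring(xml_text)
    for name in REQUIRED:
        # Only add the element if it isn't already present.
        if root.find(name) is None:
            ET.SubElement(root, name)
    return ET.tostring(root, encoding="unicode")

once = add_missing_elements("<book><title>NoSQL</title></book>")
twice = add_missing_elements(once)
```

Because the filter checks before inserting, `once` and `twice` are identical: the second run finds nothing to add.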
Idempotent transforms can also be used in transaction processing. Since rerunning an
idempotent transform doesn't change external state any further, there's no need to
create an undo process for duplicate runs. Additionally, you can use transaction
identifiers to guarantee idempotent transforms. If you're running a transaction that
increments a bank account balance, you might record a transaction ID in the bank
account transaction history. You can then create a rule that only runs the update if
that transaction ID hasn't been recorded already. This guarantees that a transaction
won't be applied more than once.
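A minimal sketch of the transaction ID rule, assuming a hypothetical in-memory `Account` class whose `applied_ids` set stands in for the transaction history. In a real database the check and the update would need to happen atomically, which this sketch doesn't attempt.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    balance: int = 0
    # Stand-in for the account's transaction history (assumption for this sketch).
    applied_ids: set = field(default_factory=set)

def deposit(account, transaction_id, amount):
    """Apply a deposit at most once per transaction ID."""
    if transaction_id in account.applied_ids:
        return False  # already applied; a retry changes nothing
    account.balance += amount
    account.applied_ids.add(transaction_id)
    return True

account = Account()
deposit(account, "tx-1001", 100)
deposit(account, "tx-1001", 100)  # duplicate delivery of the same transaction
```

After both calls the balance is 100, not 200, because the second call sees that `tx-1001` has already been recorded.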
Idempotent transactions allow you to use referential transparency. An expression is
said to be referentially transparent if it can be replaced with its value without
changing the behavior of the program. Any functional programming statement has
this property if the function call to the transform can be replaced with the output of
the transform. Referential transparency allows both the programmer and the
compiler to look for ways to optimize repeated calls to the same set of functions
on the same set of data. But this optimization is only safe when functions have no
side effects, which is what a functional programming paradigm provides.
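As a small illustration of replacing a call with its value, the sketch below uses `functools.lru_cache` from Python's standard library. The function `word_count` is a hypothetical example of a pure function; because its result depends only on its argument, the second call can be answered from the stored value without changing the program's behavior.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def word_count(text):
    # Pure function: the result depends only on the argument.
    return len(text.split())

# The second call with the same argument is served from the cache.
total = word_count("to be or not to be") + word_count("to be or not to be")
info = word_count.cache_info()  # reports hits and misses for the calls so far
```

Applying the same decorator to a function that reads a clock, a file, or a database would silently return stale values, which is why the optimization depends on the function being free of side effects and external reads.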
In the next section, we'll take a detailed look at how referential transparency
allows you to cache results from functional programs.
10.1.4 Using referential transparency to avoid recalculating transforms
Now that you know how functional programs promote idempotent transactions, let's
look at how these results can be used to speed up your system. You can use these
techniques in many systems, from web applications to NoSQL databases to the results of
MapReduce transforms.