There are a few important theoretical aspects embodied by these data workflow
abstractions based on Cascading. Those elements of theory can best be explained as layers
in the process of structuring data:
• Pattern language
• Literate programming
• Separation of concerns
• Functional relational paradigm
Pattern Language
Formally speaking, Cascading represents a pattern language. The notion of a pattern
language is that the syntax of the language constrains what can be expressed to help
ensure best practices. Stated in another way, a pattern language conveys expertise. For
example, consider how a child builds a tower out of Lego blocks. The blocks snap
together in predictable ways, allowing for complex structures that are reasonably sturdy.
When the blocks are not snapped together properly, those structures tend to fall over.
Lego blocks therefore provide a way of conveying expertise about building toy
structures.
Use of pattern language came from architecture, based on work by Christopher
Alexander on the “Oregon Experiment.” Kent Beck and Ward Cunningham
subsequently used it to describe software design patterns, popularized by the “Gang of
Four”—Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides—for object-
oriented programming. Abstract Factory, Model-View-Controller (MVC), and Facade
are examples of well-known software design patterns.
Cascading uses pattern language to ensure best practices specifically for robust, parallel
data workflows at scale. We see the pattern syntax enforced in several ways. For example,
flows must have at least one source and at least one tail sink defined. As another
example, aggregator functions such as Count must be used in an Every; in other words,
that work gets performed in a reduce task.
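The two constraints above can be sketched with a toy model. This is only an illustration of how syntax can rule out invalid workflows. The names PatternSketch, FlowSketch, and the simplified Each, Every, and Count types are hypothetical stand-ins written for this example, not Cascading's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in, NOT Cascading's API.
final class PatternSketch {

    /** Marker for operations that work on one tuple at a time. */
    interface Function { }

    /** Marker for operations that work on a whole group of tuples. */
    interface Aggregator { }

    /** A counting aggregator; it is deliberately not a Function. */
    static final class Count implements Aggregator { }

    /** Applies an aggregator to every group; only an Aggregator type-checks here. */
    static final class Every {
        final Aggregator aggregator;
        Every(Aggregator aggregator) { this.aggregator = aggregator; }
    }

    /** Applies a function to each tuple; passing a Count here would not compile. */
    static final class Each {
        final Function function;
        Each(Function function) { this.function = function; }
    }

    /** A flow definition that refuses to connect without a source and a sink. */
    static final class FlowSketch {
        private final List<String> sources = new ArrayList<>();
        private final List<String> sinks = new ArrayList<>();

        FlowSketch addSource(String name) { sources.add(name); return this; }
        FlowSketch addTailSink(String name) { sinks.add(name); return this; }

        /** Returns a description of the flow, or throws if it is incomplete. */
        String connect() {
            if (sources.isEmpty() || sinks.isEmpty()) {
                throw new IllegalStateException(
                    "flow needs at least one source and one tail sink");
            }
            return sources + " -> " + sinks;
        }
    }

    public static void main(String[] args) {
        Every counting = new Every(new Count());  // compiles: Count is an Aggregator
        // new Each(new Count());                 // would not compile: Count is not a Function
        System.out.println(
            new FlowSketch().addSource("docs").addTailSink("counts").connect());
    }
}
```

In this sketch the aggregator rule is checked by the compiler, while the source and sink rule is checked when the flow is connected. Either way, a malformed workflow is rejected before any data is processed.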
Another benefit of pattern language in Cascading is that it promotes code reuse. More
precisely, it reduces the need for writing custom operations because much of the needed
business process can be defined by combining existing components. In a larger context,
this is related to the use of patterns in enterprise application integration (EAI).
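As a rough illustration of reuse by composition, the sketch below builds a word count out of two small, independently reusable pieces instead of one custom operation. It uses plain Java functions rather than Cascading pipes, and the names ReuseSketch, TOKENIZE, COUNT, and WORD_COUNT are assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example: plain Java functions standing in for reusable components.
final class ReuseSketch {

    /** Reusable piece 1: split a line into lowercase tokens on whitespace. */
    static final Function<String, List<String>> TOKENIZE =
        line -> Arrays.asList(line.trim().toLowerCase().split("\\s+"));

    /** Reusable piece 2: count occurrences of each token, sorted by token. */
    static final Function<List<String>, Map<String, Long>> COUNT =
        tokens -> tokens.stream().collect(
            Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));

    /** The "business process" is just the two existing pieces snapped together. */
    static final Function<String, Map<String, Long>> WORD_COUNT = TOKENIZE.andThen(COUNT);

    public static void main(String[] args) {
        System.out.println(WORD_COUNT.apply("A rose is a rose"));
    }
}
```

Neither piece knows about the other, so each can be reused in a different workflow without modification.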