to make a statement about a query and its expected results. These tests duplicate the
unit tests that were used in “Example 6: TF-IDF with Testing” on page 41. Midje also
has features for stubs and mocks. Ritchie explains how Midje in Cascalog represents a
game-changer for testing MapReduce apps:
Without proper tests, Hadoop developers can't help but be scared of making changes to
production code. When creativity might bring down a workflow, it's easiest to get it
working once and leave it alone. [1]
This approach is not just better than the state of the art of MapReduce testing, as
defined by Cloudera; it completely obliterates the old way of thinking, and makes it
possible to build very complex workflows with a minimum of uncertainty. [2]
— Sam Ritchie
Incorporating TDD, assertions, traps, and checkpoints into the Cascalog workflow
macro was sheer brilliance for Enterprise data workflows done right. Moreover,
fact-based tests separate a Cascalog app's logic from concerns about how its data is
stored, which reduces the complexity of the required testing.
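The idea of stating a fact about a query and its expected results, independent of
where the data is stored, can be sketched in plain Clojure. This is only an
illustration under stated assumptions: it uses clojure.test rather than Midje or
Cascalog, and the namespace sketch.tf-test, the function term-counts, and the sample
tuples are hypothetical names and data that do not come from the example app.

```clojure
(ns sketch.tf-test
  (:require [clojure.test :refer [deftest is run-tests]]
            [clojure.string :as str]))

;; Hypothetical stand-in for a query: it takes in-memory [doc-id text]
;; tuples and returns a set of [doc-id token count] tuples. Nothing here
;; reads from HDFS or any other storage layer.
(defn term-counts
  [docs]
  (set (for [[doc-id text] docs
             [token n] (frequencies (str/split (str/lower-case text) #"\s+"))]
         [doc-id token n])))

;; The "fact": for these input tuples, the query yields exactly these results.
(deftest term-counts-fact
  (let [docs [["doc01" "rain shadow rain"]]]
    (is (= #{["doc01" "rain" 2] ["doc01" "shadow" 1]}
           (term-counts docs)))))

(run-tests 'sketch.tf-test)
```

Because the test feeds literal tuples to the function, the same logic could later be
pointed at a different source or sink without rewriting the assertion.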
To run the tests for “Example 6 in Cascalog: TF-IDF with Testing”:
$ lein test
Retrieving org/clojure/clojure/maven-metadata.xml (2k)
    from http://repo1.maven.org/maven2/
Retrieving org/clojure/clojure/maven-metadata.xml (1k)
    from https://clojars.org/repo/
Retrieving org/clojure/clojure/maven-metadata.xml (2k)
    from http://repo1.maven.org/maven2/
Retrieving org/clojure/clojure/maven-metadata.xml
    from http://oss.sonatype.org/content/repositories/snapshots/
Retrieving org/clojure/clojure/maven-metadata.xml
    from http://oss.sonatype.org/content/repositories/releases/

lein test impatient.core-test

Ran 2 tests containing 2 assertions.
0 failures, 0 errors.
Again, a gist on GitHub shows a log of this run.
Cascalog Technology and Uses
A common critique from programmers who aren't familiar with Clojure is that they
would need to learn Lisp. Actually, the real learning curve for Cascalog is more often