Jonathan Reichental, CIO for the City of Palo Alto; and Diego May, CEO of Junar, the
company that provided the data infrastructure for this initiative and many others.
Thinking about Palo Alto and its Open Data initiative, a few ideas come to mind. The
city is generally quite a pleasant place: the weather is temperate, there are lots of parks
with enormous trees, most of downtown is quite walkable, and it's not particularly
crowded. On a summer day in Palo Alto, one of the last things anybody really wants is
to be stuck in an office on a long phone call. Instead people walk outside and take their
calls, probably heading toward a favorite espresso bar or a popular frozen yogurt shop.
On a hot summer day in Palo Alto, knowing a nice route to walk in the shade would be
great. There must be a smartphone app for that—but as of late 2012, there wasn't!
In this chapter, we'll build an example Cascading workflow for that smartphone app as
a case study. The sample app, shown in both Java and Clojure, powers a mobile data
API.
Imagine a mobile app that leverages the city's municipal data to personalize
recommendations: “Find a shady spot on a summer day in which to walk near downtown Palo
Alto. While on a long conference call. Sippin' a latte or enjoying some fro-yo.” This app
shows the process of structuring data as a workflow: we start from raw sources and
refine the data step by step until we obtain the data products for that recommender. The
results are personalized based on the neighborhoods where a person tends to walk.
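To make the idea of moving from raw sources to a data product concrete, here is a minimal sketch in plain Java. It does not use the Cascading API and is not the code from the CoPA repo. The `Tree` and `SegmentShade` types, the `clean` and `shadeBySegment` steps, and the sample values in `main` are all hypothetical and made up for illustration. It assumes Java 17 or later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShadeSketch {
    // Hypothetical raw record: one tree and the road segment it stands beside.
    public record Tree(String segment, double canopyMeters) {}

    // Hypothetical data product: total canopy along one road segment.
    public record SegmentShade(String segment, double totalCanopy) {}

    // Step 1: drop raw records with a missing segment or a non-positive canopy.
    public static List<Tree> clean(List<Tree> raw) {
        List<Tree> out = new ArrayList<>();
        for (Tree t : raw) {
            if (t.segment() != null && !t.segment().isBlank() && t.canopyMeters() > 0) {
                out.add(t);
            }
        }
        return out;
    }

    // Step 2: sum canopy per segment, then rank segments from most to least shade.
    public static List<SegmentShade> shadeBySegment(List<Tree> trees) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (Tree t : trees) {
            totals.merge(t.segment(), t.canopyMeters(), Double::sum);
        }
        List<SegmentShade> out = new ArrayList<>();
        totals.forEach((segment, total) -> out.add(new SegmentShade(segment, total)));
        out.sort(Comparator.comparingDouble(SegmentShade::totalCanopy).reversed());
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not real Palo Alto data.
        List<Tree> raw = List.of(
            new Tree("Segment A", 5.0),
            new Tree("Segment B", 8.0),
            new Tree("Segment A", 4.0),
            new Tree("", 9.0),           // unusable: no segment
            new Tree("Segment B", -1.0)  // unusable: bad measurement
        );
        for (SegmentShade s : shadeBySegment(clean(raw))) {
            System.out.println(s.segment() + " " + s.totalCanopy());
        }
    }
}
```

Each step takes the output of the previous one, which is the same shape a workflow has: raw rows are cleaned, then aggregated, and the ranked result is what a recommender would consume. Personalizing by neighborhood would be one more filtering step on the same chain.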
To download the source code, first change to a directory on your computer where you have
a few gigabytes of available disk space, and then use Git to clone the source code repo:
$ git clone git://github.com/Cascading/CoPA.git
Once that download completes, change into the newly cloned directory. Source code
is shown in both Cascading (Java) and Cascalog (Clojure). We'll work through the
Cascalog example, and its source is located in the src/main/clj/copa/core.clj file.
Moving from Raw Sources to Data Products
The City of Palo Alto has its Open Data portal available online. It publishes a wide range
of different data sets: budget history, census data, geographic information systems (GIS)
as shown in Figure 8-1, building permits, utility consumption rates, street sweeping
schedules, creek levels, etc.