Global Positioning System Reference
be exposed to different kinds of users that could directly include spatial
data in the SDW.
Implementing ETL Processes
Several options exist for developing the spatial ETL processes that feed
an SDW, both as commercial tools (e.g., Data Interoperability from ESRI,
Feature Manipulation Engine (FME) from Safe Software, GeoMedia Fusion from
Intergraph) and as free/open-source tools (e.g., GeoKettle from Spatialytics,
Spatial Data Integrator from Talend). Since our requirement was to use free
software and we were already familiar with GeoKettle, we chose this tool for
our project.
GeoKettle (Spatialytics 2013a) is distributed under the LGPL and is based
on Pentaho Data Integration (PDI), also known as Kettle (Pentaho
2013a), which it extends with spatial features. It allows the integration of
different conventional and spatial data sources and includes connections to
different DBMSs (e.g., Oracle Spatial, PostgreSQL), GIS files (e.g., ESRI
shape files), and geospatial web services (Badard and Dubé 2009). GeoKettle
provides a graphical user interface for specifying jobs and
transformations. The jobs control the flow, i.e., the order in which the
transformations are executed. The transformations are
defined as sets of operations that are applied to the extracted
data.
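The relationship between jobs and transformations can be sketched in a few lines of plain Python. This is only a conceptual illustration, not GeoKettle's or PDI's API: the class names `Transformation` and `Job`, the two step functions, and the sample rows are all hypothetical, and passing rows in memory from one transformation to the next is a simplification.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Step = Callable[[Iterable[Row]], Iterable[Row]]


class Transformation:
    """A named, ordered set of operations applied to extracted rows."""

    def __init__(self, name: str, steps: List[Step]) -> None:
        self.name = name
        self.steps = steps

    def run(self, rows: Iterable[Row]) -> List[Row]:
        for step in self.steps:
            rows = step(rows)
        return list(rows)


class Job:
    """Controls the order in which transformations are executed."""

    def __init__(self, transformations: List[Transformation]) -> None:
        self.transformations = transformations

    def run(self, rows: Iterable[Row]) -> List[Row]:
        for transformation in self.transformations:
            rows = transformation.run(rows)
        return list(rows)


# Hypothetical steps, for illustration only.
def drop_empty_names(rows: Iterable[Row]) -> Iterable[Row]:
    return (r for r in rows if r.get("name"))


def uppercase_names(rows: Iterable[Row]) -> Iterable[Row]:
    return ({**r, "name": str(r["name"]).upper()} for r in rows)


job = Job([
    Transformation("clean", [drop_empty_names]),
    Transformation("normalize", [uppercase_names]),
])
result = job.run([{"name": "north"}, {"name": ""}, {"name": "south"}])
print(result)
```

In this sketch the job only decides the sequence, while each transformation owns the operations that touch the data, which mirrors the division of responsibilities described above.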
Figure 6 shows an example of the ETL process applied for cleaning a
District shape file and loading the geometry into our SDW. First, after
loading the shape file, it establishes the required spatial reference system
(SRS); then, it verifies the district codes, eliminating invalid codes and
unnecessary attributes (the components called Verify district code and
Select required data in Fig. 6); and finally, it groups districts with the
same name and performs a spatial union of all geometries that make up a
district (the component called Group districts in Fig. 6). Some invalid
geometries are corrected and the corresponding county foreign key is
introduced. The transformed data is then sent to the corresponding table in
our SDW.
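The main steps of the district cleaning process described above can be approximated in plain Python. This is a minimal sketch under stated assumptions, not the GeoKettle transformation shown in Fig. 6: each geometry is replaced by a set of grid cells so that the spatial union reduces to a set union, and the SRS identifier, the list of valid codes, the attribute names, and the sample features are all hypothetical. The repair of invalid geometries and the county foreign key are omitted.

```python
from collections import defaultdict

TARGET_SRID = 4326            # assumed SRS identifier, for illustration only
VALID_CODES = {"D01", "D02"}  # hypothetical reference list of district codes


def clean_districts(features):
    """Verify codes, keep required attributes, and union parts by name."""
    grouped = defaultdict(set)
    for feature in features:
        # "Verify district code": skip features with an invalid code.
        if feature["code"] not in VALID_CODES:
            continue
        # "Group districts": a set union stands in for the spatial union.
        grouped[feature["name"].strip()] |= feature["cells"]
    # "Select required data": only name, SRS and geometry are kept.
    return [
        {"name": name, "srid": TARGET_SRID, "cells": grouped[name]}
        for name in sorted(grouped)
    ]


# Hypothetical input: two parts of one district, one invalid code, one other.
features = [
    {"code": "D01", "name": "Centro", "population": 10,
     "cells": {(0, 0), (0, 1)}},
    {"code": "D01", "name": "Centro", "population": 5,
     "cells": {(0, 1), (1, 1)}},
    {"code": "XX", "name": "Unknown", "population": 1, "cells": {(9, 9)}},
    {"code": "D02", "name": "Norte", "population": 7, "cells": {(5, 5)}},
]
cleaned = clean_districts(features)
for district in cleaned:
    print(district["name"], sorted(district["cells"]))
```

With real polygon geometries, the set union would be replaced by a geometric union from a spatial library or from the spatial functions of the target DBMS.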
Several ETL processes were developed for cleaning and transforming shape
files and for integrating conventional data, as explained in the description
of our case study. A graphical interface in which much of the
functionality is already implemented facilitates the development of ETL
processes.