be easily understood by the stakeholders. The conceptual model for ETL
processes is based on the BPMN standard, relying on the assumption that
ETL processes are similar to business processes. We illustrated the design
and implementation of ETL processes with a complete example based on
the Northwind case study. Thus, the reader can have a clear idea of the
usual tasks that must be performed while implementing such processes. We
provided three versions of the Northwind ETL process: a conceptual one
using BPMN and two implementations, one in Microsoft Integration Services and
one in Pentaho Kettle. We described the differences between the three versions of
this ETL process, taking into account implementation considerations in the
two platforms chosen.
8.8 Bibliographic Notes
A classic reference for ETL is the book by Kimball and Caserta [ 102 ]. Various
approaches for designing, optimizing, and automating ETL processes have
been proposed in the last few years. A survey of ETL technology can be
found in [ 219 ]. Simitsis et al. [ 221 ] represent ETL processes as a graph where
nodes represent transformations, constraints, attributes, and data stores and
edges correspond to data flows, inter-attribute relations, compositions, and
concurrent candidates. An approach for mapping conceptual ETL design
to logical ETL design was proposed in [ 188 ]. The books [ 105 ] and [ 26 ]
describe in detail, respectively, Microsoft Integration Services and Pentaho
Data Integration or Kettle. An introduction to business process modeling,
and an overview of BPMN 2.0 are provided in [ 211 ]. This chapter is based
on previous work on using BPMN as a conceptual model for ETL processes,
performed by the authors and collaborators [ 45 - 48 ].
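The graph view of ETL processes mentioned above can be made concrete with a small sketch. It is only an illustration of the idea of typed nodes and typed edges, not the model defined in [ 221 ]. The class EtlGraph, its methods, and the node names (Customers, Lookup country, DimCustomer) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    # Node categories named in the text.
    TRANSFORMATION = "transformation"
    CONSTRAINT = "constraint"
    ATTRIBUTE = "attribute"
    DATA_STORE = "data store"


class EdgeKind(Enum):
    # Edge categories named in the text.
    DATA_FLOW = "data flow"
    INTER_ATTRIBUTE = "inter-attribute relation"
    COMPOSITION = "composition"
    CONCURRENT_CANDIDATE = "concurrent candidate"


@dataclass
class EtlGraph:
    """Hypothetical container: a directed graph with typed nodes and edges."""
    nodes: dict = field(default_factory=dict)  # node name -> NodeKind
    edges: list = field(default_factory=list)  # (source, target, EdgeKind)

    def add_node(self, name, kind):
        self.nodes[name] = kind

    def add_edge(self, source, target, kind):
        # Reject edges whose endpoints have not been declared as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, target, kind))

    def successors(self, name, kind):
        # Targets reachable from `name` through edges of the given kind.
        return [t for s, t, k in self.edges if s == name and k == kind]


# Assumed example: a source table feeds a lookup that loads a dimension.
graph = EtlGraph()
graph.add_node("Customers", NodeKind.DATA_STORE)
graph.add_node("Customers.Country", NodeKind.ATTRIBUTE)
graph.add_node("Lookup country", NodeKind.TRANSFORMATION)
graph.add_node("DimCustomer", NodeKind.DATA_STORE)
graph.add_edge("Customers", "Customers.Country", EdgeKind.COMPOSITION)
graph.add_edge("Customers", "Lookup country", EdgeKind.DATA_FLOW)
graph.add_edge("Lookup country", "DimCustomer", EdgeKind.DATA_FLOW)

print(graph.successors("Customers", EdgeKind.DATA_FLOW))
```

Keeping the node and edge kinds as explicit enumerations makes it easy to query one aspect of the process at a time, for example following only the data flow edges to trace how rows move from a source to a target.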
Although in this chapter we focused on Integration Services and Kettle,
other tools are available for designing and executing ETL processes, like
Oracle Data Integrator [ 80 ] or Talend Open Studio [ 19 ]. However, all
existing tools provide their own language for specifying ETL processes. Their
languages differ considerably in many respects, in particular since they are
based on different paradigms and have different expressive power.
8.9 Review Questions
8.1 What is a business process? Why do we need to model business
processes?
8.2 Describe and classify the main constructs of BPMN.
8.3 What is the difference between an exclusive and an inclusive gateway?