Database Reference
In-Depth Information
for the Northwind ETL process (we will explain in detail this process later
in this chapter). The figure shows some of the subprocesses that load the
Product
dimension table, the
Time
dimension table, and the
Sales
fact table
(represented as compound activities with subprocesses); it also assumes their
distribution between
Server 1
and
Server 2
. Each one of these servers is
considered as a lane contained inside the pool of data warehouse servers.
We can also see that a swimlane denoted
Currency Server
contains a web
service that receives an input currency (like US dollars), an amount, and an
output currency (like euros) and returns the amount equivalent in the output
currency. This could be used in the loading of the
Sales
fact table. Thus, flow
messages are exchanged between the
Sales Load
activity and the
Exchange
Rate
task which is performed by the web service. These messages go across
both swimlanes.
Data tasks
represent activities typically carried out to manipulate data,
such as input data, output data, and data transformation. Since such
data manipulation operations occur
within
an activity, data tasks can be
considered as being at a lower abstraction level than control tasks. Recall that
arrows in a data task represent not only a precedence relationship between
its activities but also the flow of data records between them.
Data tasks can be classified into row and rowset operations.
Row
operations
apply transformations to the data on a row-by-row basis. In
contrast,
rowset operations
deal with a set of rows. For example, updating
the value of a column is a row operation, while aggregation is a rowset
operation. Data tasks can also be classified (orthogonally to the previous
classification) into
unary
or
n
-ary data tasks
, depending of the number of
input flows.
a
b
c
d
Convert
Column
Input Data
Insert Data
Add Column
Column:
SalesAmount =
D.UnitPrice * (1-Discount) *
Quanty
Columns:
Date: Date
DayNbWeek: Smallint
Database:
NorthwindDW
Table:
Time
Mappings:
TimeKey->OrderDateKey
Opons:
Append
File:
Time.xls
Type:
Excel
Fig. 8.11
Unary row operations. (
a
) Input data. (
b
) Insert data. (
c
)Addcolumn.
(
d
)Convertcolumn
Figure
8.11
shows examples of unary row operations:
Input Data
,
Insert
Data
,
Add Column
,and
Convert Column
. Note the annotations linked to the
tasks by means of association flows. The annotations contain metadata that
specify the parameters of the task. For example, in Fig.
8.11
a, the annotation
tells that the data is read from an Excel file called
Time.xls
. Similarly, the
annotation in Fig.
8.11
b tells that the task inserts tuples in the table
Time
Search WWH ::
Custom Search