Database Reference
In-Depth Information
for values
in an SQL query
and
only
sends to the output stream the rows
for which a matching value is found. A dummy task is needed in Kettle for
performing the union between the step that copies the column
Region
into
a new column
State
and the step that filters the rows for which a value for
State
was found in the first lookup. For the same reasons explained above,
the transformation in Fig.
8.41
differs from the corresponding one in Fig.
8.33
in that the dummy step sends the input rows to all subsequent lookup tasks.
The SQL query used in the lookup step that looks for
CityKey
with
StateName
and
CountryName
is as follows:
SELECT C.CityKey
FROM City C JOIN State S ON C.StateKey = S.Statekey
JOIN Country T ON S.CountryKey = T.CountryKey
WHERE ? = CityName AND ? = StateName AND ? = CountryName
The '
?
' symbols are used as parameters that are replaced in the flow with
the values of
City
,
State
,and
Country
that are read from the
Customers
table.
The SQL queries for the other lookups are similar.
The implementation for the
Supplier
dimension table is similar so we omit
its description.
Fig. 8.42
Load of the
Territories
bridge table
Figure
8.42
shows the transformation that loads the
Territories
bridge table.
The corresponding data flow for Integration Services is shown in Fig.
8.34
.
The flow starts by obtaining the assignment of employees to territories from
the Northwind database using an SQL query. In the case of Kettle, there is
no step that removes the trailing spaces in the
TerritoryDescription
column.
This was taken into account in the SQL query of the subsequent lookup step
as follows:
SELECT CityKey
FROM City
WHERE TRIM(?) = CityName
Note that a similar solution could have been applied in Integration Services.
After the lookup of the
CityKey
, Kettle requires a sort process prior to
the unique rows step. Compared with the corresponding implementation in
Search WWH ::
Custom Search