Database Reference
In-Depth Information
for values in an SQL query and only sends to the output stream the rows
for which a matching value is found. A dummy task is needed in Kettle for
performing the union between the step that copies the column Region into
a new column State and the step that filters the rows for which a value for
State was found in the first lookup. For the same reasons explained above,
the transformation in Fig. 8.41 differs from the corresponding one in Fig. 8.33
in that the dummy step sends the input rows to all subsequent lookup tasks.
The SQL query used in the lookup step that looks for CityKey with StateName
and CountryName is as follows:
SELECT C.CityKey
FROM City C JOIN State S ON C.StateKey = S.Statekey
JOIN Country T ON S.CountryKey = T.CountryKey
WHERE ? = CityName AND ? = StateName AND ? = CountryName
The ' ? ' symbols are used as parameters that are replaced in the flow with
the values of City , State ,and Country that are read from the Customers table.
The SQL queries for the other lookups are similar.
The implementation for the Supplier dimension table is similar so we omit
its description.
Fig. 8.42 Load of the Territories bridge table
Figure 8.42 shows the transformation that loads the Territories bridge table.
The corresponding data flow for Integration Services is shown in Fig. 8.34 .
The flow starts by obtaining the assignment of employees to territories from
the Northwind database using an SQL query. In the case of Kettle, there is
no step that removes the trailing spaces in the TerritoryDescription column.
This was taken into account in the SQL query of the subsequent lookup step
as follows:
SELECT CityKey
FROM City
WHERE TRIM(?) = CityName
Note that a similar solution could have been applied in Integration Services.
After the lookup of the CityKey , Kettle requires a sort process prior to
the unique rows step. Compared with the corresponding implementation in
Search WWH ::




Custom Search