Figure 10-33. Schema window for trm1 job
I use the green plus icon at the bottom left of the window to manually specify the column names and data types.
I try to make the names meaningful so that they accurately represent the data they contain. I do not add any keys,
and my data does not contain null values, so I leave those fields blank. The schema for the raw prices file from the
tPigLoad_2 step shows just three columns, the last of which is the vehicle price (see Figure 10-34).
Figure 10-34. Three-column setup for trm1 job
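To make the idea of a manually specified schema concrete, here is a minimal Python sketch of what a three-column schema does to a raw text file: it attaches a name and a type to each field. This is an illustration only, not what Talend generates. The column names (manufacturer, model, price), the comma delimiter, and the sample rows are all assumptions made for the example; the text above states only that there are three columns and that the last one is the vehicle price.

```python
import csv
import io

# Hypothetical schema: names and types are assumptions for illustration.
# The source only says there are three columns with the price last.
PRICE_SCHEMA = [("manufacturer", str), ("model", str), ("price", float)]


def parse_rows(text, schema, delimiter=","):
    """Turn delimited text into a list of dicts typed according to schema."""
    rows = []
    for record in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not record:
            continue  # skip blank lines
        if len(record) != len(schema):
            raise ValueError(
                f"expected {len(schema)} fields, got {len(record)}: {record}"
            )
        rows.append(
            {name: cast(value.strip()) for (name, cast), value in zip(schema, record)}
        )
    return rows


# Invented sample rows standing in for the raw prices file.
sample_prices = "MakerA,ModelX,10000\nMakerB,ModelY,20000\n"
prices = parse_rows(sample_prices, PRICE_SCHEMA)
```

Because every field is cast when the row is read, a row with a missing or non-numeric price fails immediately instead of surfacing later in the join.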
The tPigMap_1 step takes the data from the loaded rawdata.txt and rawprices.txt files and combines them on
the manufacturer and model names. It then outputs the combined schema as a data flow called “out1,” shown in
Figure 10-35. The arrows in Figure 10-35 show the flow of columns between the incoming and outgoing data sources.
The schemas at the bottom of the window show the incoming and outgoing data. Note that the example does not map all
the incoming columns, because in a later step I will filter out the columns that are not needed from the resulting data set.
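The logic of the mapping step can be sketched in plain Python: match rows from the two inputs on manufacturer and model, then pass only the selected columns to the output flow. This is a rough illustration, not the code the job generates. It assumes an inner join (the text does not state the join type), and the column names other than the two join keys, as well as the sample rows, are invented for the example.

```python
def join_on_make_model(raw_rows, price_rows, keep):
    """Inner-join two row lists on (manufacturer, model), keeping chosen columns."""
    # Index the price rows by the two-part join key.
    price_index = {}
    for price_row in price_rows:
        key = (price_row["manufacturer"], price_row["model"])
        price_index.setdefault(key, []).append(price_row)

    out1 = []
    for raw_row in raw_rows:
        key = (raw_row["manufacturer"], raw_row["model"])
        # Rows without a matching key are dropped (inner-join assumption).
        for price_row in price_index.get(key, []):
            merged = {**raw_row, **price_row}
            # Map only the selected columns into the output flow.
            out1.append({column: merged[column] for column in keep})
    return out1


# Invented sample data; "year" and "engine_size" are hypothetical columns.
raw_data = [
    {"year": 2014, "manufacturer": "MakerA", "model": "ModelX", "engine_size": 2.0},
    {"year": 2014, "manufacturer": "MakerB", "model": "ModelY", "engine_size": 1.6},
]
raw_prices = [
    {"manufacturer": "MakerA", "model": "ModelX", "price": 10000.0},
]

out1 = join_on_make_model(
    raw_data, raw_prices, keep=["manufacturer", "model", "engine_size", "price"]
)
```

Here the "year" column is deliberately left out of the output, mirroring the point that not every incoming column needs to be mapped.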