Cleansing Data in the Data Flow
The following section contains design patterns for cleansing data in the SSIS data flow
using the DQS Cleansing transform. There are two key issues to keep in mind when
cleansing data:
• The cleansing process is based on the rules within your knowledge base.
The better the cleansing rules are, the more accurate your cleansing
process will be. You may want to reprocess your data as the rules in your
knowledge base improve.
• Cleansing large amounts of data can take a long time. See the
“Performance Considerations” section later in this chapter for patterns
that you can apply to reduce overall processing time.
Handling the Output of the DQS Cleansing Transform
The DQS Cleansing transform adds a number of new columns to the data flow (as
described earlier in this chapter). The way you'll handle the processed rows will
usually depend on the status of the row, which is set in the Record Status column.
A Conditional Split transformation can be used to redirect rows down the appropriate
data flow path. Figure 5-11 shows what the Conditional Split transformation would
look like with a separate output for each Record Status value. Table 5-4 contains
a list of possible status values.
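
The routing logic of such a Conditional Split can be sketched outside SSIS. The Python below is only an illustration of the idea, not SSIS code: the function name split_by_record_status and the sample rows are hypothetical, and the status names (Correct, Corrected, Auto Suggest, New, Invalid) are assumed here and should be checked against Table 5-4.

```python
from collections import defaultdict

# Assumed Record Status values; confirm the exact names against Table 5-4.
KNOWN_STATUSES = {"Correct", "Corrected", "Auto Suggest", "New", "Invalid"}


def split_by_record_status(rows, status_column="Record Status"):
    """Group rows into one output per Record Status value.

    Rows whose status is missing or unrecognized go to a "Default"
    output, similar to the default output of a Conditional Split.
    """
    outputs = defaultdict(list)
    for row in rows:
        status = row.get(status_column)
        key = status if status in KNOWN_STATUSES else "Default"
        outputs[key].append(row)
    return dict(outputs)


# Hypothetical sample rows standing in for the transform's output.
sample_rows = [
    {"City": "Seattle", "Record Status": "Correct"},
    {"City": "Seatle", "Record Status": "Corrected"},
    {"City": "Xyzzy", "Record Status": "Invalid"},
]

outputs = split_by_record_status(sample_rows)
for status, grouped in outputs.items():
    print(status, len(grouped))
```

Each key in the returned dictionary plays the role of one Conditional Split output, so a downstream step can load the rows it trusts and send the rest (for example, the Invalid output) for review.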
 
 