Database Reference
In-Depth Information
lisions/#comments ). In short, the odds of collision with Checksums are substan-
tial, and you should not use the Checksum function for change detection.
So, what can you use?
Detection via Hashbytes
One good alternative to the Checksum function is the Hashbytes function. Like Check-
sum, Hashbytes provides value hashing for a string value or variable. Checksum re-
turns an integer value; Hashbytes returns a binary value. Checksum uses an internal al-
gorithm to calculate the hash; Hashbytes uses standard encryption algorithms. The
sheer number of values available to each function is one reason Hashbytes is a better
choice. Checksum's int data type can return +/-231 values, whereas Hashbytes can re-
turn +/-2127 values for MD2, MD4, and MD5 algorithms and +/-2159 values for SHA
and SHA1 algorithms.
Brute Force Detection
Believe it or not, a “brute force” value comparison between sources and destinations
remains a viable option for change detection. How does it work? You acquire the des-
tination values by way of either a second Source component or a Lookup transform in
the SSIS Data Flow task. You match the rows in the source and destination by using an
alternate (or business) key—a value or combination of values that uniquely identifies
the row in both source and destination—and then compare the non-key column values
in the source row to the non-key values in the destination row.
Remember, you are attempting to isolate changes. It is assumed that you have sep-
arated the new rows—data that exists in the source and not in the destination—and per-
haps you have even detected deleted rows that exist in the destination but are no longer
found in the source. Changed and unchanged rows remain. Unchanged rows are just
that: the alternate keys align, as does every other value in each source and destination
column. Changed rows, however, have identical alternate keys and one or more differ-
ences in the source and destination columns. Comparing the column values and ac-
counting for the possibility of NULL s remains an option.
Historical Load
Search WWH ::




Custom Search