Database Reference
In-Depth Information
One Package = One Unit of Work
With respect to data warehouse ETL, I've found that the best solution in most cases is
to break apart logical units of work into separate packages in which each package does
just one thing. By splitting up the workload, you can avoid a number of potential snags
and increase your productivity as an ETL developer. Some of the considerations for
smaller SSIS packages include the following:
Less time spent waiting on design-time validation: SQL Server
Data Tools has a rich interface that provides, among other things, a
near real-time evaluation of potential metadata problems in the SSDT
designer. If, for example, a table that is accessed by the SSIS package
is changed, the developer will be presented with a warning (or an er-
ror, if applicable) in SSDT indicating that metadata accessed by the
package has changed. This constant metadata validation is beneficial
in that it can help to identify potential problems before they are
pushed out for testing. There's also a performance cost associated
with this. The length of time required for validation increases as the
size of the package increases, so naturally keeping the packages as
small as practical will cut down on the amount of time you're drum-
ming your fingers on your desk waiting for validation to complete.
Easier testing and deployment: A single package that loads, say, ten
dimensions has a lot of moving parts. When you are developing each
of the dimensions within the package, there is no easy way to test just
one element of the package (apart from manually running it in the
SSDT designer, which isn't a completely realistic test for a package
that will eventually be deployed to the server). The only realistic test
for such a package is to test the entire package as a server-based exe-
cution, which may be overkill if you're only interested in one or two
changed properties. Further, it's not uncommon for organizations with
formal software testing and promotion procedures to require that the
entire thing be retested, not just the new or changed elements. By
breaking up operations into smaller units, you can usually make test-
ing and deployment less of a burden because you are only operating
on one component at a time.
Distributed development: If you work in an environment where you
are the only person developing SSIS packages, this is less of a con-
cern. However, if your shop has multiple ETL developers, those do-
Search WWH ::




Custom Search