together to understand the choice, come to an agreement, and commit to
the direction.
New perspectives on the data: As part of building the data warehouse,
the data may be looked at in ways that have never been attempted before.
As the business and the marketplace evolve, new requirements also
emerge. These may result in identification of anomalies and challenges
in the underlying source systems. As new perspectives are defined, the
data may not yield meaningful or helpful results. For example, it may
take several iterations to define the business rule to identify long-term
customers. How many years are needed to consider someone a long-term
customer? What if the customer leaves and then comes back again — do
you count all of the years or just consecutive years?
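
The two readings of the long-term customer rule can be made concrete with a short sketch. Everything here is illustrative: the function names, the five-year threshold, and the sample history are assumptions, not rules taken from any particular business.

```python
from typing import Iterable


def total_years(active_years: Iterable[int]) -> int:
    """Count every distinct year in which the customer was active."""
    return len(set(active_years))


def longest_consecutive_years(active_years: Iterable[int]) -> int:
    """Return the length of the longest unbroken run of active years."""
    best = run = 0
    previous = None
    for year in sorted(set(active_years)):
        # Extend the current run only if this year directly follows the last one.
        run = run + 1 if previous is not None and year == previous + 1 else 1
        best = max(best, run)
        previous = year
    return best


def is_long_term(active_years, threshold_years=5, consecutive_only=False):
    """Apply one of the two candidate rules; the threshold is a hypothetical value."""
    if consecutive_only:
        tenure = longest_consecutive_years(active_years)
    else:
        tenure = total_years(active_years)
    return tenure >= threshold_years


# Hypothetical customer who left after 2012 and came back in 2015.
history = [2010, 2011, 2012, 2015, 2016, 2017]
print(is_long_term(history))                         # counts all six years
print(is_long_term(history, consecutive_only=True))  # counts only the longest run of three
```

The same customer qualifies under one rule and not under the other, which is why the business has to settle the definition before it can be built into the ETL system.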
Lack of business input: There are also many demands on the business
community. Business analysts who work with the data today are often
swamped with demands for data, reports, and analyses. In addition
to a full load of regular reporting, there are often ad hoc requests
that need attention. However, these analysts have the most knowledge
about how data is manipulated and which calculations are currently
used. Sometimes, the team must go through a series of meetings before
identifying the person who can really help.
A lot of hard work: Often, developing the ETL system takes considerable time
and money simply because there is a great deal of work to be done. This
can be due to the complexity and volume of data or to a lack of
well-structured and reliable data to begin with. If the team is
pulling data from two data sources, the integration work is much smaller
than if it is integrating data from ten different systems.
Insufficient data profiling: Whether the organization is using a sophisticated
data profiling tool or performing detailed data analysis by running
queries against the database, it is important to understand what data
is being stored. When the actual data contents have not been studied,
problems will arise during ETL development. The types of data problems
that emerge include finding that the data element is empty or that the
contents of a data element are not what was expected. Thorough study of
the data prior to designing and building the ETL system should obviate
these issues.
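
As a rough illustration of profiling by running queries against the database, the sketch below checks a single column for empty values and unexpected contents. It assumes an in-memory SQLite database and a hypothetical `customer` table with a `state` column; the helper names `profile_column` and `value_frequencies` are invented for this example.

```python
import sqlite3


def profile_column(conn, table, column):
    """Return row count, empty count, and distinct-value count for one column.

    The table and column names are interpolated into the SQL text, so only
    trusted identifiers should be passed in.
    """
    query = f"""
        SELECT COUNT(*),
               SUM(CASE WHEN {column} IS NULL OR TRIM({column}) = ''
                        THEN 1 ELSE 0 END),
               COUNT(DISTINCT {column})
        FROM {table}
    """
    total, empty, distinct = conn.execute(query).fetchone()
    # SUM returns NULL for a table with no rows, so fall back to zero.
    return {"total_rows": total, "empty_rows": empty or 0, "distinct_values": distinct}


def value_frequencies(conn, table, column, limit=10):
    """List the most common values so unexpected contents stand out."""
    query = (
        f"SELECT {column}, COUNT(*) AS n FROM {table} "
        f"GROUP BY {column} ORDER BY n DESC, {column} LIMIT ?"
    )
    return conn.execute(query, (limit,)).fetchall()


# Hypothetical sample data: the same state appears in three spellings,
# and two rows carry no usable value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "CA"), (2, "ca"), (3, ""), (4, None), (5, "California"), (6, "CA")],
)

stats = profile_column(conn, "customer", "state")
frequencies = value_frequencies(conn, "customer", "state")
print(stats)
print(frequencies)
```

Even on this small sample the counts surface both kinds of problem named above: a third of the rows are empty, and the populated rows do not follow one consistent coding.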
Indirect communication: In day-to-day work, questions
often arise for which the team needs input from another IT resource
or someone from the business community. If the project team must go
through a liaison to gain access to other IT or business people, this can
greatly lengthen the project schedule. While it may not seem like a big