• Hierarchy Validation Data Profiling: Hierarchy validation data
profiling validates the hierarchies used for aggregations. A hierarchy
dimension defines the relationships between its attributes. For
example, the geography dimension contains country, state, county,
and city: a city is located in a county, a county in a state, and a
state in a country. Different groups often want to see different
hierarchies. For example, they might want to see total sales
aggregated, or rolled up, in different ways. If the organization has
made a major acquisition, the groups might want one hierarchy with
the acquisition and one without it so that they can compare apples
to apples.
• Data Enrichment Data Profiling: Data enrichment data profiling
validates that the process of adding or supplementing data from other
sources is correct. For example, we have a physician's demographic
information and use it to search for and obtain the physician's
license number. Profiling verifies that the license number retrieved
belongs to that physician.
• Matching Validation Data Profiling: Matching validation data
profiling validates that the matching process is correct. For example,
we would verify that Michael Smith, Mike Smith, and Mikey Smith
are the same person in the Master Patient Index.
• Dependency Data Profiling: Dependency data profiling investigates
the relationships between columns. If we discover that one column
is completely dependent upon another column, we may go back and
redesign the data model or verify that the data is correct. For
example, birth_date must always be earlier than death_date. So, if a
record shows a person who died before being born, we have a data
error.
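The birth_date and death_date rule in the dependency bullet can be expressed as a simple column-to-column check. The following is a minimal sketch, not the method of any particular profiling tool. It assumes that records arrive as dictionaries and that a missing death_date means the person is still alive; the function name and the sample records are hypothetical.

```python
from datetime import date


def find_date_violations(records):
    """Return the records whose death_date is earlier than birth_date."""
    violations = []
    for rec in records:
        birth = rec.get("birth_date")
        death = rec.get("death_date")
        # Assumption: a missing date cannot be compared, so it is skipped here.
        if birth is None or death is None:
            continue
        if death < birth:
            violations.append(rec)
    return violations


# Hypothetical sample data; record 3 violates the rule on purpose.
people = [
    {"id": 1, "birth_date": date(1950, 3, 14), "death_date": date(2010, 7, 1)},
    {"id": 2, "birth_date": date(1980, 5, 2), "death_date": None},
    {"id": 3, "birth_date": date(1975, 9, 30), "death_date": date(1970, 1, 1)},
]

bad_ids = [rec["id"] for rec in find_date_violations(people)]
print(bad_ids)
```

Records flagged this way still need a human decision: the error may lie in either date, so the check only points to where to look.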
There are many data profiling tools, one of which is Oracle® Datanomics.
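Even without a dedicated tool, the core of hierarchy validation can be approximated in a few lines. The sketch below checks that every child value in the geography hierarchy rolls up to exactly one parent, which is what an aggregation needs to avoid double counting. It assumes that each name identifies exactly one place in the sample (real data would normally use surrogate keys); the function name and the rows are hypothetical.

```python
from collections import defaultdict


def find_ambiguous_parents(rows, child, parent):
    """Map each child value that has more than one parent to its sorted parents."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[child]].add(row[parent])
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


# Hypothetical rows; the last one assigns Travis county to the wrong state on purpose.
geography = [
    {"city": "Austin", "county": "Travis", "state": "TX", "country": "US"},
    {"city": "Pflugerville", "county": "Travis", "state": "TX", "country": "US"},
    {"city": "Dallas", "county": "Dallas", "state": "TX", "country": "US"},
    {"city": "Manor", "county": "Travis", "state": "OK", "country": "US"},
]

# Check each adjacent pair of levels, from the lowest level upward.
levels = ["city", "county", "state", "country"]
problems = {}
for child, parent in zip(levels, levels[1:]):
    ambiguous = find_ambiguous_parents(geography, child, parent)
    if ambiguous:
        problems[child] = ambiguous

print(problems)
```

Alternative hierarchies, such as one with and one without an acquisition, could be validated the same way by running the check once per hierarchy definition.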
DATA LIFE CYCLE
Defining data life cycle rules is an important part of data governance.
Data life cycle rules answer questions, such as:
• What is the length of time that the data needs to be accessible?
• If the data is archived, what is the service level agreement for
how quickly we can retrieve it?
 