Figure 1-2. A general data warehouse system
As you can see in Figure 1-2, ETL (extraction, transformation, and loading of the data) feeds arrive at the
staging schema of the warehouse and are loaded in their current raw format into staging area tables. The data is
then transformed and moved to the data vault, which contains all the data in the repository. That data might be
filtered, cleaned, enriched, and restructured. Lastly, the data is loaded into the BI, or Business Intelligence, schema
of the warehouse, where the data could be linked to reference tables. It is at this point that the data is available to
the business via reporting tools and ad hoc reports. Figure 1-2 also illustrates the scheduling and monitoring tasks.
Scheduling controls when feeds are run and the relationships between them, while monitoring determines whether
the feeds have run and whether errors have occurred. Note also that scheduled feeds can be inputs to the system, as
well as outputs.
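The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system shown in Figure 1-2: the in-memory warehouse dictionary, the REGIONS reference table, and the function names load_staging, load_vault, load_bi, and run_feeds are all hypothetical. Scheduling is modelled as a dependency graph between feeds, and monitoring as a status recorded for each feed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical in-memory "schemas"; a real warehouse would use database tables.
warehouse = {"staging": [], "vault": [], "bi": []}
REGIONS = {"uk": "United Kingdom", "nz": "New Zealand"}  # stand-in reference table


def load_staging(raw_rows):
    """Load feed rows into staging in their raw format, unchanged."""
    warehouse["staging"] = list(raw_rows)


def load_vault():
    """Filter, clean, and restructure staged rows, then move them to the vault."""
    cleaned = []
    for line in warehouse["staging"]:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # filter out malformed rows
        cleaned.append({"region": parts[0].lower(), "amount": int(parts[1])})
    warehouse["vault"] = cleaned


def load_bi():
    """Link vault rows to the reference table so they are ready for reporting."""
    warehouse["bi"] = [
        {**row, "region_name": REGIONS.get(row["region"], "Unknown")}
        for row in warehouse["vault"]
    ]


def run_feeds(raw_rows):
    """Run the feeds in dependency order and record a status for each one."""
    tasks = {
        "staging": lambda: load_staging(raw_rows),
        "vault": load_vault,
        "bi": load_bi,
    }
    # Scheduling: each feed maps to the set of feeds it depends on.
    depends_on = {"staging": set(), "vault": {"staging"}, "bi": {"vault"}}
    status = {}
    for name in TopologicalSorter(depends_on).static_order():
        # Monitoring: skip a feed if anything it depends on did not succeed.
        if any(status.get(dep) != "ok" for dep in depends_on[name]):
            status[name] = "skipped"
            continue
        try:
            tasks[name]()
            status[name] = "ok"
        except Exception as exc:
            status[name] = f"failed: {exc}"
    return status
```

Calling run_feeds(["UK, 10", "nz,5", "bad row"]) leaves the three raw lines untouched in staging, moves two cleaned rows to the vault, and adds a region name to each row in the BI layer. If a feed raises an error, the feeds that depend on it are marked as skipped rather than run against stale data.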
Note
The data movement flows from extraction from raw sources, to loading, to staging and transformation, and to
the data vault and the BI layer. The acronym for this process is ELT (extract, load, transform), which better captures what is
happening than the common term ETL.
Many features of this data warehouse system can scale up to and be useful in a big data system. Indeed, the
big data system could feed data to data warehouses and datamarts. Such a big data system would need extraction,
loading, and transform feeds, as well as scheduling, monitoring, and perhaps the data partitioning that a data
warehouse uses, to separate the stages of data processing and access. By adding a big data repository to an IT
architecture, you can extend future possibilities to mine data and produce useful reports. Whereas currently you
might filter and aggregate data to make it fit a datamart, the new architecture allows you to store all of your raw data.
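The contrast between a filtered, aggregated datamart and a repository that keeps all raw data can be shown with a small Python sketch. The records in raw_events and the function build_datamart are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical raw feed records, kept in full in the big data repository.
raw_events = [
    {"store": "north", "product": "tea", "amount": 4.0, "valid": True},
    {"store": "north", "product": "coffee", "amount": 2.5, "valid": True},
    {"store": "south", "product": "tea", "amount": 3.0, "valid": True},
    {"store": "south", "product": "tea", "amount": 9.0, "valid": False},
]


def build_datamart(events):
    """Filter out invalid records and aggregate sales totals per store."""
    totals = defaultdict(float)
    for event in events:
        if event["valid"]:
            totals[event["store"]] += event["amount"]
    return dict(totals)
```

The datamart view keeps only one total per store, so product-level detail and the rejected record are no longer available from it. Because raw_events is retained unchanged, a different aggregation (for example, per product) can still be derived later.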
So where would a big data system fit in terms of other systems a large organization might have? Figure 1-3
represents its position in general terms, for there are many variations on this, depending on the type of company and
its data feeds.
 
 