Combining HDInsight with Your Business Processes
Big Data solutions open up new opportunities for turning data into meaningful information. They can also be used
to extend existing information systems to provide additional insights through analytics and data visualization. Every
organization is different, so there is no definitive list of ways you can use HDInsight as part of your own business
processes. However, there are four general architectural models. Understanding these will help you start making
decisions about how best to integrate HDInsight with your organization, as well as with your existing BI systems and
tools. The four models are:

A data collection, analysis, and visualization tool: This model is typically chosen for
handling data you cannot process using existing systems. For example, you might want to
analyze sentiment about your products or services from microblogging sites like Twitter,
social media like Facebook, customer feedback received through email, web pages, and so
forth. You can then combine this information with other data, such as demographic data
that indicates population density and other characteristics in each city where your products
are sold.

A data-transfer, data-cleansing, and ETL mechanism: HDInsight can be used to extract
and transform data before you load it into your existing databases or data-visualization tools.
HDInsight solutions are well suited to categorizing and normalizing data, and to extracting
summary results that remove duplication and redundancy. This is typically referred to as an
Extract, Transform, and Load (ETL) process.

A basic data warehouse or commodity-storage mechanism: You can use HDInsight to store
both the source data and the results of queries executed over this data. You can also store
schemas (or, to be precise, metadata) for tables that are populated by the queries you execute.
These tables can be indexed, although there is no formal mechanism for managing key-based
relationships between them. Even so, you can create data repositories that are robust and
reasonably inexpensive to maintain, which is especially useful if you need to store and manage
huge volumes of data.

An integration with an enterprise data warehouse and BI system: Enterprise-level data
warehouses have some special characteristics that differentiate them from simple database
systems, so there are additional considerations for integrating them with HDInsight. You can
also integrate at different levels, depending on the way you intend to use the data obtained
from HDInsight.

Figure 1-5 shows a sample HDInsight deployment as a data collection and analytics tool.
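
To make the ETL model described above more concrete, here is a minimal sketch of the transform step: normalization, duplicate removal, categorization, and a summary count. It is plain Python run locally, not HDInsight code; on a real cluster this logic would more likely be expressed as a Hive query, a Pig script, or a MapReduce job. The record fields, the sample data, and the rating thresholds are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical raw feedback records; the field names are illustrative only.
RAW_RECORDS = [
    {"customer": " Alice ", "city": "seattle", "rating": "5"},
    {"customer": "alice", "city": "Seattle", "rating": "5"},  # duplicate once normalized
    {"customer": "Bob", "city": "PORTLAND", "rating": "2"},
    {"customer": "Carol", "city": "Seattle ", "rating": "3"},
]


def normalize(record):
    """Trim whitespace, unify casing, and convert the rating to an integer."""
    return {
        "customer": record["customer"].strip().lower(),
        "city": record["city"].strip().title(),
        "rating": int(record["rating"]),
    }


def categorize(rating):
    """Map a numeric rating to a coarse category (thresholds are assumed)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def transform(records):
    """Normalize each record, skip duplicates, and attach a category."""
    seen = set()
    cleaned = []
    for raw in records:
        rec = normalize(raw)
        key = (rec["customer"], rec["city"], rec["rating"])
        if key in seen:
            continue  # an identical record was already kept
        seen.add(key)
        rec["category"] = categorize(rec["rating"])
        cleaned.append(rec)
    return cleaned


def summarize(cleaned):
    """Count cleaned records per (city, category) pair."""
    counts = defaultdict(int)
    for rec in cleaned:
        counts[(rec["city"], rec["category"])] += 1
    return dict(counts)


cleaned = transform(RAW_RECORDS)
summary = summarize(cleaned)
```

In this sketch, `cleaned` stands in for the rows you would load into an existing database, and `summary` for the kind of aggregated result you might pass to a visualization tool.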
 