store a wide variety of data types, process analytic queries through MapReduce,
and predictably scale with increased data volumes is a very attractive solution
set for Big Data analytics. Netezza's ability to embed complex non-SQL
algorithms in the processing elements of its MPP stream, without the typical
intricacies of Hadoop programming, enables low-latency access to high
volumes of structured data that can be integrated with a wide variety of
enterprise BI and ETL tools. These principles make Netezza the ideal platform
for the convergence of data warehousing and advanced analytics. To leverage
the best of both worlds, Netezza offers several options for connecting to a
Hadoop cluster (we cover Hadoop in Chapter 5).
Because the IBM Big Data platform is so flexible, we thought it would be
worthwhile to discuss some typical scenarios that use Hadoop in conjunction
with Netezza.
Exploratory Analysis
Sometimes an organization encounters a new source of data that needs to be
analyzed. It might have little to no knowledge about the format of this new
data source, the data types that it contains, or the relationships it
encapsulates. For example, suppose that the marketing department has launched
a new multichannel campaign and wants to integrate responses from Facebook and
Twitter with its other sources of data. If you've never used the Facebook or
Twitter APIs, or are not familiar with their data feed structures, it might
take some experimentation (data discovery) to figure out what to extract from
these feeds and how to integrate it with other sources.
Hadoop's ability to process data feeds for which a schema has not yet been
defined is ideal in this scenario. So if you want to explore relationships within
data, especially in an environment where the schema is constantly evolving,
Hadoop provides a mechanism by which you can explore the data until a
formal, repeatable ETL process is defined. After that process is defined and
structure is established, the data can be loaded into Netezza for standardized
reporting or ad-hoc analysis.
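To make the data discovery step concrete, here is a minimal sketch in plain Python (not Hadoop or Netezza code) that profiles a feed of JSON records whose schema is unknown. The sample records, the field names, and the profile_records helper are all hypothetical and exist only for illustration; in practice the same per-record logic would typically run as a job over the full feed.

```python
import json
from collections import Counter, defaultdict


def profile_records(lines):
    """Count, per field, how often each JSON value type appears.

    Lines that are not valid JSON objects are skipped, since an
    unexplored feed often contains malformed or unexpected rows.
    """
    field_types = defaultdict(Counter)
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed row: ignore during exploration
        if not isinstance(record, dict):
            continue  # only JSON objects carry named fields
        total += 1
        for key, value in record.items():
            field_types[key][type(value).__name__] += 1
    return total, {key: dict(counts) for key, counts in field_types.items()}


# Hypothetical sample feed; real social media payloads will differ.
sample_feed = [
    '{"user": "alice", "channel": "twitter", "likes": 3}',
    '{"user": "bob", "channel": "facebook", "likes": "12", "shares": 1}',
    'not json',
    '{"user": "carol", "channel": "twitter"}',
]

total, profile = profile_records(sample_feed)
for field, types in sorted(profile.items()):
    seen = sum(types.values())
    print(f"{field}: {types} (present in {seen} of {total} records)")
```

A profile like this shows which fields are optional and which carry inconsistent types (here, likes arrives as both a number and a string), which is the kind of finding that feeds into the formal ETL definition before the data is loaded into a warehouse.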
Hadoop Is the New Tape: The Queryable Archive
Big Data analytics tends to bring large volumes of data under investigation.
Quite often we find that a significant percentage of this data might not be of
interest on a regular basis. Such data might be historical in nature, or it
might be very granular data that has subsequently been summarized within the data