warehouse. Putting all of this data within an infrastructure that is primarily
optimized for performance might not be economically viable. For this reason,
enterprises might want to optimize their analytics footprint and store the
less frequently accessed data on infrastructure that's optimized for price per
terabyte of storage, and leverage the higher-performing infrastructure only
as needed.
Since Hadoop's fault-tolerant distributed storage system runs on commodity
hardware, it could serve as a repository for this type of data. Unlike tape-based
storage systems that have no computational capability, Hadoop provides
a mechanism to access and analyze data. Because moving computation is
cheaper than moving data, Hadoop's architecture is better suited as a queryable
archive for Big Data. In fact, the number one use case that countless enterprise
customers have described to us involves the architecture of “hot-cold”
data storage schemes between Netezza and Hadoop, where the most actively
used data is warehoused in Netezza, and everything else is archived in Hadoop.
Of course, in addition to the native connectors that are built into Hadoop,
IBM provides a whole information integration and governance platform
(which we cover in Chapters 10 and 11) that can help to facilitate and govern
this process.
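
As a rough illustration of the hot-cold scheme described above, the following Python sketch assigns table partitions to a tier according to how recently they were accessed. The Partition type, the 90-day window, and the function names are assumptions made for this example; they are not part of Netezza, Hadoop, or any IBM tool, and a real policy would likely weigh query frequency and storage cost as well.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Partition:
    """Hypothetical description of one table partition."""
    name: str
    last_accessed: date
    size_tb: float


def assign_tier(partition, today, hot_window_days=90):
    """Return 'hot' if the partition was accessed within the window, else 'cold'."""
    age = today - partition.last_accessed
    return "hot" if age <= timedelta(days=hot_window_days) else "cold"


def plan_tiers(partitions, today, hot_window_days=90):
    """Group partition names by tier: 'hot' stays in the warehouse, 'cold' is archived."""
    plan = {"hot": [], "cold": []}
    for p in partitions:
        plan[assign_tier(p, today, hot_window_days)].append(p.name)
    return plan
```

In this sketch, the "hot" list stands for data that stays in the warehouse and the "cold" list for data that is archived to the lower-cost store. Moving the data between the two systems is outside the scope of the example.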
Unstructured Data Analysis
Relational data warehouses provide limited capabilities for storing complex
data types and unstructured data. What's more, performing computations on
unstructured data through SQL can be quite cumbersome and limited. Hadoop's
ability to store data in any format and analyze that data using a procedural
programming paradigm, such as MapReduce, makes it well suited for storing,
managing, and processing unstructured data. You could use Hadoop to
preprocess unstructured data, extract key features and metadata, and then
load that data into your Netezza data warehouse for further analysis.
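
The preprocessing step described above can be sketched in plain Python, in the style of a MapReduce job: a map function extracts terms from free text, a reduce function sums the counts per document and term, and the result is written out as delimited rows that a warehouse bulk loader could ingest. This is only a minimal sketch that runs in a single process; the function names, the tokenizing rule, and the pipe-delimited layout are assumptions made for illustration, and no Hadoop or Netezza API is used.

```python
import re
from collections import defaultdict

# Assumed tokenizing rule: runs of lowercase letters and apostrophes.
TOKEN = re.compile(r"[a-z']+")


def map_document(doc_id, text):
    """Map step: emit ((doc_id, term), 1) for every term in the text."""
    for term in TOKEN.findall(text.lower()):
        yield (doc_id, term), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each (doc_id, term) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def to_delimited_rows(totals, sep="|"):
    """Format the structured result as delimited rows, sorted by key."""
    return [
        sep.join((doc_id, term, str(n)))
        for (doc_id, term), n in sorted(totals.items())
    ]
```

For example, feeding the two short documents "Slow slow delivery" (d1) and "Great delivery" (d2) through these three functions yields the rows d1|delivery|1, d1|slow|2, d2|delivery|1, and d2|great|1, which are structured records suitable for relational analysis.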
Customers' Success Stories: The Netezza Experience
It's a fact: hundreds of organizations have escaped the frustrations of their
first-generation data warehouses by replacing older database technologies
with IBM Netezza's data warehousing and analytic appliances. The systems