To Govern or Not to Govern: Governance in a Big Data World - Harness the Power of Big Data

low-latency integration requirements. Data Replication offers sophisticated functionality for high-speed data movement, conflict detection, and system monitoring, along with a graphical development environment for designing integration tasks. Furthermore, it's integrated with the set of IBM PureData Systems for high-speed data loading and synchronization, and with Information Server to accumulate changes and move data in bulk to a target system.

Federation involves accessing data that's stored in federated repositories through a federated query. This is often used to retrieve data from multiple systems or to augment data that's stored in one system with information from another system. IBM InfoSphere Federation Server (Federation Server) accesses and integrates data from a diverse set of structured data sources, regardless of where they reside. It enables hybrid data warehousing by joining data from multiple repositories, and it can also expose information as a service (IaaS) via InfoSphere Information Services Director. From a federation perspective, it's important to be able to federate search (as with the Data Explorer technology we outlined in Chapter 7) across your enterprise assets, as well as to federate queries through an API such as SQL. In the future, Federation Server may integrate with Data Explorer to provide structured data source search and query within an overall federated search and discovery capability for Big Data (structured and unstructured).
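To make the idea of a federated query concrete, here is a minimal sketch in Python. It is not Federation Server and does not use any IBM API; it assumes SQLite's `ATTACH DATABASE` as a stand-in for a federation layer, and the `customers` and `orders` tables, the `warehouse` alias, and the function name are all hypothetical. Two independent in-memory databases play the role of separate repositories, and a single SQL statement joins across them.

```python
import sqlite3


def federated_customer_orders():
    """Join data held in two separate databases with one SQL query."""
    # One connection stands in for the federation layer (an assumption
    # made for illustration only).
    conn = sqlite3.connect(":memory:")
    # Attach a second, independent in-memory database to act as a
    # separate repository under the hypothetical alias "warehouse".
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

    # Repository 1: an operational store holding customer records.
    conn.execute(
        "CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )

    # Repository 2: a warehouse holding order totals.
    conn.execute(
        "CREATE TABLE warehouse.orders (customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO warehouse.orders VALUES (?, ?)",
        [(1, 120.0), (1, 30.0), (2, 75.5)],
    )

    # The federated query: augment customer data from one repository
    # with order totals from the other.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.total)
        FROM main.customers AS c
        JOIN warehouse.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in federated_customer_orders():
        print(name, total)
```

The caller sees one query and one result set, even though the rows come from two databases. A production federation product would add remote connectivity, query pushdown, and source-specific optimization, none of which this sketch attempts.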

We believe that organizations shouldn't try to deliver enterprise integration solely with Hadoop; rather, they should leverage mature data integration technologies to help speed their deployments of Big Data, whenever that makes sense. There's a huge gap between a general-purpose tool and a purpose-built one, not to mention that integration involves many aspects other than the delivery of data, such as discovery, profiling, metadata, and data quality. We recommend that you consider using IBM Information Server with your Big Data projects to optimize the loading (via bulk load or replication) of high-volume structured data into a data warehouse, extending it with federation when required; the loading of structured or semistructured data into Hadoop; and the collecting of information that's filtered and analyzed by stream analytics. You can then load the data into a relational system (such as a data warehouse), federate queries across relational databases as part of Big Data federation and navigation, and replicate data sources to a Hadoop cluster or another data warehouse.
