to shrink and data volumes continue to grow, putting more stress on batch integration
performance. Real-time integration involves low-latency integration and is often used
in conjunction with complex event processing to enable real-time reporting and analysis.
Federation is a completely different approach: it leaves data in the source systems and
accesses it on demand through federated queries.
These three styles of integration should not be independent from one another. They
should share a common foundation that establishes consistency in data. The process
should be governed by enterprise data management principles such as data profiling,
data quality assurance, improving the accuracy and completeness of data, tracking
its lineage, and exposing enterprise metadata to facilitate integration. By applying a
common approach to all three styles of integration, you can build a common
foundation for information trust, with common rules for data quality, metadata,
lineage, and governance.
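The idea of one rule set shared by all three styles can be sketched as follows. This is
only an illustration under stated assumptions: the rule names, the record fields
(customer_id, email), and the function names are hypothetical and do not come from any
particular integration product.

```python
from typing import Callable, Iterable

Record = dict[str, str]
Rule = Callable[[Record], bool]

# Hypothetical shared data quality rules, defined once for every integration style.
RULES: dict[str, Rule] = {
    "has_customer_id": lambda r: bool(r.get("customer_id", "").strip()),
    "email_has_at_sign": lambda r: "@" in r.get("email", ""),
}


def violations(record: Record) -> list[str]:
    """Return the names of the shared rules that this record fails."""
    return [name for name, rule in RULES.items() if not rule(record)]


def bulk_load(batch: Iterable[Record]) -> list[Record]:
    """Bulk style: keep only the records of a batch that pass every rule."""
    return [r for r in batch if not violations(r)]


def on_event(event: Record) -> bool:
    """Real-time style: accept or reject a single incoming record."""
    return not violations(event)


def federated_query(
    sources: Iterable[Iterable[Record]], customer_id: str
) -> list[Record]:
    """Federation style: read the sources at query time, filtering with the same rules."""
    return [
        r
        for source in sources
        for r in source
        if r.get("customer_id") == customer_id and not violations(r)
    ]
```

Because all three entry points call the same violations function, a change to a rule
takes effect in the batch, event, and query paths at once.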
The data integration styles discussed above and the data characteristics go hand
in hand in any enterprise data management scenario. For example, supplying trusted
information to a data warehouse will require bulk data integration; but for specific
reporting needs it may also need real-time integration, and potentially even federation
to access other data sources. Building and managing a single view with MDM will again
require bulk integration to populate MDM, real-time integration both to and from the
MDM system, and federation to augment MDM's business services to blend data stored
within MDM and data stored in other source systems.
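The MDM example above can be made concrete with a minimal sketch. It assumes a
hypothetical in-memory MasterStore standing in for the MDM repository and a plain list
standing in for an order system; a real MDM product exposes its own services, and none
of the names below are taken from one.

```python
class MasterStore:
    """Hypothetical in-memory stand-in for an MDM repository."""

    def __init__(self):
        self._entities = {}

    def bulk_load(self, records):
        """Bulk integration: populate the store from a batch of master records."""
        for rec in records:
            self._entities[rec["id"]] = dict(rec)

    def apply_update(self, entity_id, changes):
        """Real-time integration: merge a single change as it arrives."""
        self._entities.setdefault(entity_id, {"id": entity_id}).update(changes)

    def get(self, entity_id):
        """Return a copy of the master record so callers cannot mutate the store."""
        return dict(self._entities[entity_id])


def single_view(store, order_system, entity_id):
    """Federation: blend the master record with data that stays in a source system."""
    view = store.get(entity_id)
    view["orders"] = [o for o in order_system if o["customer_id"] == entity_id]
    return view


store = MasterStore()
store.bulk_load([{"id": "c1", "name": "Ada"}])
store.apply_update("c1", {"email": "ada@example.com"})
orders = [{"customer_id": "c1", "total": 10}, {"customer_id": "c2", "total": 5}]
view = single_view(store, orders, "c1")
```

The order data is never copied into the store: it is looked up when the view is built,
which is the role federation plays in the scenario described above.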
While master data management approaches and implementation best practices
have been around for some time, the implications of MDM for big data platforms are
relatively new. Big data is characterized by massive volumes, high frequency, a variety of
less structured data sources such as e-mail, sensors, smart meters, social networks, and
weblogs, and the need to analyze vast amounts of data to extract value that improves
management decisions.
Is MDM ready for Big Data Platforms?
A pertinent question always comes up: is MDM ready for big data? This question needs to
be understood in the context of storage as well. In traditional MDM implementations,
you will see an MDM repository storing the master entities and operating under
defined MDM governance processes. In this approach, MDM is meant to be an operational,
structured repository of key enterprise data entities:
customers, households, products, locations, and many others.
However, in a big data scenario, MDM isn't meant to be a big data repository, as it
will never be able to store all social media data, transactional data, behavior data, etc.
It is already evident that more and more big data use cases will
require MDM to integrate with a variety of data sources that are not clearly defined.
Applying master data management to big data may be less about MDM and more
about a paradigm shift in how we think about and use MDM. Although there are different
ways to approach MDM, it's often seen as a repository for master data. All the data is
dumped into MDM for sorting, cleansing, and achieving that mythical version of the truth.
 