Lifecycle management is also essential to ensure legal compliance with data retention and protection regulations, and to make adherence to retention policies auditable. In the Big Data world, engines like Hadoop offer a low-cost alternative for hosting online archives of colder data. While transferability between platforms is improving, the business process of lifecycle management and archiving remains a separate challenge (in other words, Hadoop may be a low-cost platform, but many more capabilities are required to truly manage data growth).
That's where Optim comes in. It manages the lifecycle and the archiving process: discovering and profiling Big Data and tracking lifecycle milestones (when to archive), automatically archiving Big Data from data warehouses and transactional databases, providing visibility and the ability to retrieve and restore data if required, ensuring the immutability of archived data to prevent data errors, and complying with legal requirements for data retention and auditability. Optim can store archived data in a highly compressed relational database or as an archive file on a file system, and that file may in turn be loaded onto a Hadoop file system. That latter integration point between Optim and Hadoop is a key one: placing an archive file onto a Hadoop system provides low-cost storage while still allowing the data to be analyzed for different purposes, thus deriving insight from the archived files. It's no wonder some of us refer to Hadoop as the new tape!
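
As a concrete illustration of that integration point, the short sketch below pushes an archive file from the local file system into HDFS using the standard Hadoop command-line client. This stands in for whatever tooling you would actually use; it is not Optim's own interface, and the file and directory paths are hypothetical.

    import subprocess

    # Hypothetical locations: an archive file that an archiving tool has
    # written to the local file system, and an HDFS target directory.
    ARCHIVE_FILE = "/archives/orders_2010_q1.af"
    HDFS_TARGET = "/data/archive/orders/2010_q1"

    # Create the HDFS directory (-p: no error if it already exists), then
    # upload the file (-f: overwrite). Both are standard Hadoop CLI commands.
    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", HDFS_TARGET], check=True)
    subprocess.run(["hdfs", "dfs", "-put", "-f", ARCHIVE_FILE, HDFS_TARGET],
                   check=True)

Once the file sits in HDFS it is replicated, cheap to retain, and visible to MapReduce, Hive, or other Hadoop jobs that want to analyze the archived records.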
Test data management should be a definite consideration when implementing a Big Data project, both to control test data costs and to improve overall implementation time. Optim TDM automatically generates and refreshes test data for Big Data systems such as data warehouses, and it has an optimized integration with the IBM PureData System for Analytics. Optim generates right-sized test data sets; for example, a 100 TB production system may require only 1 TB for user acceptance testing. InfoSphere Optim also ensures that sensitive data is masked for testing environments. It generates realistic data for testing purposes (for example, changing Jerome Smith at 123 Oak Street to Kevin Brown at 231 Pine Avenue), while protecting the real data from potential loss or misuse. The possibility of data loss becomes even more real in the new era of Big Data: more systems and more data lead to more test environments
and more potential for data loss. Optim TDM also offers tremendous cost savings.
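
To make those two ideas, masking and right-sizing, concrete, here is a minimal sketch in Python. It is an illustration only, not Optim's implementation; the column names, the replacement pools, and the sampling fraction are all hypothetical.

    import hashlib
    import random

    # Hypothetical replacement pools for generating realistic fake values.
    FAKE_NAMES = ["Kevin Brown", "Maria Lee", "Arjun Patel", "Sofia Garcia"]
    FAKE_STREETS = ["231 Pine Avenue", "14 Elm Court", "77 Birch Lane"]

    def mask_row(row):
        # Seed a generator from a hash of the original name so the same
        # production identity always maps to the same fake identity.
        seed = int(hashlib.sha256(row["name"].encode()).hexdigest(), 16)
        rng = random.Random(seed)
        return {**row,
                "name": rng.choice(FAKE_NAMES),
                "street": rng.choice(FAKE_STREETS)}

    def right_size(rows, fraction=0.01):
        # Keep roughly 1% of the rows (100 TB -> about 1 TB), masking each.
        rng = random.Random(42)  # fixed seed: repeatable test data sets
        return [mask_row(r) for r in rows if rng.random() < fraction]

    production = [{"name": "Jerome Smith", "street": "123 Oak Street",
                   "balance": 100.0}]
    print(right_size(production, fraction=1.0))  # keep the single demo row

Because the masking is driven by a hash of the original value, the same production record always masks to the same fake identity, which keeps refreshed test data sets consistent from one run to the next.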