CHAPTER 12
Information Management and Life Cycle for Big Data
The only things that evolve by themselves in an organization are disorder,
friction, and malperformance.
—Peter Drucker
INTRODUCTION
Managing information complexity as data volumes explode is one of the biggest challenges that
information management has faced since its beginnings. In traditional systems the problem has always
been the limitations of the RDBMS and the SAN, which have remained bottlenecks even with the
commoditization of hardware and storage. Part of the problem lies in effective information
management, in terms of metadata and master data management, and part of it lies in the many
different technologies that have been deployed in the industry to facilitate data processing, each of
which has its own formats. Fast forward and add Big Data to this existing scenario, and the problem
compounds significantly. How do we deal with this issue, especially considering that Hadoop is being
considered as low-cost storage that can become the enterprise data repository? This chapter deals
with how to apply information life-cycle management principles to Big Data and create a sustainable
process that ensures that business continuity is not interrupted and that data is available on demand.
Information life-cycle management
Information life-cycle management is the practice of managing the life cycle of data across an
enterprise, from its creation or acquisition to archival. The concept has existed as “records
management” since the early days of computing, but managing records meant archival and deletion,
with extremely limited capability to reuse the same data when it was needed later on. Today, with
the advancement in technology and commoditization of infrastructure, managing data is no longer
confined to periods of time and is instead approached as a data management exercise.
Why manage data? The answer lies in the fact that data is a corporate asset and needs to be treated
as such. To manage this asset, you need to understand the needs of the enterprise with regard to
data life cycle, data security, compliance requirements, regulatory requirements,
auditability and traceability, storage and management, metadata and master data requirements, and