design and evolution of an EDW requires a well-conceived data management strategy
to bring together relevant data sources from various parts of the enterprise. The
multidimensional data resides in a comprehensive, analysis-oriented data model. Reporting
strategies are then developed to leverage the data, and a data governance strategy is
implemented to manage and maintain the EDW as a valuable enterprise data asset.
While this EDW approach remains a standard practice, several factors, such as
cost, scalability, performance, and the ability to handle any type of data, are beginning
to show up as serious shortcomings in the traditional solutions. To a certain extent,
the cost and scalability concerns can be addressed by effective use of commodity
hardware and storage solutions, but what about the other data types?
So, what additional considerations in the IT stack should be put in place
to take into account the big data types?
Let's first discuss a few fundamental concepts. Across the industry there is a growing
opinion that it's not just the volume of data that makes it difficult to manage. Rather,
the difficulty arises when a collection of different types of data, put together, cannot
be processed using conventional methods. That is when it becomes a big data situation.
What are those “conventional methods”? And why did they suddenly become inadequate? To
answer that, it is helpful to understand the type of problems we are trying to solve when
we take big data into account.
From the mainframe era through the client-server era, a major expectation of
enterprise IT was to ensure that transactional systems (e.g., online transaction
processing) ran efficiently, quickly, and consistently. These expectations pushed
many technologies and architectures toward applications built on proprietary
relational databases running on proprietary monolithic servers with proprietary,
monolithic storage infrastructures.
Table 2-1. Various Representations of IT Stacks

Scope
    Traditional IT: Mostly online transactional processing systems
    Web-Scale Applications: E-commerce, web sites, search engines
    Big Data Analytics Initiatives: Web search, deep analytics applications

Data Characteristics
    Traditional IT: Relatively small amount of highly structured data of high
        quality; small number of users
    Web-Scale Applications: Combination of structured and unstructured data;
        millions of users
    Big Data Analytics Initiatives: Massive amounts of structured and unstructured
        data which is to be analyzed to derive insights, spot trends, etc.
        Accuracy and insight rather than precision is the key.

(continued)
 
 