The notion that it is the sheer size of big data that makes solutions complex is not
true. Even 50 GB of data can be called big data if its structure is too complex for a
normal RDBMS to handle. In that context, what would we call small data? Small data are
simple, homogeneous data structures, e.g., structured data, strings, dates, times, and all
the data we used to feed into traditional data warehouses. Theoretically speaking, a
large collection of these small data can eventually become big data.
In any enterprise data management scenario, we will see a combination of small
data and big data. Two application architecture approaches are widely followed to
implement BDW solutions; both are depicted in Figure 5-3.
Figure 5-3. Architecture patterns involving Hadoop and RDBMS
The first approach (approach A) is to have Hadoop as a data ingestion and data
processing platform before the data flow reaches the RDBMS.
The second approach (approach B) is to have Hadoop as a data management
platform in parallel to the RDBMS.
In application architecture approach A, Hadoop is used primarily as a data ingestion
mechanism and a staging area. Unlike a normal file system or relational staging area,
where we can keep only a limited amount of data, a Hadoop staging layer lets us keep
all the historical data. Apart from retaining history, the main advantages of using
Hadoop for the staging area are the flexibility to ingest any type of data and the
ability to address scale issues. From the Hadoop staging area, we can use specialized
data integration tools to move data into the RDBMS.
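The flow in approach A can be sketched in a few lines of Python. This is only an illustration under loud assumptions: a temporary local directory stands in for the Hadoop staging area, an in-memory SQLite database stands in for the RDBMS, and the names stage_raw, load_orders, run_demo, and the orders table are hypothetical, not taken from any tool discussed here. The point it shows is that every raw file stays in staging, while only the structured records are moved on.

```python
import json
import sqlite3
import tempfile
from pathlib import Path
from typing import Tuple


def stage_raw(staging_dir: Path, name: str, payload: str) -> Path:
    """Land a raw payload in the staging area without transforming it."""
    path = staging_dir / name
    path.write_text(payload, encoding="utf-8")
    return path


def load_orders(staging_dir: Path, conn: sqlite3.Connection) -> int:
    """Copy structured order records from staging into the relational store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL)"
    )
    loaded = 0
    # Only files matching the (hypothetical) orders_*.json pattern are loaded;
    # everything else remains in staging untouched.
    for path in sorted(staging_dir.glob("orders_*.json")):
        for line in path.read_text(encoding="utf-8").splitlines():
            record = json.loads(line)
            conn.execute(
                "INSERT OR REPLACE INTO orders (order_id, amount) VALUES (?, ?)",
                (record["order_id"], record["amount"]),
            )
            loaded += 1
    conn.commit()
    return loaded


def run_demo() -> Tuple[int, int]:
    """Stage three files, load the structured ones, report both counts."""
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp)  # stand-in for the Hadoop staging area
        stage_raw(
            staging,
            "orders_2020.json",
            '{"order_id": 1, "amount": 10.5}\n{"order_id": 2, "amount": 4.0}',
        )
        stage_raw(staging, "orders_2021.json", '{"order_id": 3, "amount": 7.25}')
        stage_raw(staging, "clickstream.log", "free-form text, never loaded")
        conn = sqlite3.connect(":memory:")  # stand-in for the RDBMS
        loaded = load_orders(staging, conn)
        staged_files = len(list(staging.iterdir()))
        conn.close()
        return loaded, staged_files
```

Calling run_demo() loads the order rows into the relational table while all staged files, including the historical and unstructured ones, remain in the staging directory.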
In application architecture approach B, Hadoop is used primarily to store and
process data showing big data characteristics, whereas the RDBMS is used to store and
process “small data.” However, both data stores are used in conjunction to finally
make the information available to consumers.
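Approach B can be illustrated with a similar toy sketch. Everything here is an assumption made for illustration: the rule in is_small_data (a flat record of scalar values counts as small data), the HybridStore class, the customers table, and the use of a plain Python list in place of Hadoop. It shows records being routed to one of two stores by their structure, and a consumer-facing view that draws on both.

```python
import sqlite3
from typing import Any, Dict, List, Optional

# Scalar types treated as "small data" values in this sketch (an assumption).
SCALARS = (str, int, float, bool, type(None))


def is_small_data(record: Dict[str, Any]) -> bool:
    """Treat a flat record whose values are all scalars as small data."""
    return all(isinstance(value, SCALARS) for value in record.values())


class HybridStore:
    """Two stores side by side: SQLite for small data, a list for the rest."""

    def __init__(self) -> None:
        self.rdbms = sqlite3.connect(":memory:")  # stand-in for the RDBMS
        self.rdbms.execute(
            "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
        )
        self.big_store: List[Dict[str, Any]] = []  # stand-in for Hadoop

    def ingest(self, record: Dict[str, Any]) -> str:
        """Route a record by structure and report which store received it."""
        if is_small_data(record):
            # Assumes small records carry exactly customer_id and name.
            self.rdbms.execute(
                "INSERT INTO customers (customer_id, name) VALUES (?, ?)",
                (record["customer_id"], record["name"]),
            )
            return "rdbms"
        self.big_store.append(record)
        return "hadoop"

    def customer_view(self, customer_id: int) -> Dict[str, Optional[Any]]:
        """Combine both stores into one answer for the consumer."""
        row = self.rdbms.execute(
            "SELECT name FROM customers WHERE customer_id = ?", (customer_id,)
        ).fetchone()
        events = [
            r for r in self.big_store if r.get("customer_id") == customer_id
        ]
        return {"name": row[0] if row else None, "event_count": len(events)}
```

For example, ingesting {"customer_id": 1, "name": "Ada"} lands in the relational store, a nested clickstream record for the same customer lands in the list, and customer_view(1) reads from both to build a single result.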
 