Pros:
Scalable design for RDBMS and Big Data processing.
Modular data integration architecture.
Heterogeneous physical architecture deployment, providing best-in-class integration at the
data processing layer.
Metadata and MDM solutions can be leveraged with relative ease across the solution.
Cons:
Performance of the Big Data connector is the biggest area of weakness.
Data integration and query scalability can become complex.
A typical use case for this type of integration architecture can be seen in organizations where the
data needs to be integrated into analytics and reporting. Examples include social media data, textual
data, and semi-structured data such as emails.
Data loading is isolated across the layers. This provides a foundation to create a robust data
management strategy.
Data availability is controlled at each layer, and security rules can be implemented for each layer as
required, avoiding any associated overhead for other layers.
Data volumes can be managed across the individual layers of data based on the data type, the
life-cycle requirements for the data, and the cost of the storage.
Storage performance: Hadoop is designed to be deployed on commodity hardware, so its storage costs
are very low compared to a traditional RDBMS platform. The performance of the disks for each layer
can be configured as needed by the end user.
Operational costs: in this architecture, the operational cost calculation has fixed and variable
components. The variable costs relate to processing and computing infrastructure and to labor. The
fixed costs relate to RDBMS maintenance and its associated costs.
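The fixed and variable split described above can be sketched as a small cost model. This is only an illustration under stated assumptions: the class name, the rates, and the monthly figures are hypothetical and do not come from the text.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCosts:
    """Hypothetical monthly cost model for the layered architecture."""

    # Fixed component: RDBMS maintenance and its associated costs.
    rdbms_maintenance: float
    # Variable components: these scale with the workload.
    compute_rate_per_node_hour: float
    labor_rate_per_hour: float

    def total(self, node_hours: float, labor_hours: float) -> float:
        """Return fixed cost plus the variable cost for the given usage."""
        variable = (
            self.compute_rate_per_node_hour * node_hours
            + self.labor_rate_per_hour * labor_hours
        )
        return self.rdbms_maintenance + variable


# All figures below are made up for illustration.
costs = MonthlyCosts(
    rdbms_maintenance=20_000.0,
    compute_rate_per_node_hour=0.5,
    labor_rate_per_hour=80.0,
)
light = costs.total(node_hours=2_000, labor_hours=100)
heavy = costs.total(node_hours=10_000, labor_hours=160)
print(f"light month: {light:.2f}, heavy month: {heavy:.2f}")
```

Only the variable part changes between the two months, which is the property that makes the processing layer's cost easier to tune than the RDBMS layer's.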
Pitfalls to avoid:
Too much data complexity at any one layer of processing.
Executing large data exchanges between the different layers.
Incorrect levels of integration (at data granularity).
Applying too many transformation complexities using the connectors.
Big Data appliances
Data warehouse appliances emerged as a strong black-box architecture for processing workloads
specific to large-scale data in the last decade. One of the extensions of this architecture is the
emergence of Big Data appliances. These appliances are configured to handle the rigors of workloads
and complexities of Big Data and the current RDBMS architecture.
Figure 10.8 shows the conceptual architecture of the Big Data appliance, which includes a layer
of Hadoop and a layer of RDBMS. While the physical architectural implementation can differ among
vendors like Teradata, Oracle, IBM, and Microsoft, the underlying conceptual architecture remains
the same, where Hadoop and/or NoSQL technologies will be used to acquire, preprocess, and store
Big Data, and the RDBMS layers will be used to process the output from the Hadoop and NoSQL
layers. In-database MapReduce, R, and RDBMS-specific translators and connectors will be used in
the integrated architecture for managing data movement and transformation within the appliance.
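The division of labor between the two layers can be sketched in a few lines. This is a toy illustration, not a vendor implementation: plain Python functions stand in for a MapReduce job in the Hadoop layer, an in-memory SQLite database stands in for the RDBMS layer, and the sample records, table name, and function names are all hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical raw records, standing in for data acquired by the Hadoop layer.
RAW_POSTS = [
    {"user": "a", "text": "great product great price"},
    {"user": "b", "text": "slow delivery"},
    {"user": "a", "text": "great support"},
]


def map_phase(record):
    """Emit a (word, 1) pair for every word in one record."""
    for word in record["text"].lower().split():
        yield word, 1


def reduce_phase(pairs):
    """Sum the emitted counts per word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts


def preprocess(records):
    """Stand-in for the Hadoop layer: reduce raw records to a compact summary."""
    pairs = (pair for record in records for pair in map_phase(record))
    return reduce_phase(pairs)


def load_into_rdbms(conn, counts):
    """Stand-in for a connector: move only the summarized output into the RDBMS."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_counts (word TEXT PRIMARY KEY, n INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO word_counts VALUES (?, ?)", counts.items()
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_into_rdbms(conn, preprocess(RAW_POSTS))

# The RDBMS layer processes the preprocessed output with ordinary SQL.
top = conn.execute(
    "SELECT word, n FROM word_counts ORDER BY n DESC, word LIMIT 1"
).fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM word_counts").fetchone()[0]
print(top, row_count)
```

Only the aggregated result crosses the boundary between the layers, which keeps the data exchange small and the transformation logic out of the connector.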
 