Data Modeling Approaches for Big Data and Analytics Solutions - Big Data Imperatives


3. Planning for Concurrent Writes: In Cassandra, every row within a column family is identified by a unique row key (generally a string; key size is capped at 64 KB). Unlike a traditional RDBMS primary key, which enforces uniqueness, Cassandra does not reject a duplicate row key: an insert that uses an existing key writes into the existing row and can overwrite its columns. Care must therefore be taken to create rows with unique row keys, especially when several clients write concurrently. Some of the ways of creating unique row keys are as follows:

• Surrogate/UUID row keys

• Natural row keys
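The two key styles listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the ":" separator, and the lower-casing rule are hypothetical choices, not something the text prescribes, and no Cassandra client is involved.

```python
import uuid


def surrogate_row_key() -> str:
    """Return a random (version 4) UUID as a surrogate row key.

    The key carries no business meaning, so independent writers can
    generate keys without coordinating with each other.
    """
    return str(uuid.uuid4())


def natural_row_key(*parts: str, sep: str = ":") -> str:
    """Build a natural row key from business attributes.

    Assumption: the combination of parts is unique in the source data.
    Parts are trimmed and lower-cased so that the same entity always
    maps to the same key.
    """
    cleaned = [part.strip().lower() for part in parts]
    if any(sep in part for part in cleaned):
        # A separator inside a part could make two different entities
        # collapse into the same key.
        raise ValueError("key part contains the separator")
    return sep.join(cleaned)
```

A surrogate key avoids accidental overwrites but has to be stored somewhere so the row can be found again; a natural key can be recomputed from the data, at the cost of relying on the business attributes being truly unique.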

Schema Migration Approach (Using ETL)

There are various ways of migrating data from relational data structures to Cassandra structures, but when complex transformations and business rules are involved, it is advisable to use a data processing layer built from ETL utilities (Figure 6-23).

Figure 6-23. Schema migration using ETL tools

Using built-in data loaders, the processed data can be extracted to flat files (in JSON format) and then uploaded to the Cassandra data structures. Custom loaders can be built when additional processing rules apply; these can read either from the processed data store or from the JSON files.
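As a rough sketch of what a custom loader for the JSON files might look like, the Python below reads records, applies one example rule, and hands each row to a writer. Everything named here is an assumption made for illustration: the one-JSON-object-per-line file layout, the rule of dropping null values, and the write_row callback, which stands in for whatever Cassandra client call a real loader would use.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Tuple

Row = Tuple[str, Dict[str, str]]


def read_json_records(path: str) -> Iterator[dict]:
    """Yield records from a flat file with one JSON object per line (assumed layout)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def to_row(record: dict, key_field: str) -> Row:
    """Turn one record into (row_key, columns).

    Example rule (hypothetical): the key field becomes the row key,
    null values are dropped, and every other field becomes a column.
    """
    row_key = str(record[key_field])
    columns = {
        name: str(value)
        for name, value in record.items()
        if name != key_field and value is not None
    }
    return row_key, columns


def load(
    records: Iterable[dict],
    key_field: str,
    write_row: Callable[[str, Dict[str, str]], None],
) -> int:
    """Apply to_row to every record, pass it to write_row, and return the count."""
    count = 0
    for record in records:
        row_key, columns = to_row(record, key_field)
        write_row(row_key, columns)  # placeholder for the real Cassandra write
        count += 1
    return count
```

For a dry run, a plain dictionary can act as the target, for example load(read_json_records("customers.json"), "id", store.__setitem__) with store = {} and a hypothetical file name. Because the writer is passed in, the same loop can read from the processed store instead of the JSON files by supplying a different iterable of records.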
