Data Modeling Approaches for Big Data and Analytics Solutions - Big Data Imperatives


The following concepts are critical to understanding how column databases work:

• Column family
• Super columns
• Column

In a relational database you must define a schema for each table; in a column family, however, the only things you define are the name and the key sort options (there is no schema).

• Column families. A column family is how the data is stored on disk. All the data in a single column family sits in the same file (actually a set of files, but that is close enough). A column family can contain super columns or columns.
• Super columns. A super column is a dictionary; it is a column that contains other columns (but not other super columns).
• Columns. A column is a tuple of name, value, and timestamp.
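The three concepts above can be sketched as plain Python data structures. This is only an illustrative in-memory model, not the API of any particular column database: the class names `Column`, `SuperColumn`, and `ColumnFamily`, the `put` and `insert` methods, and the sample keys and values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass(frozen=True)
class Column:
    """A column is a tuple of name, value, and timestamp."""
    name: str
    value: str
    timestamp: int


@dataclass
class SuperColumn:
    """A dictionary of columns; it may not contain other super columns."""
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)

    def put(self, column: Column) -> None:
        # Enforce the rule that super columns hold only plain columns.
        if not isinstance(column, Column):
            raise TypeError("a super column can only contain columns")
        self.columns[column.name] = column


@dataclass
class ColumnFamily:
    """Only a name is declared up front; rows carry no fixed schema."""
    name: str
    rows: Dict[str, Dict[str, Union[Column, SuperColumn]]] = field(
        default_factory=dict
    )

    def insert(self, row_key: str, entry: Union[Column, SuperColumn]) -> None:
        # Each row is addressed by key and maps entry names to entries.
        self.rows.setdefault(row_key, {})[entry.name] = entry


# Hypothetical sample data: one row with a plain column and a super column.
person = ColumnFamily("person")
person.insert("person:0001", Column("gender", "F", timestamp=1))

address = SuperColumn("address")
address.put(Column("city", "Springfield", timestamp=1))
address.put(Column("zip", "00000", timestamp=1))
person.insert("person:0001", address)
```

Note that a second row could carry entirely different column names without any change to the column family definition, which is what "there is no schema" means in practice.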

It is important to understand that schema design in a column family database (CFDB) is of great importance; if you don't build your schema right, you literally can't get the data out. A CFDB usually offers one of two forms of queries: by key or by key range. A CFDB is meant to be distributed, and the key determines where the actual physical data is located. Data is stored based on the sort order of the column family, and you have no real way of changing the sorting (except choosing between ascending and descending). The sort order, unlike in a relational database, isn't affected by the column values but by the column names.
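To make the two query forms concrete, here is a minimal single-process sketch, assuming row keys are kept in sorted order and columns within a row are returned sorted by column name. The class `SortedColumnFamily`, its methods, and the `person:NNNN` key format are hypothetical; a real CFDB distributes rows across nodes and its ordering rules depend on the product and its configuration.

```python
import bisect
from typing import Dict, List, Tuple


class SortedColumnFamily:
    """Toy store supporting only lookup by key and by key range."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._keys: List[str] = []                 # row keys, kept sorted
        self._rows: Dict[str, Dict[str, str]] = {}

    def insert(self, row_key: str, column_name: str, value: str) -> None:
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)     # keep key order on insert
            self._rows[row_key] = {}
        self._rows[row_key][column_name] = value

    def get(self, row_key: str) -> List[Tuple[str, str]]:
        """Query by key. Columns come back ordered by name, not by value."""
        return sorted(self._rows.get(row_key, {}).items())

    def get_range(
        self, start_key: str, end_key: str
    ) -> List[Tuple[str, List[Tuple[str, str]]]]:
        """Query by key range (both ends inclusive)."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_right(self._keys, end_key)
        return [(key, self.get(key)) for key in self._keys[lo:hi]]


# Hypothetical sample data, inserted out of order on purpose.
people = SortedColumnFamily("person")
people.insert("person:0003", "gender", "F")
people.insert("person:0001", "gender", "M")
people.insert("person:0001", "date_of_birth", "1970-01-01")
people.insert("person:0002", "address", "12 Main St")

by_key = people.get("person:0001")
by_range = people.get_range("person:0001", "person:0002")
```

There is deliberately no method for "find everyone whose gender is F": answering that would require scanning every row. This is why the keys and column names have to be designed around the queries you intend to run.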

In order to clarify the concepts of column families and the type of problems they help solve, let's look at an example.

Imagine you have a database that contains census data. The person table (Figure 6-10) has one row for each person who participated in the census and would probably be keyed by a unique key. All singleton attributes such as date of birth, gender, address, and so forth would exist in this table. Some repeating attributes, like work history, would be normalized out into related tables. Depending upon the size of the sample, a census may take in hundreds of millions of people, and the table would look something like Figure 6-10.
