Database Reference
In-Depth Information
(remotely similar to column families, but don't get stuck on the idea that column families equal
tables—they don't), and then you define the names of the columns that will be in each table.
There are a few good reasons not to go too far with the idea that a column family is like a rela-
tional table. First, Cassandra is considered schema-free because although the column families are
defined, the columns are not. You can freely add any column to any column family at any time,
depending on your needs. Second, a column family has two attributes: a name and a comparator.
The comparator value indicates how columns will be sorted when they are returned to you in a
query—according to long, byte, UTF8, or other ordering.
In a relational database, it is frequently transparent to the user how tables are stored on disk,
and it is rare to hear of recommendations about data modeling based on how the RDBMS might
store tables on disk. That's another reason to keep in mind that a column family is nota table.
Because column families are each stored in separate files on disk, it's important to keep related
columns defined together in the same column family.
Another way that column families differ from relational tables is that relational tables define only
columns, and the user supplies the values, which are the rows. But in Cassandra, a table can hold
columns, or it can be defined as a super column family. The benefit of using a super column fam-
ily is to allow for nesting.
For standard column families, which is the default, you set the type to Standard; for a super
column family, you set the type to Super.
When you write data to a column family in Cassandra, you specify values for one or more
columns. That collection of values together with a unique identifier is called a row. That row has
a unique key, called the rowkey, which acts like the primary key unique identifier for that row.
So while it's not incorrect to call it column-oriented, or columnar, it might be easier to under-
stand the model if you think of rows as containers for columns. This is also why some people
refer to Cassandra column families as similar to a four-dimensional hash:
[Keyspace][ColumnFamily][Key][Column]
We can use a JSON-like notation to represent a Hotel column family, as shown here:
Hotel {
key: AZC_043 { name: Cambria Suites Hayden, phone: 480-444-4444,
address: 400 N. Hayden Rd., city: Scottsdale, state: AZ, zip: 85255}
key: AZS_011 { name: Clarion Scottsdale Peak, phone: 480-333-3333,
address: 3000 N. Scottsdale Rd, city: Scottsdale, state: AZ, zip: 85255}
key: CAS_021 { name: W Hotel, phone: 415-222-2222,
address: 181 3rd Street, city: San Francisco, state: CA, zip: 94103}
key: NYN_042 { name: Waldorf Hotel, phone: 212-555-5555,
address: 301 Park Ave, city: New York, state: NY, zip: 10019}
}
Search WWH ::




Custom Search