Database Reference
discussed in Chapter 7 and Chapter 10A and have run an example query against the database.
Everything we see here is exactly the same as if the database were located on our own desktop
computer or local database server. This shows how easy it is to set up computing resources
hosted “in the cloud,” and there is no doubt that we will see more and more use of cloud
computing.
Big Data and the Not Only SQL Movement
We have used the relational database model and SQL throughout this topic. However, there is
another school of thought that has led to what was originally known as the NoSQL movement,
but is now usually referred to as the Not only SQL movement. 10 It has been noted that most, but
not all, DBMSs associated with the NoSQL movement are nonrelational DBMSs and are often
known as structured storage. 11
A NoSQL DBMS is typically a distributed, replicated database, as described earlier in this
chapter, and is used where this type of DBMS is needed to support large datasets. For example,
both Facebook and Twitter use the Apache Software Foundation's Cassandra database
(available at http://cassandra.apache.org).
Another type of NoSQL database implementation is based on the use of XML document
structures for data storage. An example is the open source dbXML (available at
www.dbxml.com). XML databases typically support the W3C XQuery (www.w3.org/TR/xquery/)
and XPath (www.w3.org/TR/xpath/) standards.
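To make the idea of path-based querying concrete, the following minimal sketch uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath and no XQuery. It is not dbXML, and the element names and the second customer's values are hypothetical placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical customer document; element names and values are illustrative only.
XML_DOC = """
<customers>
  <customer id="1">
    <firstName>Ralph</firstName>
    <lastName>Able</lastName>
  </customer>
  <customer id="2">
    <lastName>Baker</lastName>
    <email>baker@example.com</email>
  </customer>
</customers>
"""

root = ET.fromstring(XML_DOC)

# Path expression: the lastName element of every customer.
last_names = [e.text for e in root.findall("./customer/lastName")]

# Predicate: only customers that have an email child element.
with_email = [c.get("id") for c in root.findall("./customer[email]")]

print(last_names)   # ['Able', 'Baker']
print(with_email)   # ['2']
```

Note that the two customer elements do not carry the same child elements, which an XML document store tolerates without any schema change.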
Structured Storage
The basis for much of this development was two structured storage mechanisms developed by
Amazon.com ( Dynamo ) and Google ( Bigtable ). Facebook did the original development work
on Cassandra and then turned it over to the open source development community in 2008. As
noted earlier, Cassandra is now an Apache Software Foundation project.
A generalized structured storage system is shown in Figure 12-33. The structured
storage equivalent of a relational DBMS (RDBMS) table has a very different construction.
Although similar terms are used, they do not mean the same thing that they mean in a rela-
tional DBMS.
The smallest unit of storage is called a column , but is really the equivalent of an RDBMS
table cell (the intersection of an RDBMS row and column). A column consists of three ele-
ments: the column name , the column value or datum, and a timestamp to record when the
value was stored in the column. This is shown in Figure 12-33(a) by the LastName column,
which stores the LastName value Able.
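The three elements of a column can be sketched as a small Python data class. This is an illustrative model only, not the storage format of Cassandra or any other product; the class name Column and the choice of a UTC datetime for the timestamp are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class Column:
    """One structured storage column: roughly an RDBMS table cell."""
    name: str      # column name, for example "LastName"
    value: Any     # column value (datum)
    # Time at which the value was stored (assumed here to be UTC).
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# The LastName column holding the value Able, as in Figure 12-33(a).
last_name = Column("LastName", "Able")
print(last_name.name, last_name.value)
```

Because each column carries its own name and timestamp, a row can be treated as a collection of self-describing cells rather than as positions in a fixed schema.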
Columns can be grouped into sets referred to as super columns . This is shown in
Figure 12-33(b) by the CustomerName super column, which consists of a FirstName column
and a LastName column and which stores the CustomerName value Ralph Able.
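A super column can be sketched as a named group of columns. The representation below (a NamedTuple for each column and a dictionary for the group) is an assumption made for illustration and does not reflect any particular product's API.

```python
import time
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: float  # seconds since the epoch when the value was stored


def make_column(name: str, value: str) -> Column:
    """Hypothetical helper: build a column stamped with the current time."""
    return Column(name, value, time.time())


# The CustomerName super column groups the FirstName and LastName columns.
customer_name = {
    "name": "CustomerName",
    "columns": {
        c.name: c
        for c in (make_column("FirstName", "Ralph"),
                  make_column("LastName", "Able"))
    },
}

# Reassemble the CustomerName value from its member columns.
full_name = " ".join(
    customer_name["columns"][key].value for key in ("FirstName", "LastName")
)
print(full_name)  # Ralph Able
```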
Columns and super columns are grouped to create column families , which are the struc-
tured storage equivalent of RDBMS tables. In a column family, we have rows of grouped
columns, and each row has a RowKey , which is similar to the primary key used in an RDBMS
table. However, unlike an RDBMS table, a row in a column family does not have to have
the same number of columns as another row in the same column family. This is illustrated
in Figure 12-33(c) by the Customer column family, which consists of three rows of data on
customers.
Figure 12-33(c) clearly illustrates the difference between structured storage column fami-
lies and RDBMS tables: Column families can have variable columns and data stored in each
row in a way that is impossible in an RDBMS table. This storage column structure is definitely
not in 1NF as defined in Chapter 2, let alone BCNF! For example, note that the first row has no
Phone or City columns, while the third row not only has no FirstName, Phone, or City columns,
but also contains an EmailAddress column that does not exist in the other rows.
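The variable-column behavior can be sketched with nested Python dictionaries. The row keys and all data values below are hypothetical placeholders (the actual contents of Figure 12-33(c) are not reproduced here), and super columns are flattened into plain columns for brevity; only the pattern of present and absent columns follows the description above.

```python
import time

stored_at = time.time()


def col(value):
    """Hypothetical helper: a column body as (value, timestamp)."""
    return (value, stored_at)


# Column family: RowKey -> {column name -> (value, timestamp)}.
customer = {
    "row1": {"FirstName": col("Ralph"), "LastName": col("Able")},
    "row2": {"FirstName": col("Nancy"), "LastName": col("Baker"),
             "Phone": col("555-0100"), "City": col("Seattle")},
    "row3": {"LastName": col("Jones"),
             "EmailAddress": col("jones@example.com")},
}

# Every column name that appears anywhere in the column family.
all_columns = sorted(set().union(*(row.keys() for row in customer.values())))

# For each row, the columns it simply does not have (not NULLs, just absent).
missing = {
    key: [name for name in all_columns if name not in row]
    for key, row in customer.items()
}

print(missing["row1"])  # ['City', 'EmailAddress', 'Phone']
print(missing["row3"])  # ['City', 'FirstName', 'Phone']
```

In an RDBMS table, the same data would require every row to carry all five columns, with NULL values where data is absent.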
10 For a good overview, see the Wikipedia article on NoSQL, available at http://en.wikipedia.org/wiki/NoSQL.
11 See the Wikipedia article on structured storage at http://en.wikipedia.org/wiki/Structured_storage.
 
 
 