NOSQL VERSUS RDBMS: WHAT'S THE DIFFERENCE, WHAT'S THE POINT?
NoSQL databases and relational databases share the same basic goals: to store and retrieve data and to coordinate changes. The difference is that NoSQL databases trade away some of the capabilities of relational databases in order to improve scalability. In particular, NoSQL databases typically have much simpler coordination capabilities than the transactions that traditional relational systems provide, or even none at all. NoSQL databases also usually eliminate all or most of the SQL query language and, importantly, the complex optimizer required for SQL to be useful.
The benefits of making this trade include greater simplicity in the NoSQL database, the ability to handle semi-structured and denormalized data, and potentially much higher scalability for the system. The drawbacks include a compensating increase in the complexity of the application and the loss of the abstraction provided by the query optimizer. Losing the optimizer means that much of the optimization of queries has to be done inside the developer's head and is frozen into the application code. Of course, losing the optimizer can also be an advantage, since it gives the developer much more predictable performance.
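To make that point concrete, here is a minimal sketch in plain Python, with an ordinary dict standing in for a generic key-value store; the store, record fields, and helper names are illustrative and not any particular product's API. In an RDBMS, "find users by city" is one declarative query and the optimizer chooses the access path; here the developer must design a secondary index, keep it in sync, and hard-code the plan:

```python
# Illustrative only: an ordinary dict stands in for a key-value store.
users = {}            # primary data: user_id -> record
users_by_city = {}    # hand-maintained secondary "index": city -> set of user_ids

def put_user(user_id, record):
    """Write a record and keep the secondary index in sync by hand."""
    old = users.get(user_id)
    if old is not None:
        users_by_city.get(old["city"], set()).discard(user_id)
    users[user_id] = record
    users_by_city.setdefault(record["city"], set()).add(user_id)

def users_in_city(city):
    """The 'query plan' is frozen here: consult the secondary index,
    then fetch each matching record by primary key."""
    return [users[uid] for uid in users_by_city.get(city, set())]

put_user("u1", {"name": "Ana", "city": "Lisbon"})
put_user("u2", {"name": "Bo", "city": "Oslo"})
put_user("u3", {"name": "Cy", "city": "Lisbon"})
print(sorted(u["name"] for u in users_in_city("Lisbon")))  # ['Ana', 'Cy']
```

The flip side mentioned above is visible here as well: the cost of `users_in_city` is always one index lookup plus one fetch per match, so its performance is predictable by construction.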
Over time, the originally hard-and-fast tradeoff of giving up transactions and SQL in return for the performance and scalability of a NoSQL database has become much more nuanced. Some NoSQL databases are adding new forms of transactions, although these provide much weaker guarantees than the transactions in an RDBMS. In addition, modern implementations of SQL such as the open source Apache Drill allow analysts and developers working with NoSQL applications to have full SQL language capability when they choose, while retaining scalability.
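As a rough illustration of what that looks like in practice, the sketch below submits an ANSI-style SQL query over semi-structured JSON to a local Apache Drill instance through its REST API. It assumes Drill is running on its default port (8047) and that the file path, alias, and field names, which are hypothetical, resolve through the dfs storage plugin:

```python
# Sketch: SQL over semi-structured JSON via Apache Drill's REST API.
# Assumes a local Drill instance on its default port; the file path and
# field names below are hypothetical.
import json
import urllib.request

sql = """
SELECT t.sensor_id, AVG(CAST(t.reading AS DOUBLE)) AS avg_reading
FROM dfs.`/data/sensors/2014.json` t
WHERE t.reading IS NOT NULL
GROUP BY t.sensor_id
"""

req = urllib.request.Request(
    "http://localhost:8047/query.json",
    data=json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

for row in result.get("rows", []):
    print(row)
```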
Until recently, the standard approach to dealing with large-scale time series data has been to decide from the start which data to sample, study a few weeks' or months' worth of the sampled data, produce the desired reports, summarize some results to be archived, and then discard most or all of the original data. Now that's changing. There is a golden opportunity to do broader and deeper analytics, exploring data that would previously have been discarded. At modern rates of data production, even a few weeks' or months' worth of data is a large enough volume to start to overwhelm traditional database methods. With the new scalable NoSQL platforms and tools for data storage and access, it's now feasible to archive years of raw or lightly processed data. These much finer-grained and longer histories are especially valuable for the modeling needed in predictive analytics, for anomaly detection, for back-testing new models, and for finding long-term trends and correlations.
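One pattern that helps make such long, fine-grained histories practical on a scalable key-value or wide-column store is to build the row key from the series name plus a coarse time bucket, so writes stay cheap and a back-test can range-scan only the buckets it needs. The following is a rough sketch in plain Python; the in-memory structures, key format, and function names are illustrative rather than any specific database's API:

```python
# Sketch: time-bucketed row keys for long, raw time series histories.
# Row key = metric name + UTC day bucket; each "row" holds that day's samples.
from collections import defaultdict
from datetime import datetime, timedelta, timezone

store = defaultdict(list)   # "metric|YYYYMMDD" -> list of (epoch_seconds, value)

def write(metric, ts, value):
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y%m%d")
    store[f"{metric}|{day}"].append((ts, value))

def scan(metric, start_ts, end_ts):
    """Back-test style range query: touch only the day buckets in range."""
    out = []
    day = datetime.fromtimestamp(start_ts, tz=timezone.utc).date()
    last = datetime.fromtimestamp(end_ts, tz=timezone.utc).date()
    while day <= last:
        key = f"{metric}|{day.strftime('%Y%m%d')}"
        out.extend((t, v) for (t, v) in store.get(key, []) if start_ts <= t <= end_ts)
        day += timedelta(days=1)
    return out

write("cpu.load", 1_400_000_000, 0.42)
write("cpu.load", 1_400_003_600, 0.57)
print(scan("cpu.load", 1_399_990_000, 1_400_010_000))
```

In an actual wide-column store the same idea would typically appear as a compound row key, where the time bucket keeps individual rows small while sorted key order makes multi-year scans largely sequential.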
As a result of these new options, the number of situations in which data is being collected as time series is also expanding, as is the need for extremely reliable and high-performance time series databases.