Consistency: Consistency is a critical design consideration. Immediate consistency
means that as soon as data has been updated, any other query will see the updated value.
Eventual consistency means that changes to data will not be uniformly visible to all
queries for some period of time. Some queries may see the earlier value while others see
the updated value.
Consistency is important to most OLTP systems because inconsistent query results
could lead to serious problems. For example, if a bank account is emptied by one
withdrawal, it shouldn't be possible to withdraw more funds. If the banking withdrawal
application were designed for eventual consistency, you can imagine the consequences:
two simultaneous withdrawals might each take the full balance out of the account, which
is not a desirable state for the bank.
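To make that failure mode concrete, here is a minimal sketch (the Replica class and withdraw function are hypothetical, not taken from any particular database): a read-check-write sequence against stale replicas lets two withdrawals each take the full balance, whereas an immediately consistent system would serialize the two operations and reject the second.

```python
# Minimal sketch (hypothetical names): two clients withdraw against different
# replicas of an eventually consistent store. Each replica still holds the
# stale balance, so both withdrawals of the full amount succeed.

class Replica:
    def __init__(self, balance):
        self.balance = balance

    def read(self):
        return self.balance

    def write(self, balance):
        self.balance = balance   # propagation to other replicas happens "later"

replica_a = Replica(100)
replica_b = Replica(100)         # not yet aware of any change made on replica_a

def withdraw(replica, amount):
    balance = replica.read()     # check funds on the local replica
    if balance >= amount:
        replica.write(balance - amount)
        return True
    return False

# Two "simultaneous" withdrawals, each routed to a different replica:
print(withdraw(replica_a, 100))  # True -- account emptied on replica_a
print(withdraw(replica_b, 100))  # True -- replica_b has not seen the update yet

# When the replicas reconcile, the account has paid out 200 against a balance
# of 100. An immediately consistent system would serialize the operations and
# the second withdrawal would fail.
```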
There are cases where immediate consistency is not critical and eventual consistency
is actually a desirable state, as it offers better performance and scalability characteristics,
particularly for large scale systems running in a distributed hardware environment like
the cloud. For example, in many consumer-facing web applications such as e-commerce
sites, the product listing does not need to reflect the actual inventory at every instant:
the transaction can go ahead, and the listing can be reconciled with product availability
later.
Updatability: Data may be changeable or it may be permanent. If an application
never updates or deletes data then it is possible to optimize the database design and
improve both performance and scalability.
Event streams, such as log data or web tracking activity, are examples of data that by
its nature is never updated. Events generate data, systems capture that data and analyze
its implications, and the data itself does not change at all. Outside
of event streams, the most common scenarios for write-once data are in BI and analytics
workloads, where data is usually loaded once and queried many times thereafter.
A number of BI and analytic databases assume that updates and deletes are rare and
use very simple mechanisms to control them. Putting a workload with a constant stream of
updates and deletes onto one of these databases will lead to query performance problems
because that workload is not part of their primary design. The same applies to some NoSQL
data stores that have been designed as append-only data stores to handle extremely high
rates of data loading. They can write large volumes of data quickly, but once written the
data can't be changed. Instead, it must be copied, modified, and written a second time.
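As a sketch of that copy-modify-rewrite pattern (the AppendOnlyStore class below is purely illustrative, not any product's API), an "update" in an append-only store means appending a new version of the record and letting readers pick the most recent one:

```python
# Minimal sketch (hypothetical): an append-only store. Nothing is modified in
# place; an "update" appends a new version of the record and a reader takes
# the most recent version for each key.

import itertools

class AppendOnlyStore:
    def __init__(self):
        self._log = []                       # ever-growing, never rewritten
        self._seq = itertools.count()

    def append(self, key, value):
        self._log.append((next(self._seq), key, value))

    def latest(self, key):
        # Scan backwards for the newest version of the key.
        for _, k, value in reversed(self._log):
            if k == key:
                return value
        raise KeyError(key)

store = AppendOnlyStore()
store.append("user:42", {"name": "Ada", "plan": "free"})

# "Updating" really means: read the old record, copy and modify it, append again.
current = dict(store.latest("user:42"))
current["plan"] = "pro"
store.append("user:42", current)

print(store.latest("user:42"))   # {'name': 'Ada', 'plan': 'pro'}
print(len(store._log))           # 2 -- the original record is still in the log
```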
Data Types: Relational databases operate on tables of data, but not all data is
tabular. Data structures can be hierarchies, networks, documents, or even nested inside
one another. If the data is hierarchical then it must be flattened into different tables before
it can be stored in a relational database. This isn't difficult, but it creates a challenge when
mapping between the database and a program that needs to retrieve the data.
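A small sketch of that flattening, using Python's built-in sqlite3 module with illustrative table names: a nested order document is split into a parent table and a child table, and the application then has to join and regroup the rows to get the document back.

```python
# Minimal sketch (illustrative schema): a hierarchical order document is
# flattened into a parent table and a child table, then reassembled with a join.

import sqlite3

order = {
    "order_id": 1,
    "customer": "Ada",
    "items": [
        {"sku": "A100", "qty": 2},
        {"sku": "B200", "qty": 1},
    ],
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")

# Flatten: the hierarchy becomes rows in two tables linked by order_id.
conn.execute("INSERT INTO orders VALUES (?, ?)",
             (order["order_id"], order["customer"]))
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order["order_id"], i["sku"], i["qty"]) for i in order["items"]],
)

# Reassemble: the application must join and regroup to rebuild the document.
rows = conn.execute(
    "SELECT o.customer, i.sku, i.qty FROM orders o "
    "JOIN order_items i ON o.order_id = i.order_id WHERE o.order_id = ?",
    (1,),
).fetchall()
print(rows)   # [('Ada', 'A100', 2), ('Ada', 'B200', 1)]
```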
Response Time: Response time is the time between executing a query or
transaction and receiving the result of the operation. The challenge with
fast response time for queries is the volume of data that must be read, which is also
a function of the complexity of the query. Many solutions, like OLAP databases, focus on
pre-staging data so the query can simply read summarized or pre-calculated results.
If a query requires no joins it can be very fast, which is how some NoSQL databases satisfy
extremely low latency queries.
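The sketch below (illustrative schema, again using Python's sqlite3) contrasts the two approaches: an aggregate computed at query time over the detail rows versus a single-row read from a pre-calculated summary table.

```python
# Minimal sketch (illustrative schema): computing a total at query time versus
# reading a pre-calculated summary row, the pattern OLAP-style systems rely on.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("east", 25.0), ("west", 40.0)])

# Query-time aggregation: cost grows with the volume of detail rows scanned.
total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = ?", ("east",)).fetchone()[0]

# Pre-staged alternative: the summary is calculated once, ahead of time, and
# the query becomes a single-row lookup with no aggregation or joins.
conn.execute("CREATE TABLE sales_by_region AS "
             "SELECT region, SUM(amount) AS total FROM sales GROUP BY region")
staged = conn.execute(
    "SELECT total FROM sales_by_region WHERE region = ?", ("east",)).fetchone()[0]

print(total, staged)   # 35.0 35.0 -- same answer, very different read cost
```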
Response time for writes is similar, with the added mechanism of eventual
consistency. If a database is eventually consistent, it's possible to provide a higher degree
of write performance, because a write can be acknowledged before it has propagated to
every replica.
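One way to picture this, as a sketch with hypothetical class names rather than any specific database's behavior: the write is acknowledged as soon as the local replica accepts it, and replication to the remaining replicas is deferred.

```python
# Minimal sketch (hypothetical): a write is acknowledged as soon as the local
# replica accepts it; propagation to the other replicas happens later, so the
# caller sees a fast response while other replicas are only eventually updated.

class EventuallyConsistentWriter:
    def __init__(self, replicas):
        self.replicas = replicas           # list of dicts standing in for nodes
        self.pending = []                  # deferred replication work

    def write(self, key, value):
        self.replicas[0][key] = value      # fast local write
        self.pending.append((key, value))  # replicate to the others later
        return "acknowledged"              # respond before propagation finishes

    def replicate(self):
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()

nodes = [{}, {}, {}]
writer = EventuallyConsistentWriter(nodes)
print(writer.write("balance", 100))   # acknowledged immediately
print(nodes[1])                       # {} -- stale until replicate() runs
writer.replicate()
print(nodes[1])                       # {'balance': 100}
```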
 