and joins can be slow. Databases typically gain consistency through transactions, which require locking some portion of the database so it's not available to other clients. This can become untenable under very heavy loads, as the locks mean that competing users start queuing up, waiting for their turn to read or write the data.
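As a toy illustration of that contention, the sketch below uses SQLite (only because it ships with Python) to show one client's transaction holding the write lock while a second client times out waiting for it; the accounts table and the timeout value are invented for the example. Real servers queue rather than fail outright, but the effect under load is the same: writers serialize behind the lock.

    import sqlite3, tempfile, os

    # Hypothetical single-table schema, created in a throwaway file database.
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    setup.execute("INSERT INTO accounts VALUES (1, 100)")
    setup.commit()
    setup.close()

    # Client 1 opens a transaction and takes the write lock up front.
    writer1 = sqlite3.connect(path)
    writer1.execute("BEGIN IMMEDIATE")
    writer1.execute("UPDATE accounts SET balance = balance - 10 WHERE id = 1")

    # Client 2 now has to wait its turn; with a short timeout it simply fails instead.
    writer2 = sqlite3.connect(path, timeout=0.5)
    try:
        writer2.execute("UPDATE accounts SET balance = balance + 10 WHERE id = 1")
    except sqlite3.OperationalError as exc:
        print("second client is stuck behind the lock:", exc)

    writer1.commit()  # releasing the lock lets other clients proceed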
We typically address these problems in one or more of the following ways, sometimes in this
order:
▪ Throw hardware at the problem by adding more memory, adding faster processors, and upgrading disks. This is known as vertical scaling, and it can relieve you for a time.
▪ When the problems arise again, the answer appears to be similar: now that one box is maxed out, you add hardware in the form of additional boxes in a database cluster. Now you have the problem of data replication and consistency during regular usage and in failover scenarios. You didn't have that problem before.
▪ Now we need to update the configuration of the database management system. This might mean optimizing the channels the database uses to write to the underlying filesystem. We turn off logging or journaling, which frequently is not a desirable (or, depending on your situation, legal) option.
▪ Having put what attention we could into the database system, we turn to our application. We try to improve our indexes. We optimize the queries. But presumably at this scale we weren't wholly ignorant of index and query optimization, and already had them in pretty good shape. So this becomes a painful process of picking through the data access code to find any opportunities for fine-tuning (the first sketch after this list shows what checking a query plan against a new index looks like). This might include reducing or reorganizing joins, throwing out resource-intensive features such as XML processing within a stored procedure, and so forth. Of course, presumably we were doing that XML processing for a reason, so if we have to do it somewhere, we move that problem to the application layer, hoping to solve it there and crossing our fingers that we don't break something else in the meantime.
▪ We employ a caching layer. For larger systems, this might include distributed caches such as memcached, EHCache, Oracle Coherence, or other related products. Now we have a consistency problem between updates in the cache and updates in the database, which is exacerbated over a cluster (the cache-aside sketch after this list shows where that gap opens up).
▪ We turn our attention to the database again and decide that, now that the application is built and we understand the primary query paths, we can duplicate some of the data to make it look more like the queries that access it. This process, called denormalization, is antithetical to the five normal forms that characterize the relational model, and violates Codd's 12 Commandments for relational data (the last sketch after this list shows the duplication and the extra write it forces). We remind ourselves that we live in this world, and not in some theoretical cloud, and then undertake to do what we must to make the application start responding at acceptable levels again, even if it's no longer “pure.”
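To make the query-tuning step above concrete, here is a minimal sketch (again using SQLite only because it is built into Python; the orders table and index name are invented) of checking whether a hot query does a full table scan and whether an added index turns it into an index lookup:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    db.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                   [(i % 100, i * 1.5) for i in range(10000)])

    query = "SELECT total FROM orders WHERE customer_id = ?"

    # Before: the planner can only scan the whole table.
    print(db.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall())

    # After a covering index, the same query becomes an index lookup.
    db.execute("CREATE INDEX idx_orders_customer ON orders (customer_id, total)")
    print(db.execute("EXPLAIN QUERY PLAN " + query, (42,)).fetchall())

The same before-and-after reading of the query plan is how you confirm, on any relational engine, that a tuning change actually changed the access path rather than just the wall-clock time of one run.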
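The cache-aside pattern mentioned in the caching bullet, and the consistency gap it opens, look roughly like this; a plain dict stands in for memcached or a similar distributed cache, and load_user/save_user are hypothetical data-access helpers:

    cache = {}

    def get_user(user_id, db):
        user = cache.get(user_id)
        if user is None:                 # cache miss: fall through to the database
            user = db.load_user(user_id)
            cache[user_id] = user
        return user

    def update_user(user_id, fields, db):
        db.save_user(user_id, fields)
        # Unless every node that holds a copy also invalidates (or refreshes) it
        # here, readers keep getting the stale entry -- and across a cluster of
        # caches, "every node" is exactly the hard part.
        cache.pop(user_id, None)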
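Finally, the denormalization step amounts to reshaping tables around the queries, at the cost of keeping the duplicated data in sync by hand. A minimal sketch, with an invented customers/orders schema:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            customer_name TEXT,   -- duplicated from customers.name to avoid a join
            total REAL
        );
        INSERT INTO customers VALUES (1, 'Ada');
        INSERT INTO orders VALUES (10, 1, 'Ada', 25.0);
    """)

    # The hot read path is now a single-table query shaped like the data...
    print(db.execute("SELECT customer_name, total FROM orders WHERE id = 10").fetchone())

    # ...but every change to a customer's name must now also touch orders.
    db.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
    db.execute("UPDATE orders SET customer_name = 'Ada L.' WHERE customer_id = 1")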