and backups, and their associated transaction logs, will usually
enable us to recreate any state that the database has been in.
They will allow us to re-present six of the nine temporal
categories we have identified. 3
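As a rough illustration of how a backup plus its transaction log can recreate an earlier state, the following minimal Python sketch restores a snapshot and replays the logged changes up to a chosen point in time. The names `LogEntry` and `state_as_of`, the integer timestamps, and the dictionary-based "database" are assumptions made for the example; they do not describe any particular DBMS's recovery mechanism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    """One logged change (hypothetical format, for illustration only)."""
    ts: int                # commit time on an assumed integer clock
    key: str               # identifier of the row that changed
    value: Optional[str]   # new value, or None if the row was deleted


def state_as_of(backup: dict, backup_ts: int, log: list, as_of: int) -> dict:
    """Rebuild the table as it stood at time `as_of`.

    Starts from a copy of the backup taken at `backup_ts` and replays,
    in commit order, every logged change made after the backup and no
    later than `as_of`.
    """
    if as_of < backup_ts:
        raise ValueError("as_of precedes the backup; an earlier backup is needed")
    state = dict(backup)
    for entry in sorted(log, key=lambda e: e.ts):
        if backup_ts < entry.ts <= as_of:
            if entry.value is None:
                state.pop(entry.key, None)
            else:
                state[entry.key] = entry.value
    return state


# Hypothetical data: a backup taken at time 10 and three later changes.
backup = {"C1": "Smith"}
log = [
    LogEntry(11, "C1", "Smythe"),
    LogEntry(12, "C2", "Jones"),
    LogEntry(15, "C1", None),
]

print(state_as_of(backup, 10, log, 12))
print(state_as_of(backup, 10, log, 15))
```

The sketch also shows why this data is a last resort: if the log entries between the backup and the requested time are missing, the state at that time cannot be rebuilt.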
The three categories that cannot be re-presented from
backups and logfiles are the three categories of future claims—
things we are going to make our databases say (unless we
change our minds) about what things once were like, or are like
now, or may be like in the future. Future claims often start out as
scribbled notes on someone's desk. But once inside the machine,
they exist in transaction datasets, in collections of data that are
intended, at some time or other, to be applied to the database
and become currently asserted data.
In the previous chapter, we gave the name pipeline datasets to
the eight categories of data which are not current claims about
the present. These are collections of data that exist at various
points along the pipelines leading into production tables or
leading out from them. Because they are physically separate
from those production tables, these
collections of data are generally not immediately available for
business use. Usually, IT technical personnel must do some work
on these physical files or tables before a business user can query
them for information.
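The arithmetic behind these counts can be checked with a small Python sketch. It assumes the nine categories form a three-by-three grid: when a claim is made (past, current, or future) crossed with the time the claim is about (past, present, or future). The label "past" for claims and the function names are assumptions introduced for the example, not terminology confirmed by this section.

```python
from itertools import product

# Assumed labels for the two dimensions of the grid.
CLAIM_TIMES = ("past", "current", "future")    # when the database says it
SUBJECT_TIMES = ("past", "present", "future")  # what time it is about


def categories():
    """All nine (claim time, subject time) combinations."""
    return list(product(CLAIM_TIMES, SUBJECT_TIMES))


def is_pipeline(claim, subject):
    """Everything except current claims about the present is pipeline data."""
    return (claim, subject) != ("current", "present")


def recoverable_from_backups(claim, subject):
    """Backups and logs hold only what the database has already said,
    so future claims cannot be re-presented from them."""
    return claim != "future"


all_nine = categories()
pipeline = [c for c in all_nine if is_pipeline(*c)]
recoverable = [c for c in all_nine if recoverable_from_backups(*c)]

print(len(all_nine), len(pipeline), len(recoverable))
```

Under these assumptions the grid yields nine categories, eight of them pipeline datasets, and six of the nine recoverable from backups and logfiles, which matches the counts given in the text.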
This takes time, and until the work is complete, the information
is not available. By the time the work is complete, the business
value of the information may be much reduced. This work
also has its costs in terms of how much time those technicians
must spend to prepare that data to be queried. In addition, even
without special requests for information in them, these physical
datasets, taken together, constitute a significant management
cost for IT.
With multiple points of rest in the pipelines leading into and
out of production database tables, there are multiple points at
which data can be lost. For example, data can be accidentally
deleted before any copies are made. For datasets in the inflow
pipelines that have not yet made it into the database itself,
the only recourse for lost data is to reacquire or recreate
the data. If prior datasets in the pipeline have already been
3 That's the idea, anyway. In reality, this “data of last resort” isn't always there when
we go looking for it. Backups and logfiles are rarely kept forever, so the data we need
may have been purged or written over. There will inevitably be occasional intervals
during which the system hiccupped, and simply failed to capture the data in the first
place. If the data is still available, it might not be in a readily accessible format because
of schema changes made after it was captured.