Information Technology Reference
In-Depth Information
The complicated part of merge-tree maintenance is the merge of C 0 with C 1 .
We must allow transactions to perform updates on C 0 , while the merge process is
in progress. The problem here is that transactions may perform updates on the key
range that the merge process has already scanned. Those updates are missed by the
merge process. The solution is to maintain a side file to capture the updates missed
by the merge process, as done in online index construction in Chap. 11 . When the
merge process has finished the construction of C 0 1 , the side file is processed, and a
new main-memory component C 0 0 is built from the updates recorded in the side file.
Finally, C 0 0
is atomically substituted for C 0 .
15.5
Log-Structured Databases
The conventional database organization separates the log from the application data
stored in the data pages, allowing the application data to be organized efficiently for
key-based clustered access, while the sequential log, which necessarily duplicates
the application data and even retains previous versions of data items, is only used for
recovery purposes. Write-ahead logging must be enforced so as to order properly
the log flushings and the database-page flushings. In the case of a write-intensive
application in which new data is often inserted once but rarely updated, and recent
data is read often, the database buffer may be full of modified pages all the time, and
a new page access may often cause some page from the buffer to be flushed onto
the database disk. In the presence of many short updating transactions, also the log
is flushed constantly.
A database organization called the log-structured database ,or log-only database ,
tries to alleviate the write bottleneck of write-intensive applications as follows. The
sequence of log records written by transactions for their update actions serves as the
sole repository of the tuples of the database. Such a log is necessarily a logical log,
because there are no data pages on which the log records should be applied. The log
is flushed from the log buffer onto disk, as before, when a transaction commits or
completes its rollback or when the log buffer becomes full or when a checkpoint is
taken, but otherwise the write-ahead logging protocol is not observed.
The logical log for transactions on relation r.X;V/ is interpreted as a single-
version database as follows. Given key x, the current version of the tuple with key
x is .x; v /, if the most recent log record for an update on key x is h T; I; x; v ;n 0 i ,
h T; D 1 ;x; v ;n 0 i , h T; W; x; u ; v ;n 0 i ,or h T; W 1 ;x; v ; u ;n 0 i . The database does not
contain a tuple with key x if the log does not contain any log record for an update
on x or if the most recent log record for an update on x is h T; D; x; v ;n 0 i or
h T; I 1 ;x;n 0 i .
For efficient access to single tuples, a log-structured database must come with a
dense index. The index record for a tuple .x; v / in r is .x; n/, if the most recent log
record for an update on x is h T; I; x; v ;n 0 i , h T; D 1 ;x; v ;n 0 i , h T; W; x; u ; v ;n 0 i ,or
h T; W 1 ;x; v ; u ;n 0 i ,andn is the LSN of the log record. We assume here that the
LSN s are physical addresses (byte offsets from the beginning of the log file).
Search WWH ::




Custom Search