Databases Reference
In-Depth Information
entry (node) l will point to the smallest jump index entry (node) l such that
l +2 i
l +2 i +1 . As a result, lookups can be done in O (log 2 N ) time,
where N =2 p . The jump index can be kept in WORM storage by storing
each node of the index as a separate file, and appending new jump pointers
to the end of the file.
As with B+ trees, one can tune the fanout of jump tree nodes for better
space and time eciency. Figure 5(b) shows a block-structured jump index,
in which p posting entries are stored together in blocks of size L . Pointers are
associated with blocks, rather than with every entry. Finally, jump pointers
are calculated using powers of B rather than powers of two, where p
l
B .
More specifically, ( B
1) log B ( N ) pointers are stored with every block. Each
pointer is uniquely identified by a pair ( i, j ), where 0
i< log B ( N )and
1
j<B . The pointers are set up as follows: let l 1 be the largest document
ID stored in block b . The ( i, j ) pointer in b points to the block containing the
smallest document ID s such that
l 1 + jB i
s<l 1 +( j +1) B i .
In the trustworthy indexing approaches described above, an adversary can
insert malicious entries into the index. For example, Mallory can insert a docu-
ment ID into the posting list for a term that does not appear in the document,
or can append an inappropriate jump pointer to a jump index node. Malicious
entries fall into two categories: those that cause subsequent legitimate inser-
tions to fail, and those that will only be noticed when a lookup operation finds
a dangling pointer or returns a record that does not match the query. Both
events will draw immediate unwanted attention to the attack, and neither
prevents all matching records from being returned in the answer to a query.
Thus these attacks are conspicuous and ineffective at hiding information.
If Mallory gains physical access to storage, he may tamper with the index
contents. If we have trusted hardware that can periodically sign portions of the
index (such as the secure coprocessor discussed in the previous section), then
any discrepancy between the signature and the current index contents can
be detected. Under such circumstances, the indexing system should support
partial recovery and fault isolation [42], as it would be very expensive to
regenerate the entire index from the data.
7 Trustworthy Migration
Storage systems technology evolves rapidly. It is impractical to store records
on a single server for decades, as the server will become obsolete and too
expensive to maintain. Organizations themselves may also evolve, via merg-
ers, spin-offs, and reorganizations that require records to be copied or moved.
When records must be moved, the migration process needs to be trustwor-
thy; that is, it should be possible to verify that the migration was completed
appropriately, even if a superuser adversary performed the migration.
Search WWH ::




Custom Search