following the greatest common prefix. Fig. 5b presents another trie, obtained from
the first one by adding the base address 00100111, which required creating a
new internal node 0010011*. Conversely, removing 00100111 from the trie
of Fig. 5b would give back that of Fig. 5a.
The theoretical worst-case complexity of a lookup in a Patricia trie is in our case
O(k), where k is the word length (e.g. 32 or 64 bits). In practice, since a
program is allowed to allocate blocks only in a limited memory space, the trie height
remains far below this upper bound. In addition, unlike for arbitrary strings,
comparisons of words can be implemented very efficiently by bit operations
(see also Sec. 5).
Storing a block's metadata takes only a few bytes, except for initialization
information when the block itself is long. In this case, the initialization of each byte
is monitored separately (bit-level initialization through bit-fields is not yet sup-
ported). To reduce the memory space occupied by the store, the recording of block
initialization information is optimized in three ways. First, since the initialization of
each byte in a block can be recorded in one bit, block initialization is recorded
in a dynamically allocated array whose size is therefore 8 times smaller than the
block size. Second, when none or all of the bytes of a memory block have been
initialized (which are very common cases), the initialization array is freed, and an
integer field counting the initialized bytes is used instead. Third, the __full_init primitive
can be used to mark the whole block as initialized, avoiding multiple calls to
__initialize for particular bytes.
Experiments. To choose the most appropriate data structure for imple-
menting the store, we compared the implementation based on Patricia tries to
three other implementations of the store: based on linked lists, on unbalanced bi-
nary search trees, and on Splay trees, used in earlier memory-safety-related tools
(see e.g. [16]). Our implementation appears to be on average more than 2500
times faster than linked lists, 200 times faster than unbalanced binary search
trees, and 27 times faster than Splay trees. For linked lists and search trees, this
confirms the intuition given earlier in this section. The results for Splay trees
are comparable to Patricia tries on most examples, perhaps 2-3 times faster on
some of them, but dramatically (> 500 times) slower on examples (such as multi-
plication of big matrices, matrix inversion, etc.) where the program's consecutive
accesses to memory are not performed on the same memory blocks at all. The
reason is also intuitively clear: since Splay trees move recently accessed elements
to the top of the tree, this takes time and brings no benefit when the follow-
ing queries to the store do not concern the same memory blocks again. For
instance, since matrix multiplication requires taking elements from different rows
and columns each time, multiplying big matrices whose elements do not all fit
into the same memory block wastes time on useless restruc-
turing of the Splay tree. On the contrary, on programs with frequent consecutive
accesses to the same block metadata in the store, Splay trees appear to be more
efficient (up to three times in our examples).