Information Technology Reference
In-Depth Information
1
typedef unsigned char byte
;
2
// index 0 1 2 3 4 5 6 7 8
3
byte
masks[] = {0x00,0x80,0xC0,0xE0,0xF0,0xF8,0xFC,0xFE,0xFF};
4
int
longer [] = {
0, -1,
3, -3,
6, -5,
7,
8, -8};
5
int
shorter[] = {
0,
0,
1, -2,
2, -4,
5, -6, -7};
6
byte
gtCommonPrefixMask(
byte
a,
byte
b) {
7
byte
nxor = ˜(a ˆ b);
// a bit = 1 iff this bit is equal in a and b
int
i = 4;
// search starts in the middle of the word
8
while
(i > 0)
// if more comparisons needed
9
if
(nxor >= masks[i]) i = longer[i]; // first i bits equal,try a longer prefix
10
else
i = shorter[i];
// otherwise, try a shorter prefix
11
return
masks[-i];
// if i<=0, masks[-i] is the answer
12
13
}
Fig. 6.
Search for greatest common prefix mask, illustrated here for bytes
5 Optimized Records and Queries in the Store
Queries for adding, removing, or searching a given base address
A
in the store
based on a Patricia trie require comparisons of
A
with existing nodes and com-
putations of the greatest common prefix for two elements (cf Fig.5). For Patricia
tries storing addresses (strings of 0's and 1's of fixed length rather than strings
of arbitrary length over a wider set of characters), these comparisons may be
simplified due to the nature of elements. Let us call by the
greatest common
prefix mask
M
of
A
and
B
the mask containing 1's for the positions of common
bits in the greatest common prefix
P
of
A
and
B
.So
M
starts with
n
1's followed
by 0's, where
n
is the number of common bits in
P
. For example, the greatest
common prefix of bytes
A
=
01100111
and
B
=
01111111
is
P
=
111
*****
,
while the greatest common prefix mask is
M
=
11100000
.
We carefully optimized all prefix computations and comparisons by intensive
usage of ecient bit-to-bit operations. Interestingly, one optimization that we
realized for computation of the greatest common prefix mask appeared particu-
larly e
cient. Fig. 6 illustrates the optimized version, for simplicity, over bytes
instead of words. It is based on the classic dichotomic search of the index
i
such
that the greatest common prefix mask starts with exactly
i
1's. In addition to
precomputed masks (line 3) and bit operations (line 7), our version uses precom-
puted indices (lines 4,5) for the next prefix length
i
to try, therefore it avoids
the usual
mid=(high+low)/2
computation at each iteration, making frequent calls
to the function much faster. A negative value
i<=0
indicates that
-i
is the fi-
nal greatest common prefix length. The next value of
i
is simply extracted (lines
10,11) of the arrays depending if the next candidate prefix should be tried longer
or shorter. For instance, for
A
and
B
above,
nxor
equals the byte
11100111
,
and the function will try
i=4
,then
i=shorter[4]=2
,then
i=longer[2]=3
and finally
stop with
i=longer[3]=-3
andreturnthemask
masks[3]=0xE0
of length 3, that is in
binary precisely
11100000
.
Experiments.
We compared our optimized implementation to a non-optimized
version of the common prefix mask computation based on the usual comparison
of strings commonly used for Patricia tries (with a linear run over the elements
from left to right, that we have also optimized by bit operations). On the tested
examples, our optimized version illustrated by Fig. 6 makes the execution of the