Databases Reference
In-Depth Information
the symbol a 3 . Suppose the second symbol in the sequence was a 2 . The tag value would then
be restricted to lie in the interval [0.49, 0.56). We would then partition this interval in the same
proportion as the original interval to obtain the subintervals [0.49, 0.539) corresponding to
the symbol a 1 , [
.
,
.
)
corresponding to the symbol a 2 , and [0.546, 0.56) correspond-
ing to the symbol a 3 . If the third symbol was a 3 , the tag would be restricted to the interval
[0.546, 0.56), which could then be subdivided further. This process is described graphically
in Figure 4.1 .
Notice that the appearance of each new symbol restricts the tag to a subinterval that is
disjoint from any other subinterval that may have been generated using this process. For the
sequence beginning with a 1 ,
0
539
0
546
, by the time the third symbol a 3 is received, the tag
has been restricted to the subinterval [0.546, 0.56). If the third symbol had been a 1 instead
of a 3 , the tag would have resided in the subinterval [0.49, 0.539), which is disjoint from the
subinterval [0.546, 0.56). Even if the two sequences are identical from this point on (one
starting with a 1 ,
a 2 ,
a 3 ,...
a 2 ,
a 3 and the other beginning with a 1 ,
a 2 ,
a 1 ), the tag interval for the two
sequences will always be disjoint.
As we can see, the interval in which the tag for a particular sequence resides is disjoint
from all intervals in which the tag for any other sequence may reside. As such, any member
of this interval can be used as a tag. One popular choice is the lower limit of the interval;
another possibility is the midpoint of the interval. For the moment, let's use the midpoint of
the interval as the tag.
In order to see how the tag generation procedure works mathematically, we start with
sequences of length one. Suppose we have a source that puts out symbols from so m e alphabet
A ={
a 1 ,
a 2 ,...,
a m }
. We can map the symbols
{
a i }
to real numbers
{
i
}
. Define T X (
a i )
as
i 1
k
1
2 P
T X (
a i ) =
1 P
(
X
=
k
) +
(
X
=
i
)
(2)
=
1
2 P
=
F X (
i
1
) +
(
X
=
i
)
(3)
For each a i ,
T X (
a i )
will have a unique value. This value can be used as a unique tag for a i .
Example4.3.2:
Consider a simple dice-throwing experiment with a fair die. The outcomes of a roll of the die
can be mapped into the numbers
{
1
,
2
,...,
6
}
. For a fair die
1
6
P
(
X
=
k
) =
for k
=
1
,
2
,...,
6
Therefore, using ( 3 ) we can find the tag for X
=
2as
1
2 P
1
6 +
1
12 =
T X (
2
) =
P
(
X
=
1
) +
(
X
=
2
) =
0
.
25
 
Search WWH ::




Custom Search