Databases Reference
In-Depth Information
The value of 0.5 gets mapped to
1ti me s
00
m
1
...
0
The update equations remain almost the same as Equations ( 9 ) and ( 10 ). As we are going to
do integer arithmetic, we need to replace F X (
in these equations.
Define n j as the number of times the symbol j occurs in a sequence of length To t a l _ Count .
Then F X (
x
)
k
)
can be estimated by
i = 1 n i
F X (
k
) =
Total _ Count .
(27)
If we now define
k
Cum _ Count
(
k
) =
n i
i
=
1
we can write Equations ( 9 ) and ( 10 )as
u ( n 1 )
l ( n 1 ) +
(
1
) ×
Cum _ Count
(
x n
1
)
l ( n ) =
l ( n 1 ) +
(28)
Total _ Count
u ( n 1 )
l ( n 1 ) +
(
1
) ×
Cum _ Count
(
x n )
u ( n ) =
l ( n 1 ) +
1
(29)
Total _ Count
where x n is the n th symbol to be encoded,
is the largest integer less than or equal to x ,
and the addition and subtraction of one is to handle the effects of the integer arithmetic. Note
that in order to reduce effects of finite precision arithmetic it is a good idea to perform the
multiplication in the numerator prior to the division.
The word length m has to be large enough to accommodate all the upper and lower limits
of all the subintervals, which means that there should be enough values to unambiguously
represent each entry of the Cum_Count array. As the maximum number of distinct values is
Total_Count this means that the number of values that need to be represented is Total_Count
and, therefore, we need to pick m such that
x
2 m
>
Total _ Count
or
m
>
log 2 (
Total _ Count
)
u ( n ) ]
is only a portion of the total range available. As all the subintervals need to be contained within
the active interval at any given time, what we need to do is determine the smallest size the active
interval can be and then make sure that m is large enough to contain Total_Count different
values in this limited range. So what is the smallest the active interval can be? At first sight
it might seem that the smallest the active interval can be is about half the maximum range
l ( n ) ,
However, this may not be sufficient as often the active interval, that is the interval
[
 
Search WWH ::




Custom Search