Database Reference
In-Depth Information
And for Y
={ A }
:
1
8
P ( Y
| D
, CT )
=
log 1
8
|
=−
=
L ( code ( Y
CT ))
3 . 00
So,
{
A , B , C
}
is assigned a code of length 0.68 bits, while
{
A , B
}
,
{
A
}
and
{
B
}
are
assigned codes of length 3 bits each.
Now, for any database
D
and code table CT over the same set of patterns
X
we
can compute L (
D |
CT ) according to the following trivial lemma.
Lemma 8.9
Let
D
be a database, CT be a code table over
X
and code-optimal
D
for
, cover a cover function, and usage the usage function for cover .
1. For any t
D
its encoded length, in bits, denoted by L ( t
|
CT ) ,is
L ( t
|
CT )
=
L ( code ( X
|
CT )) .
X
cover ( CT , t )
2. The encoded size of
D
, in bits, when encoded by CT , denoted by L (
D |
CT ) ,is
L (
D |
CT )
=
L ( t
|
CT ) .
t D
With Lemma 8.9, we can compute L (
D | M ), but we also need to know what L ( M )
is, i.e., the encoded size of a code table.
Recall that a code table is a two-column table consisting of patterns and codes.
As we know the size of each of the codes, the encoded size of the second column is
easily determined: it is simply the sum of the lengths of the codes. The encoding of
the first column, containing the patterns, depends on the pattern type; a lossless and
succinct encoding should be chosen.
Definition 8.10
Let
D
be a database, CT a code table over
X
that is code-optimal
for
D
, and encode an encoding for elements of
X
. The size of CT in bits, denoted
by L ( CT
| D
) , is given by
X CT : usage D ( X ) = 0 |
L ( CT
| D
)
=
encode ( X )
|+|
code ( X
|
CT )
|
.
Note that we do not take patterns with zero usage into account, because they are not
used to code and do not get a finite code length.
With these results we have the total size of the encoded database.
D
T
Definition 8.11
Let
be a database with tuples drawn from
, let CT be a code
D
table that is code-optimal for
and cover a cover function. The total compressed
size of the encoded database and the code table, in bits, denoted by L (
D
, CT ) is
given by
L (
D
, CT )
= L (
D | CT )
+ L ( CT | D
) .
Search WWH ::




Custom Search