Letter   ASCII      Compressed
A        01000001   1000
R        01010010   1111
T        01010100   0010
H        01001000   00111
U        01010101   000100
R        01010010   1111
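For concreteness, here is a minimal sketch (in Python) of how the table above is applied. The dictionary simply copies the code words from the Compressed column; entries for the rest of the alphabet are assumed to come from the same construction described later in the chapter.

    # Code words copied from the table above; the rest of the alphabet is omitted.
    CODE = {"A": "1000", "R": "1111", "T": "0010", "H": "00111", "U": "000100"}

    def encode(text):
        # Concatenate the code word for each letter in turn.
        return "".join(CODE[letter] for letter in text)

    bits = encode("ARTHUR")
    print(bits)                                   # 27 bits in all
    print(len("ARTHUR") * 8, "ASCII bits vs", len(bits), "compressed bits")

The six letters shrink from 48 ASCII bits to 27 bits, roughly a 44 percent saving.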
The Huffman algorithm can also be used to compress any type of data, but its effectiveness varies. For instance, it could be used on a photograph where the intensity at each pixel is stored as a byte. The algorithm would be very effective on a photograph that had only a few basic values of black and white, but it wouldn't work well if the intensities were spread evenly across many shades between dark and light. The algorithm works best when there are only a few common values.
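To see why the distribution matters, the sketch below builds code words from a frequency table using the standard Huffman construction (the construction itself is explained in the next section). The two sample tables, one for a mostly black-and-white image and one for evenly spread intensities, are invented for illustration.

    import heapq
    from itertools import count

    def huffman_codes(freq):
        # Build a Huffman tree by repeatedly merging the two rarest nodes,
        # then read the code words off the tree.
        tick = count()                    # tie-breaker keeps heap entries comparable
        heap = [(f, next(tick), sym) for sym, f in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)
            f2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
        codes = {}
        def walk(node, prefix):
            if isinstance(node, tuple):   # internal node: descend both branches
                walk(node[0], prefix + "0")
                walk(node[1], prefix + "1")
            else:
                codes[node] = prefix or "0"
        walk(heap[0][2], "")
        return codes

    # A nearly black-and-white image: two intensities dominate, so they get
    # 1- and 2-bit codes while the rare intensities get longer ones.
    skewed = {0: 900, 255: 800, 128: 20, 64: 10}
    print(huffman_codes(skewed))

    # Evenly spread intensities: every value ends up with a code of the same
    # length, so nothing is gained over plain fixed-length storage.
    uniform = {v: 100 for v in range(16)}
    print(huffman_codes(uniform))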
More sophisticated versions of the Huffman code exist. It is common to construct second-order codes that aggregate pairs of letters. This can be done in two ways. The easiest way to do this is to simply treat each pair of letters as the basic atomic unit. Instead of constructing a frequency table of characters, you would construct a table of pairs. The table would be much larger, but it would generate even better compression because many of the pairs would rarely occur. Pairs like "ZF" are almost nonexistent.
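A sketch of this first approach, assuming a short made-up sample string: the text is split into non-overlapping two-letter blocks and the blocks are tallied, and the resulting table could be fed straight into the same Huffman construction sketched above.

    from collections import Counter

    def pair_frequencies(text):
        # Count non-overlapping two-letter blocks of the text.
        return Counter(text[i:i + 2] for i in range(0, len(text) - 1, 2))

    freq = pair_frequencies("THETHEMETHATTHEYHATE")
    print(freq.most_common())   # common pairs such as "TH" dominate; "ZF" never appears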
Another way is to construct 26 different tables by analyzing which letters follow other letters. One table for the letter "T" would hold the frequency with which the other letters follow a "T". The letter "H" would be quite common in this table because "TH" occurs frequently in English. These 26 tables would produce even more compression because the more detailed analysis tunes the code words even more closely to the data. The letter "U" would receive a very short code word after the letter "Q" because it invariably follows.
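A sketch of this second approach, again with an invented sample string (spaces are counted as ordinary symbols here): one frequency table is kept per preceding letter, and each table could then drive its own Huffman code.

    from collections import Counter, defaultdict

    def conditional_tables(text):
        # For each letter, count which letters follow it.
        tables = defaultdict(Counter)
        for prev, nxt in zip(text, text[1:]):
            tables[prev][nxt] += 1
        return tables

    tables = conditional_tables("THE QUEEN QUOTED THE QUIET QUESTION")
    print(tables["Q"])   # "U" follows "Q" every time, so it would earn a very short code
    print(tables["T"])   # "H" is the most common letter after "T", as noted above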
This example has shown how a Huffman compression function works in practice. It didn't explain how the code words were constructed nor did it show why they worked so well. The next section in this chapter will do that.
Chapter 6 shows how to run Huffman codes in reverse.
5.3 Building Compression Algorithms
Creating a new compression algorithm has been one of the more lucrative areas of mathematics and computer science. A few smart ideas are enough to save people billions of dollars in storage space and communications time, so many have explored the idea in depth. This chapter won't investigate the best of that work because it is beyond the scope of the topic. Many of the easiest ideas turn out to hide information the best: Huffman codes are a perfect solution for basic text, while dictionary algorithms like Lempel-Ziv are less effective.