TABLE 3.32  Probabilities of occurrence of the letters in the English alphabet in the U.S. Constitution.

Letter  Probability    Letter  Probability
A       0.057305       N       0.056035
B       0.014876       O       0.058215
C       0.025775       P       0.021034
D       0.026811       Q       0.000973
E       0.112578       R       0.048819
F       0.022875       S       0.060289
G       0.009523       T       0.078085
H       0.042915       U       0.018474
I       0.053475       V       0.009882
J       0.002031       W       0.007576
K       0.001016       X       0.002264
L       0.031403       Y       0.011702
M       0.015892       Z       0.001502
TABLE 3.33  Probabilities of occurrence of the letters in the English alphabet in this chapter.

Letter  Probability    Letter  Probability
A       0.049855       N       0.048039
B       0.016100       O       0.050642
C       0.025835       P       0.015007
D       0.030232       Q       0.001509
E       0.097434       R       0.040492
F       0.019754       S       0.042657
G       0.012053       T       0.061142
H       0.035723       U       0.015794
I       0.048783       V       0.004988
J       0.000394       W       0.012207
K       0.002450       X       0.003413
L       0.025835       Y       0.008466
M       0.016494       Z       0.001050
the probability model for a different set of C programs. The probabilities in Table 3.32 are the
probabilities of the 26 letters (upper- and lowercase) obtained for the U.S. Constitution and
are representative of English text. The probabilities in Table 3.33 were obtained by counting
the frequency of occurrences of letters in an earlier version of this chapter. While the two
documents are substantially different, the two sets of probabilities are very much alike.
We encoded the earlier version of this chapter using Huffman codes that were created using
the probabilities of occurrence obtained from the chapter. The file size dropped from about
70,000 bytes to about 43,000 bytes with Huffman coding.
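
The experiment described above (count symbol occurrences in a document, build a Huffman code from those counts, and compare the coded size with the original size) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used for the chapter: the function names `huffman_code` and `encoded_size_bytes` and the short `sample` text are hypothetical, symbols are taken to be 8-bit bytes, and the size of the code table that a decoder would need is not counted. For reference, a drop from about 70,000 to about 43,000 bytes corresponds to roughly 4.9 bits per original byte.

```python
import heapq
from collections import Counter


def huffman_code(freqs):
    """Return {symbol: bitstring} for a mapping of symbol -> positive weight."""
    if len(freqs) == 1:
        # Degenerate case: a lone symbol still needs one bit per occurrence.
        return {next(iter(freqs)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codes
        # with 0 and 1 respectively.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encoded_size_bytes(data, code):
    """Size of the Huffman-coded payload in bytes (code table excluded)."""
    bits = sum(len(code[b]) for b in data)
    return (bits + 7) // 8


# Hypothetical stand-in for the chapter text.
sample = (b"We encoded the earlier version of this chapter using Huffman codes "
          b"that were created using the probabilities of occurrence obtained "
          b"from the chapter.")
freqs = Counter(sample)            # byte value -> number of occurrences
code = huffman_code(freqs)
compressed = encoded_size_bytes(sample, code)
print(len(sample), "bytes ->", compressed, "bytes (code table not included)")
```

Because the code is built from the statistics of the very file being encoded, the model matches the data exactly; running `huffman_code` on counts from one document and `encoded_size_bytes` on another is a simple way to see what a mismatched model costs.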
 