Chen–Ho encoding

From Wikipedia, the free encyclopedia
Revision as of 19:58, 19 June 2025 by imported>OAbot (Open access bot: url-access updated in citation with #oabot.)
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Template:Short description Template:Use dmy dates Template:Use list-defined references Chen–Ho encoding is a memory-efficient alternate system of binary encoding for decimal digits.

The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in significant wastage of binary data bandwidth (since four bits can store 16 states and are being used to store only 10),[1] even when using packed BCD.

The encoding reduces the storage requirements of two decimal digits (100 states) from 8 to 7 bits, and those of three decimal digits (1000 states) from 12 to 10 bits using only simple Boolean transformations avoiding any complex arithmetic operations like a base conversion.

History

In what appears to have been a multiple discovery, some of the concepts behind what later became known as Chen–Ho encoding were independently developed by Theodore M. Hertz in 1969[2] and by Tien Chi Chen (Script error: No such module "Lang".) (1928–)[3][4][5][6] in 1971.

Hertz of Rockwell filed a patent for his encoding in 1969, which was granted in 1971.[2]

Chen first discussed his ideas with Irving Tze Ho (Script error: No such module "Lang".) (1921–2003)[7][8][9][10] in 1971. Chen and Ho were both working for IBM at the time, albeit in different locations.[11][12] Chen also consulted with Frank Chin Tung[13] to verify the results of his theories independently.[12] IBM filed a patent in their name in 1973, which was granted in 1974.[14] At least by 1973, Hertz's earlier work must have been known to them, as the patent cites his patent as prior art.[14]

With input from Joseph D. Rutledge and John C. McPherson,[15] the final version of the Chen–Ho encoding was circulated inside IBM in 1974[16] and published in 1975 in the journal Communications of the ACM.[15][17] This version included several refinements, primarily related to the application of the encoding system. It constitutes a Huffman-like prefix code.

The encoding was referred to as Chen and Ho's scheme in 1975,[18] Chen's encoding in 1982[19] and became known as Chen–Ho encoding or Chen–Ho algorithm since 2000.[17] After having filed a patent for it in 2001,[20] Michael F. Cowlishaw published a further refinement of Chen–Ho encoding known as densely packed decimal (DPD) encoding in IEE Proceedings – Computers and Digital Techniques in 2002.[21][22] DPD has subsequently been adopted as the decimal encoding used in the IEEE 754-2008 and ISO/IEC/IEEE 60559:2011 floating-point standards.

Application

Chen noted that the digits zero through seven were simply encoded using three binary digits of the corresponding octal group. He also postulated that one could use a flag to identify a different encoding for the digits eight and nine, which would be encoded using a single bit.

In practice, a series of Boolean transformations are applied to the stream of input bits, compressing BCD encoded digits from 12 bits per three digits to 10 bits per three digits. Reversed transformations are used to decode the resulting coded stream to BCD. Equivalent results can also be achieved by the use of a look-up table.

Chen–Ho encoding is limited to encoding sets of three decimal digits into groups of 10 bits (so called declets).[1] Of the 1024 states possible by using 10 bits, it leaves only 24 states unused[1] (with don't care bits typically set to 0 on write and ignored on read). With only 2.34% wastage it gives a 20% more efficient encoding than BCD with one digit in 4 bits.[12][17]

Both, Hertz and Chen also proposed similar, but less efficient, encoding schemes to compress sets of two decimal digits (requiring 8 bits in BCD) into groups of 7 bits.[2][12]

Larger sets of decimal digits could be divided into three- and two-digit groups.[2]

The patents also discuss the possibility to adapt the scheme to digits encoded in any other decimal codes than 8-4-2-1 BCD,[2] like f.e. Excess-3,[2] Excess-6, Jump-at-2, Jump-at-8, Gray, Glixon, O'Brien type-I and Gray–Stibitz code.Template:Efn The same principles could also be applied to other bases.

In 1973, some form of Chen–Ho encoding appears to have been utilized in the address conversion hardware of the optional IBM 7070/7074 emulation feature for the IBM System/370 Model 165 and 370 Model 168 computers.[23][24]

One prominent application uses a 128-bit register to store 33 decimal digits with a three digit exponent, effectively not less than what could be achieved using binary encoding (whereas BCD encoding would need 144 bits to store the same number of digits).

Script error: No such module "anchor".Encodings for two decimal digits

Script error: No such module "anchor".Hertz encoding

Hertz decimal data encoding for a single heptad (1969 form)[2]
Binary encoding Decimal digits
Code space (128 states) b6 b5 b4 b3 b2 b1 b0 d1 d0 Values encoded Description Occurrences (100 states)
50% (64 states) 0 a b c d e f 0abc 0def (0–7) (0–7) Two lower digits 64% (64 states)
12.5% (16 states) 1 1 0 c d e f 100c 0def (8–9) (0–7) One lower digit,
one higher digit
16% (16 states)
12.5% (16 states) 1 0 1 f a b c 0abc 100f (0–7) (8–9) 16% (16 states)
12.5% (16 states, 4 used) 1 1 1 c x x f 100c 100f (8–9) (8–9) Two higher digits 4% (4 states)
12.5% (16 states, 0 used) 1 0 0 x x x x 0% (0 states)
  • This encoding is not parity-preserving.

Early Chen–Ho encoding, method A

Decimal data encoding for a single heptad (early 1971 form, method A)[12]
Binary encoding Decimal digits
Code space (128 states) b6 b5 b4 b3 b2 b1 b0 d1 d0 Values encoded Description Occurrences (100 states)
50% (64 states) 0 a b c d e f 0abc 0def (0–7) (0–7) Two lower digits 64% (64 states)
25% (32 states, 16 used) 1 0 x[12] (b)[15] c d e f 100c 0def (8–9) (0–7) One lower digit,
one higher digit
16% (16 states)
12.5% (16 states) 1 1 0 f a b c 0abc 100f (0–7) (8–9) 16% (16 states)
12.5% (16 states, 4 used) 1 1 1 c x[12] (a)[15] x[12] (b)[15] f 100c 100f (8–9) (8–9) Two higher digits 4% (4 states)
  • This encoding is not parity-preserving.

Early Chen–Ho encoding, method B

Decimal data encoding for a single heptad (early 1971 form, method B)[12]
Binary encoding Decimal digits
Code space (128 states) b6 b5 b4 b3 b2 b1 b0 d1 d0 Values encoded Description Occurrences (100 states)
50% (64 states) 0 a b c d e f 0abc 0def (0–7) (0–7) Two lower digits 64% (64 states)
12.5% (16 states) 1 0 c 0 d e f 100c 0def (8–9) (0–7) One lower digit,
one higher digit
16% (16 states)
12.5% (16 states, 4 used) 1 0 c 1 x x f 100c 100f (8–9) (8–9) Two higher digits 4% (4 states)
12.5% (16 states) 1 1 f 0 a b c 0abc 100f (0–7) (8–9) One lower digit,
one higher digit
16% (16 states)
12.5% (16 states, 0 used) 1 1 x 1 x x x 0% (0 states)
  • This encoding is not parity-preserving.

Script error: No such module "anchor".Patented and final Chen–Ho encoding

Decimal data encoding for a single heptad (patented 1973 form[14] and final 1975 form[15])
Binary encoding Decimal digits
Code space (128 states) b6 b5 b4 b3 b2 b1 b0 d1 d0 Values encoded Description Occurrences (100 states)
50% (64 states) 0 a b c d e f 0abc 0def (0–7) (0–7) Two lower digits 64% (64 states)
25.0% (32 states, 16 used) 1 0 x[14] (b)[15] c d e f 100c 0def (8–9) (0–7) One lower digit,
one higher digit
16% (16 states)
12.5% (16 states) 1 1 1 c a b f 0abc 100f (0–7) (8–9) 16% (16 states)
12.5% (16 states, 4 used) 1 1 0 c x[14] (a)[15] x[14] (b)[15] f 100c 100f (8–9) (8–9) Two higher digits 4% (4 states)

Script error: No such module "anchor".Encodings for three decimal digits

Script error: No such module "anchor".Hertz encoding

Hertz decimal data encoding for a single declet (1969 form)[2]
Binary encoding Decimal digits
Code space (1024 states) b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 d2 d1 d0 Values encoded Description Occurrences (1000 states)
50.0% (512 states) 0 a b c d e f g h i 0abc 0def 0ghi (0–7) (0–7) (0–7) Three lower digits 51.2% (512 states)
37.5% (384 states) 1 0 0 c d e f g h i 100c 0def 0ghi (8–9) (0–7) (0–7) Two lower digits,
one higher digit
38.4% (384 states)
1 0 1 f a b c g h i 0abc 100f 0ghi (0–7) (8–9) (0–7)
1 1 0 i a b c d e f 0abc 0def 100i (0–7) (0–7) (8–9)
9.375% (96 states) 1 1 1 f 0 0 i a b c 0abc 100f 100i (0–7) (8–9) (8–9) One lower digit,
two higher digits
9.6% (96 states)
1 1 1 c 0 1 i d e f 100c 0def 100i (8–9) (0–7) (8–9)
1 1 1 c 1 0 f g h i 100c 100f 0ghi (8–9) (8–9) (0–7)
3.125% (32 states, 8 used) 1 1 1 c 1 1 f (0) (0) i 100c 100f 100i (8–9) (8–9) (8–9) Three higher digits, bits b2 and b1 are don't care 0.8% (8 states)
  • This encoding is not parity-preserving.

Early Chen–Ho encoding

Decimal data encoding for a single declet (early 1971 form)[12]
Binary encoding Decimal digits
Code space (1024 states) b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 d2 d1 d0 Values encoded Description Occurrences (1000 states)
50.0% (512 states) 0 a b c d e f g h i 0abc 0def 0ghi (0–7) (0–7) (0–7) Three lower digits 51.2% (512 states)
37.5% (384 states) 1 0 0 c d e f g h i 100c 0def 0ghi (8–9) (0–7) (0–7) Two lower digits,
one higher digit
38.4% (384 states)
1 0 1 f g h i a b c 0abc 100f 0ghi (0–7) (8–9) (0–7)
1 1 0 i a b c d e f 0abc 0def 100i (0–7) (0–7) (8–9)
9.375% (96 states) 1 1 1 0 0 f i a b c 0abc 100f 100i (0–7) (8–9) (8–9) One lower digit,
two higher digits
9.6% (96 states)
1 1 1 0 1 i c d e f 100c 0def 100i (8–9) (0–7) (8–9)
1 1 1 1 0 c f g h i 100c 100f 0ghi (8–9) (8–9) (0–7)
3.125% (32 states, 8 used) 1 1 1 1 1 c f i (0) (0) 100c 100f 100i (8–9) (8–9) (8–9) Three higher digits, bits b1 and b0 are don't care 0.8% (8 states)
  • This encoding is not parity-preserving.

Patented Chen–Ho encoding

Decimal data encoding for a single declet (patented 1973 form)[14]
Binary encoding Decimal digits
Code space (1024 states) b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 d2 d1 d0 Values encoded Description Occurrences (1000 states)
50.0% (512 states) 0 a b d e g h c f i 0abc 0def 0ghi (0–7) (0–7) (0–7) Three lower digits 51.2% (512 states)
37.5% (384 states) 1 0 0 d e g h c f i 100c 0def 0ghi (8–9) (0–7) (0–7) Two lower digits,
one higher digit
38.4% (384 states)
1 0 1 a b g h c f i 0abc 100f 0ghi (0–7) (8–9) (0–7)
1 1 0 d e a b c f i 0abc 0def 100i (0–7) (0–7) (8–9)
9.375% (96 states) 1 1 1 1 0 a b c f i 0abc 100f 100i (0–7) (8–9) (8–9) One lower digit,
two higher digits
9.6% (96 states)
1 1 1 0 1 d e c f i 100c 0def 100i (8–9) (0–7) (8–9)
1 1 1 0 0 g h c f i 100c 100f 0ghi (8–9) (8–9) (0–7)
3.125% (32 states, 8 used) 1 1 1 1 1 (0) (0) c f i 100c 100f 100i (8–9) (8–9) (8–9) Three higher digits, bits b4 and b3 are don't care 0.8% (8 states)
  • This encoding is not parity-preserving.[14]

Script error: No such module "anchor".Final Chen–Ho encoding

Chen-Ho decimal data encoding for a single declet (final 1975 form)[15][17]
Binary encoding Decimal digits
Code space (1024 states) b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 d2 d1 d0 Values encoded Description Occurrences (1000 states)
50.0% (512 states) 0 a b c d e f g h i 0abc 0def 0ghi (0–7) (0–7) (0–7) Three lower digits 51.2% (512 states)
37.5% (384 states) 1 0 0 c d e f g h i 100c 0def 0ghi (8–9) (0–7) (0–7) Two lower digits,
one higher digit
38.4% (384 states)
1 0 1 c a b f g h i 0abc 100f 0ghi (0–7) (8–9) (0–7)
1 1 0 c d e f a b i 0abc 0def 100i (0–7) (0–7) (8–9)
9.375% (96 states) 1 1 1 c 0 0 f a b i 0abc 100f 100i (0–7) (8–9) (8–9) One lower digit,
two higher digits
9.6% (96 states)
1 1 1 c 0 1 f d e i 100c 0def 100i (8–9) (0–7) (8–9)
1 1 1 c 1 0 f g h i 100c 100f 0ghi (8–9) (8–9) (0–7)
3.125% (32 states, 8 used) 1 1 1 c 1 1 f (0) (0) i 100c 100f 100i (8–9) (8–9) (8–9) Three higher digits, bits b2 and b1 are don't care 0.8% (8 states)
  • This encoding is not parity-preserving.[15]

Storage efficiency

Storage efficiency
BCD Necessary bits Bit difference
Digits States Bits Binary code space Binary encoding [A] 2-digit encoding [B] 3-digit encoding [C] Mixed encoding Mixed vs. Binary Mixed vs. BCD
1 Template:Val 4 Template:Val 4 (7) (10) 4 [1×A] 0 0
2 Template:Val 8 Template:Val 7 7 (10) 7 [1×B] 0 −1
3 Template:Val 12 Template:Val 10 (14) 10 10 [1×C] 0 −2
4 Template:Val 16 Template:Val 14 14 (20) 14 [2×B] 0 −2
5 Template:Val 20 Template:Val 17 (21) (20) 17 [1×C+1×B] 0 −3
6 Template:Val 24 Template:Val 20 21 20 20 [2×C] 0 −4
7 Template:Val 28 Template:Val 24 (28) (30) 24 [2×C+1×A] 0 −4
8 Template:Val 32 Template:Val 27 28 (30) 27 [2×C+1×B] 0 −5
9 Template:Val 36 Template:Val 30 (35) 30 30 [3×C] 0 −6
10 Template:Val 40 Template:Val 34 35 (40) 34 [3×C+1×A] 0 −6
11 Template:Val 44 Template:Val 37 (42) (40) 37 [3×C+1×B] 0 −7
12 Template:Val 48 Template:Val 40 42 40 40 [4×C] 0 −8
13 Template:Val 52 Template:Val 44 (49) (50) 44 [4×C+1×A] 0 −8
14 Template:Val 56 Template:Val 47 49 (50) 47 [4×C+1×B] 0 −9
15 Template:Val 60 Template:Val 50 (56) 50 50 [5×C] 0 −10
16 Template:Val 64 Template:Val 54 56 (60) 54 [5×C+1×A] 0 −10
17 Template:Val 68 Template:Val 57 (63) (60) 57 [5×C+1×B] 0 −11
18 Template:Val 72 Template:Val 60 63 60 60 [6×C] 0 −12
19 Template:Val 76 Template:Val 64 (70) (70) 64 [6×C+1×A] 0 −12
20 80 67 70 (70) 67 [6×C+1×B] 0 −13
21 84 70 (77) 70 70 [7×C] 0 −14
22 88 74 77 (80) 74 [7×C+1×A] 0 −14
23 92 77 (84) (80) 77 [7×C+1×B] 0 −15
24 96 80 84 80 80 [8×C] 0 −16
25 100 84 (91) (90) 84 [8×C+1×A] 0 −16
26 104 87 91 (90) 87 [8×C+1×B] 0 −17
27 108 90 (98) 90 90 [9×C] 0 −18
28 112 94 98 (100) 94 [9×C+1×A] 0 −18
29 116 97 (105) (100) 97 [9×C+1×B] 0 −19
30 120 100 105 100 100 [10×C] 0 −20
31 124 103 (112) (110) 104 [10×C+1×A] +1 −20
32 128 107 112 (110) 107 [10×C+1×B] 0 −21
33 132 110 (119) 110 110 [11×C] 0 −22
34 136 113 119 (120) 114 [11×C+1×A] +1 −22
35 140 117 (126) (120) 117 [11×C+1×B] 0 −23
36 144 120 126 120 120 [12×C] 0 −24
37 148 123 (133) (130) 124 [12×C+1×A] +1 −24
38 152 127 133 (130) 127 [12×C+1×B] 0 −25

See also

Notes

Template:Notelist

References

Template:Reflist

Further reading

  • Script error: No such module "citation/CS1".
  • Script error: No such module "citation/CS1".
  • Script error: No such module "citation/CS1". (60 pages) [1], Script error: No such module "citation/CS1". (40 pages) [2] and Script error: No such module "citation/CS1". (11 pages) [3] (NB. Three expired patents cited in both, the Template:Citeref and Template:Citerefs.)
  • Script error: No such module "Citation/CS1".
  • Script error: No such module "Citation/CS1".
  • Script error: No such module "Citation/CS1".
  1. a b c Cite error: Invalid <ref> tag; no text was provided for refs named Muller_2010
  2. a b c d e f g h Cite error: Invalid <ref> tag; no text was provided for refs named Hertz_1969
  3. Cite error: Invalid <ref> tag; no text was provided for refs named PT_1959
  4. Cite error: Invalid <ref> tag; no text was provided for refs named Parker_2003
  5. Cite error: Invalid <ref> tag; no text was provided for refs named Chen_2013
  6. Cite error: Invalid <ref> tag; no text was provided for refs named Wong_2014
  7. Cite error: Invalid <ref> tag; no text was provided for refs named SB_1979
  8. Cite error: Invalid <ref> tag; no text was provided for refs named Tseng_1988
  9. Cite error: Invalid <ref> tag; no text was provided for refs named FS_2000
  10. Cite error: Invalid <ref> tag; no text was provided for refs named MN_2003
  11. Cite error: Invalid <ref> tag; no text was provided for refs named Chen_1971_1
  12. a b c d e f g h i j Cite error: Invalid <ref> tag; no text was provided for refs named Chen_1971_2
  13. Cite error: Invalid <ref> tag; no text was provided for refs named Tung_2004
  14. a b c d e f g h i Cite error: Invalid <ref> tag; no text was provided for refs named Chen_1973_US3842414A
  15. a b c d e f g h i j k l Cite error: Invalid <ref> tag; no text was provided for refs named Chen_Ho_1975
  16. Cite error: Invalid <ref> tag; no text was provided for refs named Chen_Ho_1974
  17. a b c d Cite error: Invalid <ref> tag; no text was provided for refs named Cowlishaw_2000_CH
  18. Cite error: Invalid <ref> tag; no text was provided for refs named Smith_1975
  19. Cite error: Invalid <ref> tag; no text was provided for refs named Sacks-Davis_1982
  20. Cite error: Invalid <ref> tag; no text was provided for refs named Cowlishaw_2001_US6525679B1
  21. Cite error: Invalid <ref> tag; no text was provided for refs named Cowlishaw_2002
  22. Cite error: Invalid <ref> tag; no text was provided for refs named Cowlishaw_2007
  23. Cite error: Invalid <ref> tag; no text was provided for refs named Savard_2007_CH
  24. Cite error: Invalid <ref> tag; no text was provided for refs named IBM_1973_CF