The Mathematics of Secrets by Joshua Holden takes readers on a tour of the mathematics behind cryptography. Most books about cryptography are organized historically, or around how codes and ciphers have been used in government and military intelligence or bank transactions. Holden instead focuses on how mathematical principles underpin the ways that different codes and ciphers operate. Discussing the majority of ancient and modern ciphers currently known, The Mathematics of Secrets sheds light on both code making and code breaking. Over the next few weeks, we’ll be running a series of cipher challenges from Joshua Holden. The last post was on subliminal channels. Today’s is on binary ciphers:
Binary numerals, as most people know, represent numbers using only the digits 0 and 1. They are very common in modern ciphers due to their use in computers, and they frequently represent letters of the alphabet. A numeral like 10010 could represent the (1 · 2⁴ + 0 · 2³ + 0 · 2² + 1 · 2 + 0)th = 18th letter of the alphabet, or r. So the entire alphabet would be:
plaintext:  a     b     c     d     e     f     g     h     i     j
ciphertext: 00001 00010 00011 00100 00101 00110 00111 01000 01001 01010

plaintext:  k     l     m     n     o     p     q     r     s     t
ciphertext: 01011 01100 01101 01110 01111 10000 10001 10010 10011 10100

plaintext:  u     v     w     x     y     z
ciphertext: 10101 10110 10111 11000 11001 11010
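The table can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `letter_to_bits` and `bits_to_letter` are invented for illustration, and it assumes the 26-letter English alphabet numbered from a = 1.

```python
import string

def letter_to_bits(letter: str) -> str:
    """Return the 5-digit binary numeral for a letter's position (a = 1, ..., z = 26)."""
    position = string.ascii_lowercase.index(letter.lower()) + 1
    return format(position, "05b")  # zero-padded to 5 binary digits

def bits_to_letter(bits: str) -> str:
    """Invert letter_to_bits: read the numeral in base 2 and look up the letter."""
    position = int(bits, 2)
    if not 1 <= position <= 26:
        raise ValueError(f"{bits} does not correspond to a letter")
    return string.ascii_lowercase[position - 1]

print(letter_to_bits("r"))      # 10010
print(bits_to_letter("10010"))  # r
```

Five digits give 32 possible numerals, so six of them (00000 and 11011 through 11111) are left unused by the alphabet.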
The first use of a binary numeral system in cryptography, however, was well before the advent of digital computers. Sir Francis Bacon alluded to this cipher in 1605 in his work Of the Proficience and Advancement of Learning, Divine and Humane and published it in 1623 in the enlarged Latin version De Augmentis Scientiarum. In this system not only the meaning but the very existence of the message is hidden in an innocuous “covertext.” We will give a modern English example.
Suppose we want to encrypt the word “not” into the covertext “I wrote Shakespeare.” First convert the plaintext into binary numerals:
plaintext:  n     o     t
ciphertext: 01110 01111 10100
Then stick the digits together into a string:
011100111110100
Now we need what Bacon called a “biformed alphabet,” that is, one where each letter can have a “0-form” and a “1-form.” We will use roman letters for our 0-form and italic for our 1-form. Then for each letter of the covertext, if the corresponding digit in the ciphertext is 0, use the 0-form, and if the digit is 1 use the 1-form:
0 11100 111110100xx
I wrote Shakespeare.
Any leftover letters can be ignored, and we leave in spaces and punctuation to make the covertext look more realistic. Of course, it still looks odd with two different typefaces—Bacon’s examples were more subtle, although it’s a tricky business to get two alphabets that are similar enough to fool the casual observer but distinct enough to allow for accurate decryption.
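The hiding step can be sketched in Python. Plain text cannot carry roman and italic type, so this sketch assumes a stand-in biformed alphabet: lowercase for the 0-form and uppercase for the 1-form. The function names `bacon_hide` and `bacon_reveal` are hypothetical, and leftover letters are simply written in the 0-form.

```python
def to_bits(plaintext: str) -> str:
    """Concatenate the 5-digit numerals (a = 1, ..., z = 26) of each letter."""
    return "".join(format(ord(c) - ord("a") + 1, "05b") for c in plaintext.lower())

def bacon_hide(plaintext: str, covertext: str) -> str:
    """Hide plaintext in covertext: lowercase = 0-form, uppercase = 1-form (an assumption)."""
    bits = to_bits(plaintext)
    if sum(ch.isalpha() for ch in covertext) < len(bits):
        raise ValueError("covertext has too few letters")
    out, i = [], 0
    for ch in covertext:
        if ch.isalpha():
            bit = bits[i] if i < len(bits) else "0"  # leftover letters get the 0-form
            out.append(ch.upper() if bit == "1" else ch.lower())
            i += 1
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)

def bacon_reveal(stegotext: str) -> str:
    """Read the letter forms back as digits and decode complete groups of 5."""
    bits = "".join("1" if ch.isupper() else "0" for ch in stegotext if ch.isalpha())
    letters = []
    for start in range(0, len(bits) - len(bits) % 5, 5):
        value = int(bits[start:start + 5], 2)
        if 1 <= value <= 26:  # skip groups that are only leftover padding
            letters.append(chr(value - 1 + ord("a")))
    return "".join(letters)

hidden = bacon_hide("not", "I wrote Shakespeare.")
print(hidden)                # i WROte SHAKEsPeare.
print(bacon_reveal(hidden))  # not
```

Mixed capitals are far more conspicuous than two similar typefaces, which illustrates the trade-off between subtlety and accurate decryption.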
Ciphers with binary numerals were reinvented many years later for use with the telegraph and then the printing telegraph, or teletypewriter. The first of these were technically not cryptographic since they were intended for convenience rather than secrecy. We could call them nonsecret ciphers, although for historical reasons they are usually called codes or sometimes encodings. The most well-known nonsecret encoding is probably the Morse code used for telegraphs and early radio, although Morse code does not use binary numerals. In 1833, Gauss, whom we met in Chapter 1, and the physicist Wilhelm Weber invented probably the first telegraph code, using essentially the same system of 5 binary digits as Bacon. Jean-Maurice-Émile Baudot used the same idea for his Baudot code when he invented his teletypewriter system in 1874. And the Baudot code is the one that Gilbert S. Vernam had in front of him in 1917 when his team at AT&T was asked to investigate the security of teletypewriter communications.
Vernam realized that he could take the string of binary digits produced by the Baudot code and encrypt it by combining each digit from the plaintext with a corresponding digit from the key according to the rules:
0 ⊕ 0 = 0
0 ⊕ 1 = 1
1 ⊕ 0 = 1
1 ⊕ 1 = 0
For example, the digits 10010, which ordinarily represent 18, and the digits 01110, which ordinarily represent 14, would be combined to get:
  1 0 0 1 0
⊕ 0 1 1 1 0
= 1 1 1 0 0
This gives 11100, which ordinarily represents 28—not the usual sum of 18 and 14.
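The four rules amount to "the result is 1 exactly when the two digits differ," an operation usually called exclusive or (XOR). A minimal Python sketch of the worked example follows; the helper name `xor_bits` is made up for illustration.

```python
def xor_bits(a: str, b: str) -> str:
    """Combine two equal-length binary strings digit by digit using the four rules."""
    if len(a) != len(b):
        raise ValueError("both strings must have the same length")
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

result = xor_bits("10010", "01110")
print(result, int(result, 2))  # 11100 28

# Python's built-in ^ operator applies the same rule to the binary digits of integers.
print(18 ^ 14)  # 28
```

Each digit is combined independently, with no carrying from one column to the next, which is why the result differs from ordinary addition.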
Some of the systems that AT&T was using were equipped to automatically send messages using a paper tape, which could be punched with holes in 5 columns—a hole indicated a 1 in the Baudot code and no hole indicated a 0. Vernam configured the teletypewriter to combine each digit represented by the plaintext tape with the corresponding digit from a second tape punched with key characters. The resulting ciphertext is sent over the telegraph lines as usual.
At the other end, Bob feeds an identical copy of the tape through the same circuitry. Notice that doing the same operation twice gives you back the original value for each rule:
(0 ⊕ 0) ⊕ 0 = 0 ⊕ 0 = 0
(0 ⊕ 1) ⊕ 1 = 1 ⊕ 1 = 0
(1 ⊕ 0) ⊕ 0 = 1 ⊕ 0 = 1
(1 ⊕ 1) ⊕ 1 = 0 ⊕ 1 = 1
Thus the same operation at Bob’s end cancels out the key, and the teletypewriter can print the plaintext. Vernam’s invention and its further developments became extremely important in modern ciphers such as the ones in Sections 4.3 and 5.2 of The Mathematics of Secrets.
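The round trip can be illustrated with a short Python sketch. It makes two assumptions: it uses the 5-digit letter numerals from the start of this post rather than the actual Baudot code, and it stands in for the punched key tape with random digits from Python's `secrets` module. The function names are hypothetical.

```python
import secrets

def encode(text: str) -> str:
    """Letters to a string of 5-digit numerals (a = 1, ..., z = 26), not real Baudot code."""
    return "".join(format(ord(c) - ord("a") + 1, "05b") for c in text.lower())

def decode(bits: str) -> str:
    """Inverse of encode: split into groups of 5 digits and look up each letter."""
    groups = [bits[i:i + 5] for i in range(0, len(bits), 5)]
    return "".join(chr(int(g, 2) - 1 + ord("a")) for g in groups)

def xor_bits(a: str, b: str) -> str:
    """Combine plaintext (or ciphertext) digits with key digits, one pair at a time."""
    if len(a) != len(b):
        raise ValueError("both strings must have the same length")
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

plain_bits = encode("not")
key_tape = "".join(secrets.choice("01") for _ in plain_bits)  # stand-in for the key tape

cipher_bits = xor_bits(plain_bits, key_tape)      # sender's end
recovered_bits = xor_bits(cipher_bits, key_tape)  # Bob's end, same tape, same circuitry

print(decode(recovered_bits))  # not
```

The ciphertext digits change on every run because the key is random, but applying the same key a second time always restores the plaintext.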
But let’s finish this post by going back to Bacon’s cipher. I’ve changed it up a little — the covertext below is made up of two different kinds of words, not two different kinds of letters. Can you figure out the two different kinds and decipher the hidden message?
It’s very important always to understand that students and examiners of cryptography are often confused in considering our Francis Bacon and another Bacon: esteemed Roger. It is easy to address even issues as evidently confusing as one of this nature. It becomes clear when you observe they lived different eras.
Answer to Cipher challenge #2 from Joshua Holden: Subliminal channels
Given the hints, a good first assumption is that the ciphertext numbers have to be combined in such a way as to get rid of all of the fractions and give a whole number between 1 and 52. If you look carefully, you’ll see that 1/5 is always paired with 3/5, 2/5 with 1/5, 3/5 with 4/5, and 4/5 with 2/5. In each case, twice the first one plus the second one gives you a whole number:
2 × (1/5) + 3/5 = 5/5 = 1
2 × (2/5) + 1/5 = 5/5 = 1
2 × (3/5) + 4/5 = 10/5 = 2
2 × (4/5) + 2/5 = 10/5 = 2
Also, twice the second one minus the first one gives you a whole number:
2 × (3/5) – 1/5 = 5/5 = 1
2 × (1/5) – 2/5 = 0/5 = 0
2 × (4/5) – 3/5 = 5/5 = 1
2 × (2/5) – 4/5 = 0/5 = 0
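Writing C1 and C2 for the first and second ciphertext numbers in each pair, and P1 and P2 for the corresponding plaintext numbers, applying
P1 = 2 × C1 + C2
to the ciphertext gives the first plaintext:
39 31 45 45 27 33 31 40 47 39 28 31 44 41 40 31 35 45 46 34 31 39 31 30 35 47 39
 m  e  s  s  a  g  e  n  u  m  b  e  r  o  n  e  i  s  t  h  e  m  e  d  i  u  m
Applying
P2 = 2 × C2 – C1
to the ciphertext gives the second plaintext:
20  8  5 19  5  3 15 14  4 16 12  1  9 14 20  5 24 20  9 19  1 20 12  1 18  7  5
 t  h  e  s  e  c  o  n  d  p  l  a  i  n  t  e  x  t  i  s  a  t  l  a  r  g  e
To deduce the encryption process, we have to solve our two equations for C1 and C2. Subtracting the second equation from twice the first gives:
2 × P1 – P2 = 5 × C1, so C1 = (2 × P1 – P2)/5
Adding the first equation to twice the second gives:
P1 + 2 × P2 = 5 × C2, so C2 = (P1 + 2 × P2)/5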
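The two relations described above (twice the first ciphertext number plus the second, and twice the second minus the first) can be checked mechanically with exact fractions. In this sketch, P1, P2, C1 and C2 name the paired plaintext and ciphertext numbers, and the function names `encrypt` and `decrypt` are invented. The letter numbering (a = 27 through z = 52 for the first message, a = 1 through z = 26 for the second) is read off from the two plaintext tables.

```python
from fractions import Fraction

def encrypt(p1: int, p2: int):
    """Solve P1 = 2*C1 + C2 and P2 = 2*C2 - C1 for the ciphertext pair (C1, C2)."""
    return Fraction(2 * p1 - p2, 5), Fraction(p1 + 2 * p2, 5)

def decrypt(c1: Fraction, c2: Fraction):
    """Recover both plaintext numbers from one ciphertext pair."""
    return 2 * c1 + c2, 2 * c2 - c1

first = [39, 31, 45, 45, 27, 33, 31, 40, 47, 39, 28, 31, 44, 41,
         40, 31, 35, 45, 46, 34, 31, 39, 31, 30, 35, 47, 39]
second = [20, 8, 5, 19, 5, 3, 15, 14, 4, 16, 12, 1, 9, 14,
          20, 5, 24, 20, 9, 19, 1, 20, 12, 1, 18, 7, 5]

pairs = [encrypt(p1, p2) for p1, p2 in zip(first, second)]
c1, c2 = pairs[0]
print(c1, c2)  # 58/5 79/5

recovered = [decrypt(c1, c2) for c1, c2 in pairs]
message_one = "".join(chr(int(p1) - 27 + ord("a")) for p1, _ in recovered)
message_two = "".join(chr(int(p2) - 1 + ord("a")) for _, p2 in recovered)
print(message_one)  # messagenumberoneisthemedium
print(message_two)  # thesecondplaintextisatlarge
```

The first pair, 58/5 and 79/5, has fractional parts 3/5 and 4/5, one of the pairings noted in the answer.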
Joshua Holden is professor of mathematics at the Rose-Hulman Institute of Technology.