Perfect ciphers

We  now discuss a theoretical result on the security of cryptosystems. We ask whether perfect ciphers exists, i.e., ciphers that can never be broken, even with after an unlimited time. Interestingly, we will see that these ideal ciphers exist and can be implemented in practice but they are, in fact, unpractical. The theory, developed by Claude Shannon, assumes an only-ciphertext model of the attacker, i.e., the attacker only knows the ciphertext y and tries to find plaintext x or key k.

We call p_{\cal P}(x) the probability of a plaintext x to occur and p_{\cal K}(k) the probability of a certain key k to be used as encryption key. These two probability distributions induce a probability distribution on the ciphertexts. In fact, given a plaintext and a key there exists a unique corresponding ciphertext.

We can compute such a probability distribution as follows

p_{\cal C}(y) = \displaystyle \sum_{k \in {\cal K}, \exists x . E_k(x)=y} {p_{\cal K}(k) p_{\cal P}(D_k(y))}

Given a ciphertext y we look for all the keys that can give such a ciphertext from some plaintext x. We then sum the probability of all such keys times the probability of the corresponding plaintext.

We can also compute the conditional probability of a ciphertext y with respect to a plaintext x. This gives a measure of how likely is a certain ciphertext once we fix a plaintext.

p_{\cal C}(y|x) = \displaystyle \sum_{k \in {\cal K}, E_k(x)=y} {p_{\cal K}(k) }

It is simply the sum of the probability of all keys giving y from x.

Example. Consider the following toy-cipher with {\cal P} = \{a,b\}, {\cal K} = \{k1,k2\} and {\cal C}=\{1,2,3\}. Encryption is defined by the following table:

\begin{array}{c|cc|} E & a & b\\\hline k1 & 1 & 2\\k2 & 2 & 3\\\hline\end{array}

We let p_{\cal P}(a) = 3/4, p_{\cal P}(b) = 1/4, p_{\cal K}(k1) = p_{\cal K}(k2) = 1/2. Let us now compute p_{\cal C}(1)

\begin{array}{r} p_{\cal C}(1) = \displaystyle \sum_{k \in {\cal K}, \exists x . E_k(x)=1} {p_{\cal K}(k) p_{\cal P}(D_k(1))} = p_{\cal K}(k1) p_{\cal P}(D_k(1)) =\\= p_{\cal K}(k1) p_{\cal P}(a) = 1/2 \times 3/4 = 3/8\end{array}.

In fact, k1 is the only key giving ciphertext 1. More interestingly, we can compute the conditional probability of ciphertext 1 with respect to the two plaintexts a and b:

p_{\cal C}(1|a) = \displaystyle \sum_{k \in {\cal K}, E_k(x)=y} {p_{\cal K}(k) } = p_{\cal K}(k1) = 1/2

p_{\cal C}(1|b) = \displaystyle \sum_{k \in {\cal K}, E_k(x)=y} {p_{\cal K}(k) } =0

Notice, in particular, that 1 can never be obtained from b.

Once we have these values, the idea is to compute the conditional probability of a plaintext with respect to a ciphertext. This is very related to the security of the cipher, since it is a measure of how likely is a plaintext once a ciphertext is observed (which is what the attacker is usually interested to know). Interestingly, this conditional probability can be computed through that Bayes theorem:

p_{\cal P}(x|y) =\displaystyle \frac{p_{\cal P}(x)p_{\cal C}(y|x)}{p_{\cal C}(y)}

Example. We can now compute the probabilities of plaintexts a and b with respect to ciphertext 1. We obtain:

p_{\cal P}(a|1) = \displaystyle \frac{p_{\cal P}(a)p_{\cal C}(1|a)}{p_{\cal C}(1)} = \displaystyle \frac{3/4 \times 1/2}{3/8} = 1

p_{\cal P}(b|1) = \displaystyle \frac{1/4 \times 0}{3/8} = 0

Thus, when observing 1 we are sure it is plaintext a, meaning that this cipher is completely insecure. The same happen if we compute the probabilities of a and b when observing 3. For ciphertext 2 we have an interesting, less extreme, situation. We obtain, in fact, p_{\cal P}(a/2) = 3/4 and p_{\cal P}(b/2) = 1/4, thus a is more likely than b when 2 is observed. This might suggest that even for this ciphertext the attacker gains information (even if partial) about the plaintext. However, it is important to notice that p_{\cal P}(a) = 3/4, p_{\cal P}(b) = 1/4, i.e., that the probability of the two plaintexts, when 2 is observed, is exactly the one they occur in a message. In fact, observing 3 does not change anything.

We now give the definition of perfect cipher:

Definition (Perfect cipher). A cipher is perfect iff p_{\cal P}(x|y)=p_{\cal P}(x) for all x \in {\cal P} and y \in {\cal C}.

Intuitively, a cipher is perfect if observing a ciphertext y gives no information about any of the possible plaintexts x. The cipher in the example is far from being perfect, but it satisfies the above definition for ciphertext 2.

Exercise. Prove that shift cipher with p_{\cal K}(k) = \frac{1}{|{\cal K}|}=\frac{1}{26}, i.e., with keys picked at random for each letter of the plaintex, is a perfect cipher.

Solution. Shift cipher is defined as E_k(x) = x+k \mod 26 with {\cal P}={\cal C}={\cal K}=\mathbb{Z}_{26}. We compute the probability of a generic ciphertext y as:

\begin{array}{rcl} p_{\cal C}(y) &=&\displaystyle \sum_{k \in {\cal K}, \exists x . E_k(x)=y} {p_{\cal K}(k) p_{\cal P}(D_k(y))}\\&=&\displaystyle \frac{1}{26}\sum_{k \in {\cal K}, \exists x . E_k(x)=y} { p_{\cal P}(D_k(y))}\\&=&\displaystyle \frac{1}{26}\sum_{k \in {\cal K}} { p_{\cal P}(y-k \mod 26)}\\&=&\displaystyle \frac{1}{26}\sum_{x \in {\cal P}} { p_{\cal P}(x)}=\frac{1}{26}\end{array}

The last two steps hold since for each key k, we always have a plaintext that gives y when encrypted under k. This plaintext is exactly y-k \mod 26. So the constraint \exists x . E_k(x)=y always holds and D_k(y) = y-k \mod 26. Then, it is sufficient to observe that y-k \mod 26 for all possible keys gives exactly the set of all possible plaintexts {\cal P} and the sum of all their probabilities gives 1. We can now compute

p_{\cal C}(y|x) = \displaystyle \sum_{k \in {\cal K}, E_k(x)=y} {p_{\cal K}(k) } = p_{\cal K}(y-x \mod 26) = \frac{1}{26}

Here it is enough to observe that, given x and y, there exists a unique key that encrypts x as y, which is precisely y-x\mod 26. Now Bayes theorem gives:

p_{\cal P}(x|y) =\displaystyle \frac{p_{\cal P}(x)p_{\cal C}(y|x)}{p_{\cal C}(y)}=\frac{p_{\cal P}(x)\frac{1}{26}}{\frac{1}{26}} = p_{\cal P}(x)

which gives the thesis.

We have seen that if we change key any time we encrypt a letter, a cipher as simple as the shift cipher becomes perfect, i.e., unbreakable. We now present two general results that, in fact, show that this strong requirement is indeed necessary and we cannot hope to develop perfect ciphers without it.

Theorem 1. Let p_{\cal C}(y)>0 for all y. A cipher is perfect only if |{\cal K}| \geq |{\cal P}|.

The theorem states a necessary condition of a cipher to be perfect: it must be that the number of keys is at least the same as the number of plaintexts. To see why this holds, we first notice that by Bayes theorem we have that a cipher is perfect if an only if p_{\cal C}(y|x)=p_{\cal C}(y) for all x \in {\cal P} and y \in {\cal C}. If we fix x we obtain that for each y p_{\cal C}(y|x)=p_{\cal C}(y)>0 meaning that there exists at least one key k such that E_k(x) = y (otherwise we would have p_{\cal C}(y|x)=0). Notice also that all such keys are different since E_k is a function and we have fixed x. In fact, x cannot be mapped to two different ciphertexts by the same key. Thus we have at least one key for each ciphertext meaning that |{\cal K}| \geq |{\cal C}|. Since, for any cipher, E_k injects the set of plaintexts into the set of ciphertext, we also have |{\cal C}| \geq |{\cal P}|, which gives the thesis |{\cal K}| \geq |{\cal P}|.

Exercise. Prove that the ciphertext defined as E_k(x_1, \ldots, x_d) = (y_1 + k, \ldots, y_d+k) \mod 26 is not perfect.

Solution 1. We notice that the cipher uses the same key to encrypt a whole tuple of d letters. Formally we have {\cal P} = {\cal C} = \mathbb{Z}_{26}^d and {\cal K} = \mathbb{Z}_{26}. We can directly apply the above theorem noticing that |{\cal P}| = 26^d > 26 = |{\cal K}|. Thus the cipher is not perfect.

Solution 2. There is an alternative, more direct, method to prove this fact, using a counterexample. We consider a possible plaintext GOOFY with d=5. Since the word can occur we have p_{\cal P}(GOOFY) > 0. Now it is enough to exhibit a ciphertext that cannot be obtained from GOOFY with the above cipher. Since it applies the very same shift to all the letters, it is enough to pick a word that cannot be obtained from GOOFY in that way. For example AAAAA or GOOFZ or ABCDE. For these ciphertexts we have p_{\cal P}(GOOFY|AAAAA) = p_{\cal P}(GOOFY|GOOFZ) = p_{\cal P} (GOOFY|ABCDE) = 0 \neq p_{\cal P}(GOOFY). Thus for such pairs of plaintext/ciphertext the property of perfect ciphers does not hold meaning that the cipher is not perfect.

We conclude by giving a characterizations of perfect ciphers in the case |{\cal P}| = |{\cal C}| = |{\cal K}|.

Theorem 2. Let |{\cal P}| = |{\cal C}| = |{\cal K}|. A cipher is perfect if and only if

  1. p_{\cal K}(k) = \frac{1}{|{\cal K}|} for all k \in {\cal K}
  2. for each x \in {\cal P} and y \in {\cal C} there exists exactly one key k such that E_k(x)=y

Intuitively, the theorem states that for a cipher to be perfect (under the hypothesis that the size of the set of plaintexts, ciphertexts and key is the same) keys should be picked at random for any encryption and each plaintext is mapped into each ciphertext through a unique key.

Proof: We prove that a perfect cipher implies the two above conditions. We leave the other side of the implication as an exercise. In Theorem 1 we have seen that, for perfect ciphers, if we fix x we obtain that for each y p_{\cal C}(y|x)=p_{\cal C}(y)>0 meaning that there exists at least one key k such that E_k(x) = y and all of these keys are different. Thus we have |{\cal K}| \geq |{\cal C}|. In this theorem we have assumed |{\cal K}| = |{\cal C}|, meaning that all of these keys k are unique (otherwise we would have |{\cal K}| > |{\cal C}|). Since this holds for each x and y we have proved condition 2. To prove condition 1, it is enough to notice that p_{\cal C}(y|x)=p_{\cal K}(k), i.e., the probability of y given x is equal to the probability of the unique key k that encrypts x into y. Thus, p_{\cal K}(k) = p_{\cal C}(y|x)=p_{\cal C}(y). If we fix y and we consider all possible plaintexts x we obtain all possible keys k and for all of them it holds p_{\cal K}(k) =p_{\cal C}(y), with p_{\cal C}(y) constant. Given that the sum of the probability of all keys must be 1, we obtain p_{\cal K}(k) = \frac{1}{|{\cal K}|} which proves condition 1.

The one-time-pad

We conclude giving a famous example of a perfect cipher that has been used in practice. This cipher has been used for the telegraph and is a binary variant of Vigenére with keys picked at random. More precisely we have {\cal P} = {\cal C}= {\cal K} = \mathbb{Z}_2^d with p_{\cal K}(k) = \frac{1}{|{\cal K}|} = \frac{1}{2^d} for all k \in {\cal K}. Encryption is defined as E_{(k_1, \ldots, k_d)}(x_1,\ldots,x_d) = (x_1 \oplus k_1, \ldots, x_d \oplus k_d) where \oplus is the bitwise xor operation. We notice that the premise of Theorem 2 holds (set sizes are the same). Also condition 1 holds, i.e., p_{\cal K}(k) = \frac{1}{|{\cal K}|}. We only need to prove condition 2. Let x \in {\cal P} and y \in {\cal C}. We have that the unique key giving y from x is computed as x \oplus y, i.e., (x_1 \oplus y_1, \ldots, x_d \oplus y_d). We thus conclude that the cipher is perfect.

Lesson learned

Shannon theory on perfect ciphers shows that such ideal ciphers exist but require as many keys as the possible plaintexts, and keys need to be picked at random for each encryption. Even if this makes such ciphers unpractical, the one-time-pad has been used for real transmission. The setup consisted of two identical books with thousands of  “random” keys. Each key was used once (from which the name one-time). Once the book had been used completely, new shared books were necessary.

Challenge: the four-time-pad (gives extra score)

It has been intercepted the following message encrypted under the mysterious four-time-pad cipher. We need skilled cryptanalysts to crack it. Would you help us, please?