Asymmetric-key ciphers

All the ciphers we have studied so far use the same key K both for encryption and decryption. This implies that the source and the destination of the encrypted data have to share K. For this reason, this kind of ciphers are also known as symmetric-key ciphers. This aspect becomes problematic if we want cryptography to scale to big systems with many users willing to communicate securely. Unless we have a centralized service to handle keys (that we will discuss later), for N users this would require N(N-1)/2, i.e., O(N2), keys. For example, for a LAN with 1000 users we would have ~500000 keys. These keys should be pre-distributed to users in a secure way (e.g., offline). This is totally impractical and would never scale on a wide-area network such as the Internet.

The above argument has been one of the main motivation leading to the development of asymmetric-key cryptography. In 1976 Diffie and Hellman suggested the existence of this revolutionary ciphers defined as follows:

Definition

An asymmetric-key cipher is a quintuple ({\cal P},{\cal C},{\cal K}_S\times{\cal K}_P,E,D) with E : {\cal K}_P \times {\cal P} \rightarrow {\cal C} and D : {\cal K}_S \times {\cal C} \rightarrow {\cal P} and such that:

  1. It is computationally easy to generate a key-pair (SK, PK) \in{\cal K}_S\times{\cal K}_P;
  2. It is computationally easy to compute y = E_{PK}(x) ;
  3. It is computationally easy to compute x = D_{SK}(y) ;
  4. Decryption under SK of a plaintext x encrypted under PK gives the initial plaintext x. Formally, D_{SK}(E_{PK}(x)) = x;
  5. It is computationally infeasible to compute SK knowing PK and y;
  6. It is computationally infeasible to compute D_{SK}(y) knowing PK and y and without knowing SK;

Intuitively, instead of one key we now have a key pair (SK,PK) composed of a private and a public key, respectively. Encryption is performed under PK while decryption under SK. Thus, decryption key is now different from encryption key. Items 1,2 and 3 state that it should be practical to generate key pairs and to encrypt/decrypt data. Item 4 states that decrypting under the private key a plaintext encrypted under the public key gives the initial plaintext.
Items 5, 6 state that the cipher should be secure regardless of the secrecy of the public key PK.  This is the main challenge behind this kind of cryptography: the public and the private key have to be related, otherwise decryption would never work, but, at the same time, computing the private key from the public key should be impossible in practice (computationally infeasible).

One-way trap-door functions

Asymmetric-key ciphers are strictly related to one-way trap-door functions.

Definition. An injective, invertible function f is one-way, iff

  1. y = f(x) is easy to compute;
  2. x = f^{-1}(y) is infeasible to compute.

Note that one-way does not refer to the fact the function does not admit an inverse. One-way functions are invertible but computing their inverse is too expensive to be feasible in practice.

Definition. An injective, invertible family of functions f_K is one-way trap-door , iff given K

  1. y = f_K(x) is easy to compute;
  2. x = f^{-1}_K(y) is infeasible to compute without knowing the secret trap-door S(K) relative to K.

Thus, intuitively, the trap-door is a hidden way to go back to the pre-image x of the function: only knowing the trap-door we can compute the inverse of f_K.

The Merkle-Hellman knapsack system

We present now an example cipher that has been broken, but still gives a very immediate idea of how asymmetric-key ciphers relate to one-way trap-door functions. The cipher is based on the following NP-complete problem:

The subset-sum problem

Let s1,…,sn, T be positive integers. si are sizes while T is the target. A solution to the subset-sum problem is a subset of (s1,…,sn) whose sum is exactly the target T. Formally, the solution is a binary tuple (x1,…,xn) such that \sf\sum_{i=1}^{n}{} xisi = T.
For example, if sizes are (4,6,3,8,1) and T=11, we have that (0,0,1,1,0) and (1,1,0,0,1) are solutions, since 3+8 = 11 and 4+6+1 = 11.

This problem is NP-complete in general. As a consequence, we can easily obtain a one-way function from it. Notice, in fact, that NP problems have a non-deterministic polynomial solution meaning that checking if a solution is correct can be done in polynomial time. If we define ƒ(x1,…,xn) = \sf\sum_{i=1}^{n}{} xisi, we have that ƒ is clearly easy to compute but inverting this function amounts to finding (x1,…,xn) from a target T which we know to be infeasible for a big n.

How can we now introduce a secret trap-door to allow us inverting the function? The trick is to start from a specific instance of the problem that is easy to solve.

An easy-to-compute instance

We consider special sizes that are super-increasing, i.e., such that \sf s_i > \sum_{j=1}^{i-1} s_j, for each i>1. Intuitively, any si is bigger than the sum of all the previous sj. For example (1,3,5,10) is super-increasing while (1,3,5,9) is not. In this special case there is a very efficient algorithm to solve the subset-sum problem. The idea is to start from the biggest element sn and go back to the first one: if si fits into T we pick it (we set xi=1) and we subtract si from T. Python example code follows:

def subsetSum(S,T):  # assumes S is a super-increasing list of integers
    x = []
    S.reverse()           # reverse list to start from the biggest
    for s in S:           # iterates on all s_i (from the biggest)
        if s <= T:
            x.append(1)   # takes the element
            T = T-s       # subtracts it from T
        else:
            x.append(0)   # does not take the element
    if T==0:              # solution found
        x.reverse()
        return x          # returns the reversed tuple
    else:
        return []         # no solution found

For example

>>> subsetSum([1,3,5,10],11)
[1, 0, 0, 1]
>>> subsetSum([1,3,5,10],12)
[]
>>> subsetSum([1,3,5,10],13)
[0, 1, 0, 1]
>>> subsetSum([1,3,5,10],14)
[1, 1, 0, 1]
>>> 

The cipher

We now proceed as follows

  1. We start from a super-increasing problem (s_1,\ldots,s_n);
  2. We choose a prime p > \sum_{i=1}^{n}{s_i};
  3. We choose a random a such that 1 < a < p;
  4. We transform the initial super-increasing problem into (\hat{s_1}, \ldots, \hat{s_n}), with \hat{s_i}=a s_i \mod p. Notice that this problem is not super-increasing in general;

The trap-door is composed of the initial problem and values p and a, that are kept secret. Intuitively, encryption is done by computing the target for the problem (\hat{s_1}, \ldots, \hat{s_n}). Since this is not super-increasing, finding the initial plaintext x_1, \ldots, x_n is infeasible (for a big n). However, knowing the secret trap-door we can derive from the ciphertext a target for the initial easy super-increasing problem, and then solve it using the efficient algorithm above. More precisely, encryption and decryption are defined as follows:

  • E_{PK}(x_1, \ldots, x_n) = \sum_{i=1}^{n}{x_i \hat{s_i}}
  • D_{SK}(y) is the solution of the super-increasing problem (s_1,\ldots,s_n) with target a^{-1} y \mod p

Notice that a^{-1} \mod p is guaranteed to exist by the fact p is a prime number. We will prove this in the next lesson. The correctness of the above cipher can be proved as follows:

\begin{array}{rcl}E_{PK}(x_1, \ldots, x_n) &=& \sum_{i=1}^{n}{x_i \hat{s_i}}\\ & = & \sum_{i=1}^{n}{a x_i  s_i \mod p}\\& = & a\sum_{i=1}^{n}{x_i  s_i \mod p}\end{array}

Thus

\begin{array}{rcl}a^{-1}E_{PK}(x_1, \ldots, x_n) \mod p &=& a^{-1} a\sum_{i=1}^{n}{x_i  s_i \mod p}\\ &=& \sum_{i=1}^{n}{x_i  s_i \mod p}\\ &=& \sum_{i=1}^{n}{x_i  s_i}\\\end{array}

meaning that the transformed target a^{-1} y \mod p is, in fact, the target for the initial, easy problem.

A small example. Let (1,2,5,12) be a super-increasing problem. We let p=23 and a=6. We have that a-1 is 4, since 6•4 mod 23 = 1. If we compute si • a mod p we obtain (6,12,7,3) which is not super-increasing. Consider now the plaintext (1,0,0,1). We have that EPK(1,0,0,1) = 6 + 3 = 9. Now to decrypt we have to compute a-1 • 9 mod p = 4 • 9 mod 23 = 13. We finally solve (1,2,5,12) with target 13:

>>> subsetSum([1,2,5,12],13)
[1, 0, 0, 1]
>>>

which gives the initial plaintext (1,0,0,1).