All the ciphers we have studied so far use the same key K both for encryption and decryption. This implies that the source and the destination of the encrypted data have to share K. For this reason, this kind of ciphers are also known as symmetric-key ciphers. This aspect becomes problematic if we want cryptography to scale to big systems with many users willing to communicate securely. Unless we have a centralized service to handle keys (that we will discuss later), for N users this would require N(N-1)/2, i.e., O(N^{2}), keys. For example, for a LAN with 1000 users we would have ~500000 keys. These keys should be pre-distributed to users in a secure way (e.g., offline). This is totally impractical and would never scale on a wide-area network such as the Internet.

The above argument has been one of the main motivation leading to the development of **asymmetric-key cryptography**. In 1976 Diffie and Hellman suggested the existence of this revolutionary ciphers defined as follows:

#### Definition

An **asymmetric-key cipher** is a quintuple with and and such that:

- It is computationally easy to generate a key-pair ;
- It is computationally easy to compute ;
- It is computationally easy to compute ;
- Decryption under of a plaintext encrypted under gives the initial plaintext . Formally, ;
- It is computationally infeasible to compute knowing and ;
- It is computationally infeasible to compute knowing and and without knowing ;

Intuitively, instead of one key we now have a key pair composed of a **private** and a **public** key, respectively. Encryption is performed under while decryption under . Thus, decryption key is now different from encryption key. Items 1,2 and 3 state that it should be practical to generate key pairs and to encrypt/decrypt data. Item 4 states that decrypting under the private key a plaintext encrypted under the public key gives the initial plaintext.

Items 5, 6 state that the cipher should be secure regardless of the secrecy of the public key . This is the main challenge behind this kind of cryptography: the public and the private key have to be related, otherwise decryption would never work, but, at the same time, computing the private key from the public key should be impossible in practice (computationally infeasible).

#### One-way trap-door functions

Asymmetric-key ciphers are strictly related to *one-way trap-door functions*.

**Definition.** An injective, invertible function is **one-way**, iff

- is easy to compute;
- is infeasible to compute.

Note that one-way does not refer to the fact the function does not admit an inverse. One-way functions are invertible but computing their inverse is too expensive to be feasible in practice.

**Definition.** An injective, invertible family of functions is **one-way trap-door** , iff given

- is easy to compute;
- is infeasible to compute without knowing the secret
*trap-door*relative to .

Thus, intuitively, the trap-door is a hidden way to go back to the pre-image of the function: only knowing the trap-door we can compute the inverse of .

### The Merkle-Hellman knapsack system

We present now an example cipher that has been broken, but still gives a very immediate idea of how asymmetric-key ciphers relate to one-way trap-door functions. The cipher is based on the following NP-complete problem:

#### The subset-sum problem

Let s_{1},…,s_{n}, T be positive integers. s_{i} are *sizes* while T is the *target*. A solution to the subset-sum problem is a subset of (s_{1},…,s_{n}) whose sum is exactly the target T. Formally, the solution is a binary tuple (x_{1},…,x_{n}) such that x_{i}s_{i} = T.

For example, if sizes are (4,6,3,8,1) and T=11, we have that (0,0,1,1,0) and (1,1,0,0,1) are solutions, since 3+8 = 11 and 4+6+1 = 11.

This problem is NP-complete in general. As a consequence, we can easily obtain a **one-way function** from it. Notice, in fact, that NP problems have a non-deterministic polynomial solution meaning that checking if a solution is correct can be done in polynomial time. If we define ƒ(x_{1},…,x_{n}) = x_{i}s_{i}, we have that ƒ is clearly easy to compute but inverting this function amounts to finding (x_{1},…,x_{n}) from a target T which we know to be infeasible for a big n.

How can we now introduce a secret trap-door to allow us inverting the function? The trick is to start from a specific instance of the problem that is easy to solve.

#### An easy-to-compute instance

We consider special sizes that are *super-increasing*, i.e., such that , for each i>1. Intuitively, any s_{i} is bigger than the sum of all the previous s_{j}. For example (1,3,5,10) is super-increasing while (1,3,5,9) is not. In this special case there is a very efficient algorithm to solve the subset-sum problem. The idea is to start from the biggest element s_{n} and go back to the first one: if s_{i} fits into T we pick it (we set x_{i}=1) and we subtract s_{i} from T. Python example code follows:

```
def subsetSum(S,T): # assumes S is a super-increasing list of integers
x = []
S.reverse() # reverse list to start from the biggest
for s in S: # iterates on all s_i (from the biggest)
if s <= T:
x.append(1) # takes the element
T = T-s # subtracts it from T
else:
x.append(0) # does not take the element
if T==0: # solution found
x.reverse()
return x # returns the reversed tuple
else:
return [] # no solution found
```

For example

```
>>> subsetSum([1,3,5,10],11)
[1, 0, 0, 1]
>>> subsetSum([1,3,5,10],12)
[]
>>> subsetSum([1,3,5,10],13)
[0, 1, 0, 1]
>>> subsetSum([1,3,5,10],14)
[1, 1, 0, 1]
>>>
```

#### The cipher

We now proceed as follows

- We start from a super-increasing problem ;
- We choose a prime ;
- We choose a random such that ;
- We transform the initial super-increasing problem into , with . Notice that this problem is not super-increasing in general;

The trap-door is composed of the initial problem and values and , that are kept secret. Intuitively, encryption is done by computing the target for the problem . Since this is not super-increasing, finding the initial plaintext is infeasible (for a big n). However, knowing the secret trap-door we can derive from the ciphertext a target for the initial easy super-increasing problem, and then solve it using the efficient algorithm above. More precisely, encryption and decryption are defined as follows:

- is the solution of the super-increasing problem with target

Notice that is guaranteed to exist by the fact is a prime number. We will prove this in the next lesson. The correctness of the above cipher can be proved as follows:

Thus

meaning that the transformed target is, in fact, the target for the initial, easy problem.

**A small example.** Let (1,2,5,12) be a super-increasing problem. We let p=23 and a=6. We have that a^{-1} is 4, since 6•4 mod 23 = 1. If we compute s_{i} • a mod p we obtain (6,12,7,3) which is not super-increasing. Consider now the plaintext (1,0,0,1). We have that E_{PK}(1,0,0,1) = 6 + 3 = 9. Now to decrypt we have to compute a^{-1} • 9 mod p = 4 • 9 mod 23 = 13. We finally solve (1,2,5,12) with target 13:

```
>>> subsetSum([1,2,5,12],13)
[1, 0, 0, 1]
>>>
```

which gives the initial plaintext (1,0,0,1).