All the ciphers we have studied so far use the same key K both for encryption and decryption. This implies that the source and the destination of the encrypted data have to share K. For this reason, this kind of ciphers are also known as symmetric-key ciphers. This aspect becomes problematic if we want cryptography to scale to big systems with many users willing to communicate securely. Unless we have a centralized service to handle keys (that we will discuss later), for N users this would require N(N-1)/2, i.e., O(N2), keys. For example, for a LAN with 1000 users we would have ~500000 keys. These keys should be pre-distributed to users in a secure way (e.g., offline). This is totally impractical and would never scale on a wide-area network such as the Internet.
The above argument has been one of the main motivation leading to the development of asymmetric-key cryptography. In 1976 Diffie and Hellman suggested the existence of this revolutionary ciphers defined as follows:
Definition
An asymmetric-key cipher is a quintuple with and and such that:
- It is computationally easy to generate a key-pair ;
- It is computationally easy to compute ;
- It is computationally easy to compute ;
- Decryption under of a plaintext encrypted under gives the initial plaintext . Formally, ;
- It is computationally infeasible to compute knowing and ;
- It is computationally infeasible to compute knowing and and without knowing ;
Intuitively, instead of one key we now have a key pair composed of a private and a public key, respectively. Encryption is performed under while decryption under . Thus, decryption key is now different from encryption key. Items 1,2 and 3 state that it should be practical to generate key pairs and to encrypt/decrypt data. Item 4 states that decrypting under the private key a plaintext encrypted under the public key gives the initial plaintext.
Items 5, 6 state that the cipher should be secure regardless of the secrecy of the public key . This is the main challenge behind this kind of cryptography: the public and the private key have to be related, otherwise decryption would never work, but, at the same time, computing the private key from the public key should be impossible in practice (computationally infeasible).
One-way trap-door functions
Asymmetric-key ciphers are strictly related to one-way trap-door functions.
Definition. An injective, invertible function is one-way, iff
- is easy to compute;
- is infeasible to compute.
Note that one-way does not refer to the fact the function does not admit an inverse. One-way functions are invertible but computing their inverse is too expensive to be feasible in practice.
Definition. An injective, invertible family of functions is one-way trap-door , iff given
- is easy to compute;
- is infeasible to compute without knowing the secret trap-door relative to .
Thus, intuitively, the trap-door is a hidden way to go back to the pre-image of the function: only knowing the trap-door we can compute the inverse of .
The Merkle-Hellman knapsack system
We present now an example cipher that has been broken, but still gives a very immediate idea of how asymmetric-key ciphers relate to one-way trap-door functions. The cipher is based on the following NP-complete problem:
The subset-sum problem
Let s1,…,sn, T be positive integers. si are sizes while T is the target. A solution to the subset-sum problem is a subset of (s1,…,sn) whose sum is exactly the target T. Formally, the solution is a binary tuple (x1,…,xn) such that xisi = T.
For example, if sizes are (4,6,3,8,1) and T=11, we have that (0,0,1,1,0) and (1,1,0,0,1) are solutions, since 3+8 = 11 and 4+6+1 = 11.
This problem is NP-complete in general. As a consequence, we can easily obtain a one-way function from it. Notice, in fact, that NP problems have a non-deterministic polynomial solution meaning that checking if a solution is correct can be done in polynomial time. If we define ƒ(x1,…,xn) = xisi, we have that ƒ is clearly easy to compute but inverting this function amounts to finding (x1,…,xn) from a target T which we know to be infeasible for a big n.
How can we now introduce a secret trap-door to allow us inverting the function? The trick is to start from a specific instance of the problem that is easy to solve.
An easy-to-compute instance
We consider special sizes that are super-increasing, i.e., such that , for each i>1. Intuitively, any si is bigger than the sum of all the previous sj. For example (1,3,5,10) is super-increasing while (1,3,5,9) is not. In this special case there is a very efficient algorithm to solve the subset-sum problem. The idea is to start from the biggest element sn and go back to the first one: if si fits into T we pick it (we set xi=1) and we subtract si from T. Python example code follows:
def subsetSum(S,T): # assumes S is a super-increasing list of integers
x = []
S.reverse() # reverse list to start from the biggest
for s in S: # iterates on all s_i (from the biggest)
if s <= T:
x.append(1) # takes the element
T = T-s # subtracts it from T
else:
x.append(0) # does not take the element
if T==0: # solution found
x.reverse()
return x # returns the reversed tuple
else:
return [] # no solution found
For example
>>> subsetSum([1,3,5,10],11)
[1, 0, 0, 1]
>>> subsetSum([1,3,5,10],12)
[]
>>> subsetSum([1,3,5,10],13)
[0, 1, 0, 1]
>>> subsetSum([1,3,5,10],14)
[1, 1, 0, 1]
>>>
The cipher
We now proceed as follows
- We start from a super-increasing problem ;
- We choose a prime ;
- We choose a random such that ;
- We transform the initial super-increasing problem into , with . Notice that this problem is not super-increasing in general;
The trap-door is composed of the initial problem and values and , that are kept secret. Intuitively, encryption is done by computing the target for the problem . Since this is not super-increasing, finding the initial plaintext is infeasible (for a big n). However, knowing the secret trap-door we can derive from the ciphertext a target for the initial easy super-increasing problem, and then solve it using the efficient algorithm above. More precisely, encryption and decryption are defined as follows:
- is the solution of the super-increasing problem with target
Notice that is guaranteed to exist by the fact is a prime number. We will prove this in the next lesson. The correctness of the above cipher can be proved as follows:
Thus
meaning that the transformed target is, in fact, the target for the initial, easy problem.
A small example. Let (1,2,5,12) be a super-increasing problem. We let p=23 and a=6. We have that a-1 is 4, since 6•4 mod 23 = 1. If we compute si • a mod p we obtain (6,12,7,3) which is not super-increasing. Consider now the plaintext (1,0,0,1). We have that EPK(1,0,0,1) = 6 + 3 = 9. Now to decrypt we have to compute a-1 • 9 mod p = 4 • 9 mod 23 = 13. We finally solve (1,2,5,12) with target 13:
>>> subsetSum([1,2,5,12],13)
[1, 0, 0, 1]
>>>
which gives the initial plaintext (1,0,0,1).