Data Protection - Master in Cyber Security (UPC)
Jorge L. Villar, 2025
In symmetric key cryptography every pair of communicating entities must share an independent secret key.
In a private communication network:
n users ⇒ n(n-1)/2 independent keys (each user securely stores n-1 independent keys)
Public Key Cryptography: Every user generates a key pair (pk,sk), publishes pk and keeps sk secret.
A single secret key per user for all communications...
...but the public keys must be reliably distributed (see below).
In a public key encryption scheme there are three algorithms:
A Key Generation algorithm, KeyGen, that, given a security parameter λ (a positive integer), produces a random key pair (pk,sk) of a suitable size.
An Encryption algorithm, Enc, that, given a public key pk and a message m, produces a ciphertext c, possibly using some randomness.
A Decryption algorithm, Dec, that, given a secret key sk and a ciphertext c, outputs a message m.
The public key typically contains some auxiliary information such as the message space Mpk (or simply M) and the ciphertext space Cpk (or simply C), corresponding to the particular choice of λ and pk. The set of all possible key pairs produced by KeyGen for a particular value of λ is denoted as Kλ (or simply K).
Some of the algorithms are allowed to fail (e.g., outputting a special rejection symbol ⊥) if they receive improper inputs (like badly formed keys, messages or ciphertexts).
Correctness: ∀(pk,sk)∈Kλ, ∀m∈Mpk, Dec(sk,Enc(pk,m))=m
When a user A wants to start receiving confidential information from other users, she produces a key pair (pk,sk) with KeyGen. Then she publishes pk and keeps sk secret as her long term key.
Now a user B can confidentially send a message m∈Mpk by computing a ciphertext c=Enc(pk,m) and sending it to A by an insecure channel.
On reception, A will recover the message by running m=Dec(sk,c).
If the public keys are not reliably transmitted to the sender, an impersonation attack can efficiently break the security of a public key encryption scheme:
An attacker C can generate its own keypair (pk',sk') and replace A's public key by pk'.
Then, B sees pk' instead of pk as the public key of A, and she computes the ciphertext c'=Enc(pk',m) instead of c.
C receives c' and decrypts it recovering B's message m=Dec(sk',c').
If necessary, C can replace m by a different message m' and send to A the ciphertext c''=Enc(pk,m').
On reception of c'', A will recover m'=Dec(sk,c'') instead of B's original message.
There are essentially two ways to prevent public key replacement attacks:
A trusted Certification Authority provides a way to verify that a given public key corresponds to a given identity. Then, any user B will check the validity of a public key before encrypting any message.
A trusted Key Generation Center is able to compute key pairs (pk,sk) where the public key is an unambiguous representation of the identity of a user.
Systems relying on the latter approach are called Identity Based Encryption schemes. The most widely used approach is the former, in what is called a Public Key Infrastructure.
Known public key encryption schemes are far less efficient than the practical symmetric key encryption schemes. A hybrid solution can benefit from the good properties of both types of schemes.
In a hybrid encryption scheme, a sender B uses a public key encryption scheme to send a random session (short term) key k, that will be used to encrypt the message with an efficient symmetric key encryption scheme. Namely:
B generates a random k and encrypts it with the public key pk of the recipient A:
c0=Enc(pk,k)
B encrypts the message with the symmetric key encryption scheme using the session key k:
c1=Ek(m)
B sends the ciphertext (c0,c1) to A.
A parses the received ciphertext as (c0,c1).
A decrypts the session key with its secret key sk:
k=Dec(sk,c0)
A recovers the message m by using the symmetric key decryption algorithm:
m=Dk(c1)
Hybrid encryption schemes automatically support encryption of messages of arbitrary length, thanks to the mode of operation of the underlying symmetric key encryption scheme.
Therefore, the public key encryption scheme only needs to be able to encrypt short messages (since a symmetric key is only a few hundred bits long).
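The steps above can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: the public key scheme is textbook RSA with tiny fixed primes (RSA is described later in these notes), and the symmetric scheme is a SHA-256-derived XOR stream; real systems use vetted primitives at proper sizes.

```python
import hashlib
import secrets

# Toy public key scheme: textbook RSA with tiny fixed primes (illustration only).
p, q = 1009, 1013
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))          # secret exponent (Python 3.8+)

def pke_enc(pk, k):                        # c0 = Enc(pk, k)
    n, e = pk
    return pow(k, e, n)

def pke_dec(sk, c0):                       # k = Dec(sk, c0)
    n, d = sk
    return pow(c0, d, n)

def sym(k, data):                          # toy symmetric cipher: XOR with a
    pad = hashlib.sha256(str(k).encode()).digest()   # SHA-256-derived keystream
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))

def hybrid_enc(pk, m):
    k = secrets.randbelow(pk[0])           # random session key k
    return pke_enc(pk, k), sym(k, m)       # (c0, c1)

def hybrid_dec(sk, c):
    c0, c1 = c
    return sym(pke_dec(sk, c0), c1)        # recover k, then the message

assert hybrid_dec((n, d), hybrid_enc((n, e), b"hello")) == b"hello"
```

Note how only the short session key goes through the (expensive) public key operation; the message itself only touches the symmetric cipher.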
ElGamal encryption scheme can be implemented on any cyclic finite group with an efficiently computable group operation, provided that some problems related to the discrete logarithm problem in the group are expected to be hard. The simplest example is using a large enough cyclic subgroup of the multiplicative group of a finite field.
The encryption scheme is an adaptation of a previously defined 2-party key agreement protocol.
In a 2-party key agreement protocol, two entities A and B interact, sequentially sending some messages, in such a way that at the end of the conversation both parties can derive a common key k that is conjectured to be unknown to an attacker eavesdropping on the whole conversation.
This agreed secret key k can be used as the key in a symmetric encryption scheme, or in other symmetric key protocols, like a message authentication code.
In the Diffie-Hellman 2-party key agreement protocol, a finite cyclic group G of prime order q and a generator g are previously selected in a setup phase (e.g., they can be fixed in advance by a standardization body and shared among different pairs of users). Then, G and g are public parameters of the scheme.
Every party selects a nonzero random secret value x∈Zq× and computes the group element y=g^x.
The two parties, A and B, exchange the computed group elements yA=g^xA and yB=g^xB.
Each party locally computes the shared secret key kAB=g^(xA·xB). Namely, kAB=yA^xB=yB^xA.
It should be hard to compute the group element g^(xA·xB) only from G, g, g^xA and g^xB, for random xA,xB∈Zq×.
Notice that in this version of the protocol, there are no long term keys. That is, to generate a second common key, an independent protocol execution must be performed. In addition, the protocol described above is susceptible to man-in-the-middle attacks.
However, there are ways to transform the basic scheme into a more sophisticated one that uses long term secret values to generate several independent common keys. Then, the long term secret values can be certified, preventing man-in-the-middle attacks.
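A single run of the basic protocol can be sketched as follows; the group (p = 2039 = 2·1019+1, with g = 4 generating the subgroup of order q = 1019) is a toy choice for illustration only, and real parameters use groups of at least 2048 bits.

```python
import secrets

# Public parameters: a tiny prime p = 2q+1 and a generator g of the
# order-q subgroup of Zp* (toy sizes, no real security).
p, q, g = 2039, 1019, 4

xA = secrets.randbelow(q - 1) + 1       # A's secret exponent, in Zq*
xB = secrets.randbelow(q - 1) + 1       # B's secret exponent
yA = pow(g, xA, p)                      # exchanged over the public channel
yB = pow(g, xB, p)

kA = pow(yB, xA, p)                     # A computes yB^xA = g^(xB*xA)
kB = pow(yA, xB, p)                     # B computes yA^xB = g^(xA*xB)
assert kA == kB                         # both derive the same key g^(xA*xB)
```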
ElGamal encryption transforms Diffie-Hellman key agreement into a public key encryption scheme by using the public/secret information of the recipient as a long term key pair and the public/secret information of the sender as a one time key pair. The agreed secret key is then used as a one-time pad to encrypt the message.
The resulting encryption algorithm is probabilistic.
KeyGen(λ):
Select a cyclic group G of order q, where q is a λ-bit long prime, and a group generator g.
Choose a random x∈Zq× and compute the group element y=g^x.
Output (pk,sk)=((param,y),x), where param=(G,g,q).
Enc(pk,m):
Parse pk=(param,y).
Choose a random r∈Zq×.
Output c=(c0,c1)=(g^r, y^r·m).
Dec(pk,sk,c):
Parse pk=(param,y).
Parse sk=x.
Parse c=(c0,c1).
Output m=c0^(-x)·c1.
param is necessary in algorithm Dec, because it uses the group operation and the generator.
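A minimal sketch of the three algorithms, using a toy subgroup of Zp* (assumed parameters, far too small for any real security):

```python
import secrets

# Toy ElGamal in the order-q subgroup of Zp*, with p = 2q+1 and g = 4
# (tiny parameters, illustration only).
p, q, g = 2039, 1019, 4

def keygen():
    x = secrets.randbelow(q - 1) + 1         # x in Zq*
    return pow(g, x, p), x                   # pk = y = g^x, sk = x

def enc(y, m):                               # m must be a subgroup element
    r = secrets.randbelow(q - 1) + 1
    return pow(g, r, p), (pow(y, r, p) * m) % p   # (c0, c1) = (g^r, y^r*m)

def dec(x, c):
    c0, c1 = c
    return (pow(c0, p - 1 - x, p) * c1) % p  # c0^(-x) * c1, using c0^(p-1) = 1

y, x = keygen()
m = pow(g, 123, p)                           # message encoded as a group element
assert dec(x, enc(y, m)) == m
```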
The security of ElGamal encryption is related to the amount of information an attacker can get about the group element g^(rx) from the group elements g, g^x and g^r.
In the original proposal, G is a prime order subgroup of the multiplicative group Zp×, where p is a prime number such that q divides p-1. For efficiency reasons the cofactor s in p=sq+1 must be small, but for security reasons p must be large (due to known subexponential attacks against the discrete logarithm problem).
The scheme is directly generalizable to non-prime finite fields.
A serious limitation of the basic ElGamal scheme is that the message space is a multiplicative subgroup of a finite field. Thus, a proper encoding function from binary strings to group elements is needed.
An alternative way to solve the problem is converting the agreed secret key g^(rx) into a binary string by using a hash function H : {0,1}* → {0,1}^λ. This trick changes the message space to the binary strings of length λ.
KeyGen(λ):
Select a cyclic group G of order q, where q is a λ-bit long prime, and a group generator g.
Select a hash function H : {0,1}* → {0,1}^λ.
Choose a random x∈Zq× and compute the group element y=g^x.
Output (pk,sk)=((param,y),x), where param=(G,g,q,H).
Enc(pk,m):
Parse pk=(param,y).
Choose a random r∈Zq×.
Output c=(c0,c1)=(g^r, H(y^r)⊕m).
Dec(pk,sk,c):
Parse pk=(param,y).
Parse sk=x.
Parse c=(c0,c1).
Output m=H(c0^x)⊕c1.
The security of Hashed ElGamal encryption is related to the amount of information an attacker can get about the value H(g^(rx)) from the group elements g, g^x and g^r.
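The hashed variant can be sketched the same way, with SHA-256 standing in for H (an assumption; any suitable hash works), so the message space becomes 32-byte strings:

```python
import hashlib
import secrets

# Toy hashed ElGamal in the same tiny subgroup as before (illustration only).
p, q, g = 2039, 1019, 4

def H(z):                                    # hash a group element to 32 bytes
    return hashlib.sha256(str(z).encode()).digest()

def keygen():
    x = secrets.randbelow(q - 1) + 1
    return pow(g, x, p), x                   # pk = y = g^x, sk = x

def enc(y, m):                               # m is a 32-byte string
    r = secrets.randbelow(q - 1) + 1
    pad = H(pow(y, r, p))                    # H(y^r)
    return pow(g, r, p), bytes(a ^ b for a, b in zip(pad, m))

def dec(x, c):
    c0, c1 = c
    pad = H(pow(c0, x, p))                   # H(c0^x) = H(y^r)
    return bytes(a ^ b for a, b in zip(pad, c1))

y, x = keygen()
m = secrets.token_bytes(32)
assert dec(x, enc(y, m)) == m
```

The encoding problem disappears: the message no longer needs to be a group element.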
ElGamal encryption can be defined on any finite cyclic group such that some problems related to the discrete logarithm problem are considered hard.
A prime order subgroup of the group of points of an elliptic curve is a natural choice to implement ElGamal encryption, since the group operation can be efficiently implemented, and there is no known specific efficient algorithm solving the corresponding discrete logarithm problem on a random elliptic curve.
As before, an encoding function from binary strings to points in the subgroup is necessary. Alternatively, a hashed version of the encryption scheme can be used.
Since additive notation is commonly used for elliptic curves, ElGamal encryption is described in a slightly different way:
param=(G,P,q), where P is a point generating the cyclic subgroup G of order q.
sk=x∈Zq× as before.
pk=(param,Y), where Y=xP.
Enc(pk,m)=(rP,rY+M), where M is now the message encoded as a point in G, and r∈Zq×.
Dec(pk,sk,C)=-x·C0+C1, where C=(C0,C1).
The first proper public key encryption scheme was introduced by Rivest, Shamir and Adleman in 1977. Its security relies on the difficulty of factoring large integers, and it uses two modular exponentiations whose composition is the identity map.
The RSA modulus is the product of two distinct random secret primes p, q of the same binary length, n=pq. It is well known that for every integer x coprime with n, x^φ(n)≡1 (mod n), where φ(n)=(p-1)(q-1) is Euler's totient function.
Therefore, if e, d are two integers such that ed≡1 (mod φ(n)) and x is coprime with n, then x^(ed)≡x (mod n).
Notice that there exist efficient algorithms to compute modular exponentiations and modular inversions, and also to sample random prime numbers. Then, the above facts are the basis of the following public key encryption scheme:
In the following algorithms, residues modulo n are represented as integers between 0 and n-1, and similarly for residues modulo φ(n).
KeyGen(λ):
Select two different random prime numbers p, q of λ bits, and compute n=pq.
Choose a convenient public exponent e coprime with φ(n)=(p-1)(q-1) (typically e is a quite small prime, like 2^4+1 or 2^16+1), and compute a secret exponent d as the modular inverse of e modulo φ(n).
Output (pk,sk)=((n,e),(p,q,d)).
Enc(pk,m):
Parse pk=(n,e).
Output c=m^e mod n.
Dec(pk,sk,c):
Parse pk=(n,e).
Parse sk=(p,q,d).
Output m=c^d mod n.
Actually, the decryption algorithm only requires the key components n and d, but in some optimized implementations, p and q are also used.
In most implementations of RSA, the public exponent e is fixed for all users, and the primes p, q are selected so that p-1 and q-1 are coprime with e (otherwise, they are resampled). Indeed, the complexity of the encryption algorithm (modular exponentiation) depends on the binary representation of the public exponent.
RSA encryption is deterministic: for a given public key, all encryptions of the same message are equal. This lowers its security in the same way as the ECB mode of operation of a block cipher. Therefore, in most practical applications a deterministic public key encryption is combined with a randomized map (like a random padding or a block chaining mode of operation with a random initialization value).
However, the problem disappears if the public key deterministic encryption is only used to send encrypted random keys (as in a hybrid encryption scheme), because the probability of encrypting the same message twice is negligible.
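The three RSA algorithms, with toy primes and without the padding that any real deployment needs:

```python
import secrets

# Textbook RSA with tiny fixed primes (illustration only; real keys use
# primes of >= 1024 bits and a randomized padding such as OAEP).
p, q = 1009, 1013
n = p * q
e = 17                                  # public exponent, coprime with phi(n)
d = pow(e, -1, (p - 1) * (q - 1))       # secret exponent (Python 3.8+)

def enc(m):
    return pow(m, e, n)                 # c = m^e mod n

def dec(c):
    return pow(c, d, n)                 # m = c^d mod n

m = secrets.randbelow(n)
assert dec(enc(m)) == m                 # x^(ed) = x holds for every x in Zn
```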
Rabin public key encryption is based on the squaring function modulo n=pq, for secret primes p, q.
KeyGen(λ):
Select two different random prime numbers p, q of λ bits such that p,q≡3 (mod 4), and compute n=pq.
Output (pk,sk)=(n,(p,q)).
Enc(pk,m):
Parse pk=(n).
Output c=m^2 mod n.
Dec(pk,sk,c):
Parse pk=(n).
Parse sk=(p,q).
Compute mp = c^((p+1)/4) mod p, and mq = c^((q+1)/4) mod q.
Use the Chinese Remainder Theorem to compute the unique residue m∈Zn such that m≡mp (mod p) and m≡mq (mod q).
Output m.
There are exactly 4 different square roots of a valid ciphertext (i.e., c=m^2 mod n), provided that the ciphertext is coprime with n. Namely, the 4 roots are obtained by applying the Chinese Remainder Theorem to the four pairs (mp,mq), (mp,-mq), (-mp,mq), (-mp,-mq). Therefore, some redundancy must be added to m in order to recognize which of the four square roots is the correct decryption.
Any (probabilistic) algorithm computing a square root of c modulo n=pq, for a random modulus n and a random quadratic residue c, with some probability ε can be converted into another algorithm computing the prime factors of n with probability at least ε/2. Thus, decrypting Rabin cryptosystem without the secret key is not easier than directly factoring the modulus.
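A sketch of Rabin decryption via the two modular square roots and the CRT, with toy Blum primes:

```python
import secrets

# Toy Rabin encryption with primes p, q = 3 (mod 4) (tiny values,
# illustration only).
p, q = 1019, 1031                 # both congruent to 3 mod 4
n = p * q

def enc(m):
    return (m * m) % n            # c = m^2 mod n

def dec(c):
    mp = pow(c, (p + 1) // 4, p)  # a square root of c mod p
    mq = pow(c, (q + 1) // 4, q)  # a square root of c mod q
    roots = set()
    for sp in (mp, (p - mp) % p):
        for sq in (mq, (q - mq) % q):
            # CRT: the unique residue mod n matching sp mod p and sq mod q
            r = (sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % n
            roots.add(r)
    return roots                  # 4 candidates; redundancy in m selects one

m = secrets.randbelow(n - 2) + 2
assert m in dec(enc(m))
```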
Paillier encryption scheme uses arithmetic modulo n^2, where n=pq is an RSA modulus.
KeyGen(λ):
Select two different random prime numbers p, q of λ bits, and compute n = pq.
Choose g∈Z(n^2)^× such that g^n ≡ 1 (mod n^2), but g^x ≢ 1 (mod n^2) for any smaller positive x (i.e., g has order n modulo n^2).
Output (pk,sk) = ((n^2,g),(p,q)).
Enc(pk,m):
Parse pk = (n^2,g).
Choose a random r∈Zn×.
Output c = r^n·g^m mod n^2.
Dec(pk,sk,c):
Parse pk = (n^2,g).
Parse sk = (p,q).
Compute s = c^φ(n) mod n^2.
Compute s1 = (s-1)/n.
Compute t = g^φ(n) mod n^2.
Compute t1 = (t-1)/n.
Output m = s1·t1^(-1) mod n.
Fixing g = n+1 makes both encryption and decryption more efficient. Indeed, (n+1)^m ≡ 1+mn (mod n^2), and in decryption t1 = φ(n) mod n = n-p-q+1.
Observe that an attacker capable of inverting the RSA(n,n) encryption function (i.e., taking e=n) would be able to decrypt Paillier ciphertexts.
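A sketch of Paillier with the g = n+1 shortcut and toy primes:

```python
import secrets

# Toy Paillier with g = n+1 and tiny primes (illustration only; real keys
# use primes of >= 1024 bits).
p, q = 1009, 1013
n = p * q
n2 = n * n
phi = (p - 1) * (q - 1)                 # gcd(phi, n) = 1 for these primes

def enc(m):                             # c = r^n * g^m mod n^2, with g = n+1
    r = secrets.randbelow(n - 1) + 1    # r in Zn* with overwhelming probability
    return (pow(r, n, n2) * pow(n + 1, m, n2)) % n2

def dec(c):
    s = pow(c, phi, n2)                 # s = 1 + m*phi*n (mod n^2)
    s1 = (s - 1) // n                   # s1 = m*phi mod n
    return (s1 * pow(phi, -1, n)) % n   # t1 = phi mod n, so m = s1 * t1^(-1)

m = secrets.randbelow(n)
assert dec(enc(m)) == m
```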
Some modern proposals of public key encryption schemes are based on hard computational problems defined over lattices.
KeyGen(λ):
Select parameters q, n, σ such that q is a prime between λ^2 and 2λ^2, n=(1+ε)(λ+1)log q for some positive ε, and σ^2=o(1/(λ·log^2 λ)), e.g., σ^2=1/(λ·log^3 λ).
Select a random n×λ matrix A over Zq, and a column vector s∈Zq^λ.
Compute b=As+x mod q, where the n components of x are sampled from a discrete Gaussian distribution of variance σ^2.
Output (pk,sk) = ((param,A,b),(s)), where param=(λ,q,n,σ).
Enc(pk,m):
Parse pk = (param,A,b).
Choose a random binary row vector r∈{0,1}^n.
Output c = (rA mod q, rb+m(q-1)/2 mod q), where m∈{0,1}.
Dec(pk,sk,c):
Parse pk = (param,A,b).
Parse sk = (s).
Parse c = (c1,c2).
Compute u = c2-c1·s mod q.
Set t to the element of {0,(q-1)/2} nearest to u.
Output m = 0 if t = 0. Otherwise, output m = 1.
Correctness is not guaranteed with probability one, but the error probability can be made negligible.
The noise x added in the key generation is critical: if it is too small, the public key reveals the secret key. If it is too large, wrong decryption occurs with noticeable probability.
The efficiency of the scheme is not good, because it encrypts only one bit in a large ciphertext. But the scheme is conjectured to remain secure even when the attacker has access to a quantum computer.
Observe that the matrix A defines a lattice, and the vector b is near the (secret) lattice point As. Therefore, an attacker capable of finding the nearest lattice point to a given point would be able to compute the secret key s.
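A toy single-bit version of the scheme can be sketched as follows. The parameters are hand-picked small values that ignore the asymptotic conditions above and give no security, and the discrete Gaussian noise is replaced by uniform noise in {-1,0,1} for simplicity:

```python
import secrets

# Toy version of the lattice-based scheme for a single bit m in {0, 1}.
q = 97        # modulus
N = 40        # number of rows of A (n in the text)
D = 8         # secret dimension (lambda in the text)

def noise():                                  # stand-in for the Gaussian noise
    return secrets.randbelow(3) - 1           # uniform in {-1, 0, 1}

def keygen():
    A = [[secrets.randbelow(q) for _ in range(D)] for _ in range(N)]
    s = [secrets.randbelow(q) for _ in range(D)]
    b = [(sum(aj * sj for aj, sj in zip(row, s)) + noise()) % q for row in A]
    return (A, b), s                          # pk = (A, b), sk = s

def enc(pk, m):
    A, b = pk
    r = [secrets.randbelow(2) for _ in range(N)]        # random binary row r
    c1 = [sum(r[i] * A[i][j] for i in range(N)) % q for j in range(D)]  # rA
    c2 = (sum(ri * bi for ri, bi in zip(r, b)) + m * (q - 1) // 2) % q  # rb + m(q-1)/2
    return c1, c2

def dec(s, c):
    c1, c2 = c
    u = (c2 - sum(cj * sj for cj, sj in zip(c1, s))) % q  # = m(q-1)/2 + small noise
    return 0 if min(u, q - u) < (q - 1) // 4 else 1       # round to 0 or (q-1)/2

pk, sk = keygen()
assert dec(sk, enc(pk, 0)) == 0 and dec(sk, enc(pk, 1)) == 1
```

The decryption step makes the error behaviour visible: it succeeds as long as the accumulated noise r·x stays below (q-1)/4 in absolute value, which here holds except with tiny probability.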
The concept of public key cryptography implies the existence of functions that can be evaluated efficiently but are hard to invert, that is, it is infeasible to find a preimage of a given random element in their range. For instance, given a public key and a message, it is easy to compute the corresponding ciphertext, but given the ciphertext and the public key, it is infeasible to recover the message.
Furthermore, if some auxiliary information is given (the secret key) the preimage computation becomes easy. This implies that the one-way function has a “trapdoor”.
Definition 1. A function f : {0,1}* → {0,1}* is called one-way if:
There exists an efficient algorithm A that evaluates f on all possible inputs, that is, there exists a polynomial PA such that for all λ∈Z+ and all x∈{0,1}^λ, A(x) = f(x) and the running time of A is upper bounded by PA(λ).
No efficient algorithm is able to invert f. More precisely, for any algorithm B, any polynomial PB and any c∈Z+, either there exist λ∈Z+ and x∈{0,1}^λ such that the running time of B on input f(x) exceeds PB(λ), or the probability that B(f(x))∈f^(-1)(f(x)) for a random x∈{0,1}^λ exceeds λ^(-c) at only finitely many values (or no value) of λ∈Z+.
The previous definition can be simplified if we introduce some notations:
Definition 2. An algorithm A is called polynomial time if there exists a polynomial PA such that on any input x of length λ the running time of A is upper bounded by PA(λ).
Definition 3. A function ε : Z+ → R is called negligible if for any c∈Z+, |ε(λ)| > λ^(-c) only at finitely many values (or no value) of λ∈Z+. We will write ε∈negl, or ε(λ)∈negl(λ).
Now, in the definition of one-way function, A and B are polynomial time algorithms and the probability that B(f(x))∈f-1(f(x)) is a negligible function in λ.
The previous function can be seen as the function family {fλ : {0,1}^λ → {0,1}*}λ∈Z+. The definition of one-way function can be extended to more general function families:
Definition 4. A function family {fi : Xi → Yi}i∈I, where I is a set of indices that is the disjoint union of {Iλ}λ∈Z+, is called one-way if:
There exists a polynomial time algorithm A such that for any i∈I and x∈Xi, A(i,x) = fi(x).
There exist polynomial time algorithms that, for any λ∈Z+, produce a uniformly distributed element i∈Iλ, and, for any i∈Iλ, a uniformly distributed element of Xi.
For any polynomial time algorithm B the probability that B(i,fi(x))∈fi^(-1)(fi(x)) for random i∈Iλ and x∈Xi is a negligible function in λ.
The indices of a one-way function family are typically related to the public keys of an encryption scheme. For example, in the RSA cryptosystem the encryption function's domain and range depend on the public key. Therefore, the set I would be the set of RSA public keys, and the corresponding functions would be f(n,e) : Zn → Zn such that f(n,e)(x) = x^e mod n.
It is an open problem to find a one-way function family (or even to prove its existence), but some well-known families (like the previous RSA example) are conjectured to be one-way. Indeed, with a precise definition of security of a public key encryption scheme, it can be shown that the existence of a secure deterministic public key encryption scheme implies the existence of one-way functions.
Given a function family {fi : Xi → Yi}i∈I that cannot be inverted with probability greater than a given bound, one can define another family g with a much smaller bound on the probability of inversion, by considering an n-fold version of f:
gi(x1,...,xn) = (fi(x1),...,fi(xn))
Indeed, any polynomial time algorithm Bg inverting g with probability εg can be used to build another polynomial time algorithm Bf that inverts f with a much greater probability εf as follows:
On the input of (i,y), where y=fi(x) for random i∈Iλ and x∈Xi, Bf chooses a random j∈{1,...,n} and sets yj=y and yk=fi(xk) for a random xk∈Xi, for all k≠j.
Next, Bf runs Bg(i,y1,...,yn), obtaining the output (x1,...,xn).
If fi(xj)=y then Bf outputs xj. Otherwise, Bf repeats the previous steps until a predefined iteration limit t is reached. In that case, the algorithm fails (for instance, it outputs a random element in Xi).
Notice that when Bf fails to invert f, Bg failed to invert the vector y in all t iterations. If the vectors were independent, the failure would occur with probability (1-εg)^t. Then it would be εf=1-(1-εg)^t, which is exponentially close to 1 for a large enough t. However, this analysis is not correct because the vectors y are correlated (actually, all of them have at least one component equal to the input y). But this correlation becomes negligible when n is large enough.
A further step in the direction of proving the existence of one-way functions is finding a weakened version of the definition.
Definition 5. A function family {fi : Xi → Yi}i∈I, where I is a set of indices that is the disjoint union of {Iλ}λ∈Z+, is called weakly one-way if:
There exists a polynomial time algorithm A such that for any i∈I and x∈Xi, A(i,x) = fi(x).
There exists a positive polynomial Q such that for any polynomial time algorithm B the probability that B(i,fi(x))∈fi^(-1)(fi(x)) for random i∈Iλ and x∈Xi is greater than 1-1/Q(λ) only for finitely many values of λ.
What is required in this weakened version is that any efficient algorithm trying to invert f fails with a non negligible probability, which seems to be a very relaxed statement. However, the two notions of one-wayness are indeed equivalent (from the existential point of view):
Theorem 1. A (strong) one-way function family exists if and only if a weak one-way function family exists.
Proof. Trivially, every strong one-way function family is also weakly one-way. To show the other implication it suffices to prove that an n-fold version of the weak one-way function family is strongly one-way, if n is large enough. For simplicity, we will omit the index i, because it takes the same fixed value along the proof.
Let Bf1 denote a single iteration of the algorithm Bf described above (i.e., taking t=1). The algorithm Bg used inside it is any polynomial time algorithm trying to invert the n-fold version of f.
Let Bad be the set of x∈X such that Bf1 inverts f(x) with probability less than a threshold α, and call its complement Good. The threshold will be fixed later. Let β=Prob(x∈Bad) for a uniformly distributed x, that is, β is the fraction of elements of X lying in the set Bad. Observe that β is a function of α.
Let us denote by Fail(Bf | x∈X) the probability that Bf fails to invert f on f(x) for a random x∈X. By the weak one-wayness of f,
1/Q(λ) < Fail(Bf | x∈X) = Fail(Bf | x∈Bad)·β + Fail(Bf | x∈Good)·(1-β) ≤ β + Fail(Bf | x∈Good) ≤ β + (1-α)^t.
Taking t=λ/α, the term (1-α)^t becomes exponentially small, and we have shown that β > 1/(2Q(λ)).
Now, we can use Bf1 to find an upper bound on the probability that Bg inverts the n-fold version of f on the image of a random vector in X^n. Let us call this probability Succ(Bg).
Succ(Bg) = Succ(Bg and x∈X^n∖Good^n) + Succ(Bg and x∈Good^n) ≤ Succ(Bg and x∈X^n∖Good^n) + (1-β)^n ≤ Succ(Bg and x∈X^n∖Good^n) + (1-1/(2Q(λ)))^n.
The second term becomes exponentially small if we take n=2λQ(λ).
Now, for any j∈{1,...,n} we can write
Succ(Bg and xj∈Bad) ≤ n·Succ(Bf1 and xj∈Bad) ≤ n·Succ(Bf1 | xj∈Bad) < αn.
Therefore, by the union bound, Succ(Bg and x∈X^n∖Good^n) ≤ αn^2.
The proof is almost concluded, because for any algorithm Bg trying to invert the n-fold version of f and any constant c∈Z+, we can take α=1/(2λ^c·n^2), so that Succ(Bg) < λ^(-c). In other words, Succ(Bg)∈negl(λ).
The particular choice of α implies that n=2λQ(λ) and t=λ/α=2λ^(c+1)·n^2.
Public key encryption is tightly related to the concept of one-way function family, but the existence of a decryption algorithm implies a trapdoor mechanism that allows the efficient inversion of the one-way function family, given some auxiliary information like the secret key.
The definition of a trapdoor one-way permutation family (a permutation being a bijective map from a set to itself) adds this functionality:
Definition 6. A permutation family {fi : Xi → Xi}i∈I, where I is a set of indices that is the disjoint union of {Iλ}λ∈Z+, is called trapdoor one-way if:
There exists a polynomial time algorithm A such that for any i∈I and x∈Xi, A(i,x) = fi(x).
There exist polynomial time algorithms that, for any λ∈Z+, produce a uniformly distributed element i∈Iλ together with a trapdoor information ti, and, for any i∈Iλ, a uniformly distributed element of Xi.
There exists a polynomial time algorithm that, on input i∈I, the corresponding trapdoor ti and fi(x) for any x∈Xi, outputs x.
For any polynomial time algorithm B the probability that B(i,fi(x))=x for random i∈Iλ and x∈Xi is a negligible function in λ.
Some of the function families derived from the previously defined public key encryption schemes are conjectured to be trapdoor one-way families.
Even if a function f is one-way, it cannot be said that the image f(x) hides x completely. For instance, if f is one-way, so is the function g(w,x)=(w,f(x)), even though w is completely exposed in the image of the function. The notion of one-wayness only implies that a substantial part of the preimage x, but not the entire x, is hidden in f(x). Hence, one needs to know which part of x is actually hidden in f(x).
Given a one-way function f, we say that a map h : {0,1}* → {0,1} (also called a predicate) is hardcore for f if no useful information about h(x) is leaked by f(x). More precisely:
Definition 7. The (balanced) predicate h : {0,1}* → {0,1} is hardcore for a one-way function f if
There exists a polynomial time algorithm A computing h(x) from x.
For any polynomial time algorithm B, the difference Prob[B(f(x))=h(x)]-1/2, for a random x∈{0,1}^λ, is a negligible function of λ.
The difference between the success probability of algorithm B and 1/2 is the advantage of B in guessing h(x) with respect to a lazy algorithm that simply ignores f(x) and flips a coin to decide its output.
We assume that h is a balanced predicate, that is, Prob[h(x)=1]-1/2 for a random x∈{0,1}^λ is a negligible function of λ.
Proving that h is a hardcore predicate for f is usually done by explicitly building a polynomial time algorithm Bf that inverts f with a non-negligible probability from any other polynomial time algorithm Bh that computes h(x) from f(x) with a non-negligible advantage. A partial proof applying only to perfect Bh (that is, computing the correct h(x) with probability one) is usually easy to obtain, but extending it to the general case often requires far more work.
Any one-way function f can be modified to obtain a new one-way function g(w,x)=(w,f(x)) together with a hardcore predicate for it, called the Goldreich-Levin predicate: h(w,x)=w·x=w1x1+...+wλxλ mod 2.
A partial proof of this result, working only for perfect algorithms Bh is almost trivial: Just run Bh(ej,y) for j=1,...,λ, where y=f(x) is the input of Bf and ej has a single one-bit in the j-th position. If Bh is perfect, then Bh(ej,f(x)) = h(ej,x) = xj, that is, the j-th bit of x.
However, if the success probability of Bh is only close to 1/2, then the probability that all the λ calls to Bh give the correct answer intuitively drops exponentially, towards 2^(-λ). Therefore, a more involved strategy is necessary to build a general proof.
Some well-known candidates to one-way functions have hardcore predicates that are just some of the bits of x. For instance, all bits of x are hardcore predicates for the RSA-based one-way function, or the most significant bits of the discrete-log based one-way function are also hardcore.
For example, let us consider the least significant bit of the RSA one-way function family f(n,e)(x) = x^e mod n.
We define the least significant bit of an element of Zn as LSB(x) = 0 if x is an even integer and LSB(x) = 1 if x is an odd integer, assuming that x is represented as an integer in {0,...,n-1}. We define the sequence xi=2^i·x mod n, for i=1,...,λ.
Observe that yi=f(n,e)(xi)=xi^e=2^(ie)·y mod n, where y=f(n,e)(x)=x^e mod n. Then, an algorithm BRSA that tries to learn x from y can call a perfect algorithm BLSB on inputs y1,...,yλ. It is easy to see that LSB(x1)=0 if and only if 0≤x<n/2, LSB(x2)=0 if and only if 0≤x<n/4 or n/2≤x<3n/4, and so on. Therefore, x can be easily retrieved from the LSB values given by BLSB.
As before, generalizing this argument for a non-perfect BLSB is not a trivial task.
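For the perfect-oracle case, the argument can be made executable. The sketch below simulates the perfect BLSB oracle with the secret exponent (which a real adversary would not have) and recovers x by the binary search described above, with toy primes:

```python
import secrets
from fractions import Fraction

# Recover x from y = x^e mod n by binary search, querying a perfect LSB
# oracle on the related ciphertexts (2^i x)^e mod n (toy primes).
p, q = 1009, 1013
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))     # used only to simulate the oracle

def lsb_oracle(c):                    # "perfect B_LSB": 0 iff the plaintext is even
    return pow(c, d, n) & 1

def recover(y):
    lo, hi = Fraction(0), Fraction(n) # invariant: lo <= x < hi
    c = y
    for _ in range(n.bit_length()):
        c = (pow(2, e, n) * c) % n    # ciphertext of 2^i * x mod n
        mid = (lo + hi) / 2
        if lsb_oracle(c) == 0:        # 2^i x mod n even <=> x in the lower half
            hi = mid
        else:
            lo = mid
    return int(hi)                    # the interval now contains a single integer

x = secrets.randbelow(n)
assert recover(pow(x, e, n)) == x
```

Each oracle call halves the interval known to contain x, so n.bit_length() calls pin x down exactly.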
There is a (quite inefficient) generic public key encryption scheme based on any hardcore predicate h of a trapdoor one-way permutation family f.
KeyGen(λ):
Sample a random index i∈Iλ along with its corresponding trapdoor ti.
Output (pk,sk)=(i,ti).
Enc(pk,m):
Parse pk=(i).
Parse m as the bit sequence (m1,...,mn).
Sample a random x0∈Xi.
Compute the sequence xj=fi(xj-1) for j=1,...,n.
Compute the bit sequence bj=h(xj-1) for j=1,...,n.
Output c=(m1⊕b1,...,mn⊕bn,xn).
Dec(pk,sk,c):
Parse pk=(i).
Parse sk=(ti).
Parse c=(c1,...,cn,xn).
Compute the sequence xn-j=fi^(-1)(xn-j+1) for j=1,...,n using the trapdoor ti.
Compute the bit sequence bj=h(xj-1) for j=1,...,n.
Output m=(c1⊕b1,...,cn⊕bn).
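A sketch of the generic scheme, instantiated (as an assumption, for illustration) with textbook RSA as the trapdoor permutation and the least significant bit as hardcore predicate:

```python
import secrets

# Generic scheme from a trapdoor permutation f and hardcore predicate h,
# instantiated with toy textbook RSA and the LSB (illustration only).
p, q = 1009, 1013
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))     # the trapdoor

def f(x):      return pow(x, e, n)    # the permutation of Zn
def f_inv(y):  return pow(y, d, n)    # inversion, needs the trapdoor
def h(x):      return x & 1           # conjectured hardcore predicate

def enc(bits):
    x = secrets.randbelow(n)          # random x0
    out = []
    for m in bits:                    # b_j = h(x_{j-1}), x_j = f(x_{j-1})
        out.append(m ^ h(x))
        x = f(x)
    return out, x                     # ciphertext (c_1, ..., c_n, x_n)

def dec(c):
    cs, x = c
    pads = []
    for _ in cs:                      # walk the chain backwards with f^(-1)
        x = f_inv(x)
        pads.append(h(x))
    pads.reverse()                    # pads = (b_1, ..., b_n)
    return [ci ^ bi for ci, bi in zip(cs, pads)]

bits = [secrets.randbelow(2) for _ in range(16)]
assert dec(enc(bits)) == bits
```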
Some public key encryption schemes respect some group operations both in the message space and the ciphertext space. Let us assume that + is a group operation in M and * is a group operation in C.
Definition 8. A public key encryption scheme (KeyGen,Enc,Dec) is called (weakly) homomorphic with respect to the operations (M,+) and (C,*) if
∀(pk,sk)∈K, ∀m1,m2∈Mpk, Dec(sk,Enc(pk,m1)*Enc(pk,m2)) = m1+m2.
This notion can be strengthened for probabilistic encryption schemes:
Definition 9. A public key encryption scheme (KeyGen,Enc,Dec) is called (strongly) homomorphic with respect to the operations (M,+) and (C,*) if there exists an efficient algorithm Rerand such that
∀(pk,sk)∈K, ∀m1,m2∈Mpk, the vectors (c1,c2,Rerand(pk,c1*c2)) and (c1,c2,Enc(pk,m1+m2)), where c1=Enc(pk,m1) and c2=Enc(pk,m2), are identically distributed random variables.
This definition means that a ciphertext for m1+m2 can be obtained directly from the ciphertexts c1, c2, without the knowledge of the messages and in an indistinguishable way (that is, the randomness in the new ciphertext is independent of the randomness contained in c1 and c2). Typically, Rerand(pk,c) outputs Enc(pk,0)*c, where 0 is the neutral element in (M,+).
For deterministic encryption schemes, the two notions are equivalent, with Rerand(pk,c)=c, since the ciphertexts contain no randomness.
Proposition 1. RSA is homomorphic with respect to the product modulo n.
Proof. Enc((n,e),m1·m2) = (m1·m2)^e mod n = m1^e·m2^e mod n = Enc((n,e),m1)·Enc((n,e),m2) mod n.
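A quick numerical check of the proposition with toy parameters:

```python
# Multiplicative homomorphism of textbook RSA, with toy primes and exponent
# chosen for illustration only.
p, q, e = 1009, 1013, 17
n = p * q

m1, m2 = 4242, 1717
c1, c2 = pow(m1, e, n), pow(m2, e, n)              # Enc(m1), Enc(m2)
assert (c1 * c2) % n == pow((m1 * m2) % n, e, n)   # Enc(m1)*Enc(m2) = Enc(m1*m2)
```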
Proposition 2. ElGamal is strongly homomorphic with respect to the group operation in G and in G×G.
Proof. Rerand(pk,Enc(pk,m1)Enc(pk,m2)) = Enc(pk,1)Enc(pk,m1)Enc(pk,m2) = (g^r0,y^r0)(g^r1,y^r1·m1)(g^r2,y^r2·m2) = (g^(r0+r1+r2),y^(r0+r1+r2)·m1·m2), where ciphertexts are multiplied componentwise in G×G. On the other hand, Enc(pk,m1·m2) = (g^r3,y^r3·m1·m2). The two ciphertexts are identically distributed, conditioned on the values of r1 and r2, because r0 and r3 are uniformly distributed.
Proposition 3. Paillier is strongly homomorphic with respect to addition modulo n and multiplication modulo n2.
Proof. Rerand(pk,Enc(pk,m1)Enc(pk,m2)) = Enc(pk,0)Enc(pk,m1)Enc(pk,m2) = r0^n·r1^n·g^m1·r2^n·g^m2 = (r0·r1·r2)^n·g^(m1+m2) mod n^2. On the other hand, Enc(pk,m1+m2) = r3^n·g^(m1+m2) mod n^2. The two ciphertexts are identically distributed, conditioned on the values of r1 and r2, because r0 and r3 are uniformly distributed.
Homomorphic encryption has many applications, homomorphic tallying in electronic voting being the most intuitive one.
In an electronic referendum (each individual vote is either “yes” or “no”), the bulletin board sets up a public key encryption scheme with additive homomorphic properties, and keeps the decryption key secret. Every voter encrypts 1 (for “yes”) or 0 (for “no”) and sends the resulting ciphertext to the board. At the end of the election, all the ciphertexts are combined with the homomorphic operation to obtain a ciphertext of the sum of all votes. The resulting ciphertext is decrypted to obtain the number of positive votes.
This is just a toy example of an electronic voting system: there is no mechanism preventing a voter from casting more than one vote in a single ballot (e.g., by encrypting a number greater than one, or a negative number), and no way to prevent a voter from submitting a vote related to another voter's (e.g., rerandomizing someone else's ballot as one's own), to mention just a few of the possible attacks.
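The tallying idea itself can be sketched with the toy Paillier implementation from before (again, with none of the safeguards a real system needs):

```python
import secrets

# Toy referendum tally with additively homomorphic Paillier encryption
# (g = n+1 and tiny primes; illustration only).
p, q = 1009, 1013
n = p * q
n2 = n * n
phi = (p - 1) * (q - 1)

def enc(m):
    r = secrets.randbelow(n - 1) + 1
    return (pow(r, n, n2) * pow(n + 1, m, n2)) % n2

def dec(c):
    s1 = (pow(c, phi, n2) - 1) // n
    return (s1 * pow(phi, -1, n)) % n

votes = [1, 0, 1, 1, 0, 1]                # four "yes", two "no"
ballots = [enc(v) for v in votes]         # what each voter submits
tally = 1
for b in ballots:
    tally = (tally * b) % n2              # multiply ciphertexts to add votes
assert dec(tally) == sum(votes)           # decrypting the product gives the sum
```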
Digital signatures are the public key analogue of message authentication codes in symmetric cryptography. They provide at once message integrity, authentication and non-repudiation, by securely combining a secret key with the message while making signature verification publicly available.
In addition, digital signatures can be used to prevent impersonation attacks like the man-in-the-middle attack, by having a trusted authority sign the users' public keys (see below).
Digital signatures were invented in the same seminal paper in which Diffie and Hellman (1976) introduced the concept of public key cryptography.
In a digital signature scheme there are three algorithms:
A Key Generation algorithm, KeyGen, that, given a security parameter λ (a positive integer), produces a random key pair (pk,sk) of a suitable size.
A Signing algorithm, Sig, that, given a secret key sk and a message m, produces a signature s, possibly using some randomness.
A Verification algorithm, Ver, that, given a public key pk, a message m and a signature s, outputs 1 if s is a valid signature on m under pk, and 0 otherwise.
The public key typically contains some auxiliary information such as the message space Mpk (or simply M) and the signature space Spk (or simply S), corresponding to the particular choice of λ and pk. The set of all possible key pairs produced by KeyGen for a particular value of λ is denoted as Kλ (or simply K).
Some of the algorithms are allowed to fail (e.g., outputting a special rejection symbol ⊥) if they receive improper inputs (like badly formed keys, messages or signatures).
Correctness: ∀(pk,sk)∈Kλ, ∀m∈Mpk, Ver(pk,m,Sig(sk,m))=1.
When a user A wants to be able to sign messages for other users, she produces a key pair (pk,sk) with KeyGen. Then she publishes pk and keeps sk secret as her long term key.
Now A can compute a digital signature for a message m∈Mpk by computing s=Sig(sk,m).
Any other user B can verify the signature s on m with A's public key by checking whether Ver(pk,m,s)=1.
Security of a signature scheme refers to the hardness of generating a valid signature without knowing the secret key in any realistic attack scenario, such as one where the attacker already knows some valid message/signature pairs from the target signer.
A signature can authenticate some extra information like the place, time and purpose of the signature, as in conventional handwritten signatures, by unambiguously appending the extra information to the original message.
Dealing with arbitrary length documents typically implies the use of a cryptographically secure hash function. Indeed, the chosen hash function is part of the specification of the signature scheme, and what is essentially signed is not the document itself but its hash value. This technique is referred to as the “hash-and-sign” paradigm.
Because of the very close analogy between the syntax of encryption schemes and that of signature schemes, some public key encryption schemes can be transformed into digital signature schemes by using Dec as the signing algorithm and Enc in the verification of the signature. This is precisely the case of RSA, as proposed in the seminal paper.
KeyGen(λ): (same as in the encryption scheme)
Select two different random prime numbers p, q of λ bits, and compute n=pq.
Choose a public exponent e coprime with φ(n)=(p-1)(q-1) (typically e is a quite small prime), and compute a secret exponent d as the modular inverse of e modulo φ(n).
Output (pk,sk)=((n,e),(p,q,d)).
Sig(pk,sk,m):
Parse pk=(n,e).
Parse sk=(p,q,d).
Output s = m^d mod n.
Ver(pk,m,s):
Parse pk=(n,e).
Output 1 if m = s^e mod n, and 0 otherwise.
The scheme is clearly insecure, because an attacker can generate an arbitrary valid pair (m,s) = (x^e mod n, x) for a random x, which is accepted by the verification algorithm. Even though the resulting message m will typically be nonsense, this kind of attack must be prevented. Indeed, signature schemes are sometimes used to sign random values instead of structured messages with high redundancy.
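Both the scheme and the forgery are easy to reproduce. A minimal sketch, using tiny illustrative primes (real RSA needs a modulus of thousands of bits):

```python
# Plain ("textbook") RSA signatures with tiny illustrative primes.
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17                                # public exponent, coprime with phi
d = pow(e, -1, phi)                   # secret exponent

def sign(m):
    return pow(m, d, n)

def verify(m, s):
    return pow(s, e, n) == m % n

m = 1234
assert verify(m, sign(m))

# existential forgery without sk: pick any x and output (x^e mod n, x)
x = 999
assert verify(pow(x, e, n), x)
```

The forger never uses d: she simply runs the public verification map forward and declares its output to be the "message".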
The plain RSA signature can be improved by adding some redundancy (e.g., padding) to the message before signing it. However, a better solution is to apply the hash-and-sign paradigm to the plain RSA signature, because this solves two problems at once: it prevents the described attack and it allows messages of arbitrary length to be signed.
The Full Domain Hash (FDH) signature scheme combines any one-way trapdoor permutation and a secure hash function to build a signature scheme. In the case of RSA, the resulting scheme is the following:
KeyGen(λ): (same as in the encryption scheme)
Select two different random prime numbers p, q of λ bits, and compute n=pq.
Choose a public exponent e coprime with φ(n)=(p-1)(q-1) (typically e is a quite small prime), and compute a secret exponent d as the modular inverse of e modulo φ(n).
Choose a hash function H : {0,1}* → Zn×.
Output (pk,sk)=((n,e,H),(p,q,d)).
Sig(pk,sk,m):
Parse pk=(n,e,H).
Parse sk=(p,q,d).
Output s = H(m)^d mod n.
Ver(pk,m,s):
Parse pk=(n,e,H).
Output 1 if H(m) = s^e mod n, and 0 otherwise.
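The FDH-RSA scheme can be sketched as follows. The tiny primes are illustrative, and SHA-256 reduced mod n stands in for a hash onto Zn (an assumption for the sketch; a real full-domain hash must cover a ~2048-bit modulus):

```python
import hashlib

# RSA-FDH sketch with tiny illustrative primes.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

def H(m: bytes) -> int:
    # stand-in for a full-domain hash onto Zn
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

def sign(m):
    return pow(H(m), d, n)

def verify(m, s):
    return pow(s, e, n) == H(m)

s = sign(b"hello")
assert verify(b"hello", s)
# The textbook forgery fails here: given x, the attacker would need a
# message m with H(m) = x^e mod n, i.e., a preimage under the hash.
```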
Observe that the previously described attack is no longer possible, since now the attacker must produce a preimage of x^e mod n under the hash function, which seems to be an infeasible task. However, the only known security proofs are in the Random Oracle idealized model (so they are not actual security proofs, but just heuristic security arguments).
The ElGamal signature scheme is based on the difficulty of computing discrete logarithms in some cyclic groups.
KeyGen(λ):
Choose a prime number p of λ bits.
Choose a generator g of the multiplicative group Zp×.
Choose a random x∈Z_{p-1} and compute y = g^x mod p.
Output (pk,sk)=((param,y),x), where param=(g,p).
Sig(pk,sk,m):
Parse pk=(param,y).
Parse sk=(x).
Choose a random k∈Z_{p-1}^× and compute r = g^k mod p.
Compute t = (m - x·r)·k^(-1) mod (p-1).
If t=0, repeat the procedure by choosing a new random k.
Output s=(r,t).
Ver(pk,m,s):
Parse pk=(param,y).
Parse s=(r,t).
Output 1 if g^m ≡ r^t·y^r (mod p), and 0 otherwise.
This signature scheme is vulnerable in the same way as the basic RSA signature. Indeed, for arbitrary values of α∈Z_{p-1}^× and β∈Z_{p-1}, the pair s=(r,t), where r = y^α·g^β mod p and t = -r·α^(-1) mod (p-1), is a valid signature for the message m = t·β mod (p-1).
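Both the scheme and this forgery can be checked with a toy instantiation over Z_p^* (the small prime p = 467 and the concrete α, β are illustrative assumptions):

```python
import random
from math import gcd

# Toy ElGamal signatures over Z_p^* (p - 1 = 2 * 233); illustrative sizes only.
p = 467
g = next(h for h in range(2, p)
         if pow(h, 2, p) != 1 and pow(h, 233, p) != 1)  # generator of Z_p^*
x = random.randrange(1, p - 1)        # secret key
y = pow(g, x, p)                      # public key

def sign(m):
    while True:
        k = random.randrange(1, p - 1)
        if gcd(k, p - 1) != 1:
            continue                  # k must be invertible mod p - 1
        r = pow(g, k, p)
        t = (m - x * r) * pow(k, -1, p - 1) % (p - 1)
        if t != 0:
            return r, t

def verify(m, r, t):
    return pow(g, m, p) == pow(r, t, p) * pow(y, r, p) % p

r, t = sign(100)
assert verify(100, r, t)

# the forgery described above: alpha invertible mod p-1, beta arbitrary
alpha, beta = 3, 77
r2 = pow(y, alpha, p) * pow(g, beta, p) % p
t2 = -r2 * pow(alpha, -1, p - 1) % (p - 1)
assert verify(t2 * beta % (p - 1), r2, t2)   # forged without knowing x
```

The forger cannot choose the message m (it comes out of the computation), which is exactly why the attack is classified as an existential forgery.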
Following the hash-and-sign paradigm, Pointcheval and Stern (1996) added a hash function to this basic ElGamal signature scheme to prevent the previous forgery attack. A security proof in the Random Oracle Model exists for the modified scheme.
The DSA signature is a variant of the ElGamal signature scheme that was standardized as the Digital Signature Standard (DSS) in 1994.
KeyGen(λ):
Choose a prime number p of λ1 bits (typically λ1 is taken between 6λ and 10λ).
Choose a prime number q of λ bits that divides p-1.
Choose a generator g of a cyclic subgroup of order q in the multiplicative group Zp×.
Choose a secure hash function H : {0,1}* → {0,1}^λ.
Choose a random x∈Zq and compute y = g^x mod p.
Output (pk,sk)=((param,y),x), where param=(p,g,q,H).
Sig(pk,sk,m):
Parse pk=(param,y).
Parse sk=(x).
Choose a random k∈Zq× and compute r = (g^k mod p) mod q.
If r=0, repeat the procedure by choosing a new random k.
Compute t = (H(m) + x·r)·k^(-1) mod q.
If t=0, repeat the procedure by choosing a new random k.
Output s=(r,t).
Ver(pk,m,s):
Parse pk=(param,y).
Parse s=(r,t).
Compute u = (g^(H(m))·y^r)^(t^(-1)) mod p, where t^(-1) is computed modulo q.
Output 1 if r ≡ u (mod q), and 0 otherwise.
There exist some security proofs in the Random Oracle Model that require some extra assumptions. The proofs become simpler (but are still in the Random Oracle Model) if H(m) is replaced by H(r,m).
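The DSA algorithms above can be sketched with toy parameters (q = 101, p = 607 and SHA-256 reduced mod q are illustrative assumptions; real DSA uses much larger p and q and a hash of matching output length):

```python
import hashlib, random

# Toy DSA parameters: q = 101 divides p - 1 = 606.
q, p = 101, 607
g = pow(3, (p - 1) // q, p)           # an element of order q (here g != 1)
x = random.randrange(1, q)            # secret key
y = pow(g, x, p)                      # public key

def H(m: bytes) -> int:
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % q

def sign(m):
    while True:
        k = random.randrange(1, q)
        r = pow(g, k, p) % q
        if r == 0:
            continue                  # resample k, as in the specification
        t = (H(m) + x * r) * pow(k, -1, q) % q
        if t != 0:
            return r, t

def verify(m, r, t):
    w = pow(t, -1, q)                 # t^(-1) computed modulo q
    u = pow(g, H(m) * w % q, p) * pow(y, r * w % q, p) % p
    return r == u % q

r, t = sign(b"msg")
assert verify(b"msg", r, t)
```

Splitting the exponent of u into the two factors g^(H(m)·w) and y^(r·w) is the usual way to implement the verification equation, since all exponents can then be reduced mod q.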
The goal of an identification scheme is to verify the identity of a party involved in a protocol. For instance, the party can show that she knows the secret key sk corresponding to a public key pk (without revealing sk).
From a digital signature scheme, it is easy to build an identification scheme. Namely, the owner of the secret key sk signs a random message generated by the verifier. Then the verifier can check whether the signature is valid with the corresponding public key pk. No one can impersonate the owner of sk, unless the signature scheme is insecure.
Conversely, some identification schemes can be converted into digital signature schemes.
A special family of 3-move protocols, called Sigma protocols, can be used to identify a party (called the Prover) to another one (called the Verifier). Assuming that the prover's public key pk is known to the verifier, the prover starts the protocol by sending some committing information a. Then the verifier sends a challenge c to the prover, who answers with a response t. Finally, the verifier checks whether the conversation (a,c,t) is a convincing one for the public key pk.
Correctness: If the prover's private input is (pk,sk) and the verifier's private input is (pk) and both parties follow the protocol honestly, then the verifier accepts the conversation with probability 1.
Soundness: If the prover's private input is just (pk), then the probability that an honest verifier accepts the conversation is noticeably less than 1.
A sequential repetition of the protocol can amplify the soundness, forcing the acceptance probability down to a negligible function of λ.
Zero knowledge: (Informal) Whatever an honest verifier can compute after the protocol can also be computed without interacting with an honest prover (implying that the verifier learns nothing about sk in a protocol execution).
A typical example of a sigma protocol is Schnorr's identification scheme, which is based on the hardness of computing discrete logarithms in a group G:
KeyGen(λ): (run by the prover)
Choose a prime number q of λ bits.
Choose a (cyclic) group G and a generator g.
Choose a random x∈Zq and compute the group element y = g^x ∈ G.
Output (pk,sk)=((param,y),x), where param=(G,g,q).
The prover P sends pk to the verifier V and both start the 3-moves conversation:
P(pk,sk) ↔ V(pk):
P samples a random r∈Zq and computes the commitment a = g^r.
P sends a to V.
V chooses a random challenge c∈Zq, and sends it back to P.
P computes the response t=r+cx using the secret key.
P sends t to V.
V accepts the conversation if g^t = a·y^c.
If a (possibly dishonest) prover P can make an honest verifier V accept a conversation (a,c,t) with probability greater than 1/q, then there exists a commitment a for which P can compute t1 and t2, for some different challenges c1 and c2, such that both (a,c1,t1) and (a,c2,t2) are accepted by V. But this implies that P can compute x from the two conversations. Indeed:
g^t1 = a·y^c1 and g^t2 = a·y^c2 ⇒ g^(t1-t2) = y^(c1-c2) ⇒ x = (t1-t2)·(c1-c2)^(-1) mod q.
Therefore, a prover that does not know sk can only convince the verifier with a probability at most 1/q, unless it is able to solve the discrete logarithm problem in G.
Finally, an honest-looking conversation (a,c,t) can easily be generated by V alone, without interacting with P, by computing a = y^(-c)·g^t for randomly chosen c,t∈Zq.
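An honest protocol run, together with the verifier's standalone simulation, can be sketched in a toy subgroup (the parameters q = 101, p = 607 are illustrative assumptions; in practice G is a large prime-order group):

```python
import random

# Toy instantiation in the order-q subgroup of Z_p^* (q divides p - 1 = 606).
q, p = 101, 607
g = pow(3, (p - 1) // q, p)           # generator of the order-q subgroup
x = random.randrange(0, q)            # prover's secret key
y = pow(g, x, p)                      # prover's public key

# one honest execution of the sigma protocol
r = random.randrange(0, q)
a = pow(g, r, p)                      # P's commitment
c = random.randrange(0, q)            # V's challenge
t = (r + c * x) % q                   # P's response
assert pow(g, t, p) == a * pow(y, c, p) % p   # V's check

# zero knowledge: V can produce accepting conversations on its own,
# choosing c and t first and solving for the commitment a = y^(-c) * g^t
c2 = random.randrange(0, q)
t2 = random.randrange(0, q)
a2 = pow(y, (q - c2) % q, p) * pow(g, t2, p) % p
assert pow(g, t2, p) == a2 * pow(y, c2, p) % p
```

The simulated conversations are distributed exactly like honest ones, which is the formal content of the zero-knowledge property for this protocol.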
A sigma protocol can be converted into a non-interactive identification scheme by replacing the challenge c with the hash of the public information pk and the commitment a. Then the prover computes (a,t) as in the (interactive) sigma protocol, but using c=H(pk,a) as the challenge.
The equivalence between the interactive and the non-interactive protocols can only be proven in the Random Oracle Model. But the non-interactive protocol can easily be turned into a signature scheme by adding the message m to the argument of the hash function. That is, the challenge is now computed as c=H(pk,a,m), and the signature is the pair s=(a,t).
Applying the Fiat-Shamir transformation to Schnorr's identification scheme, we obtain Schnorr's signature scheme:
KeyGen(λ):
Choose a prime number q of λ bits.
Choose a (cyclic) group G and a generator g.
Choose a secure hash function H : {0,1}* → Zq.
Choose a random x∈Zq and compute the group element y = g^x ∈ G.
Output (pk,sk)=((param,y),x), where param=(G,g,q,H).
Sig(pk,sk,m):
Parse pk=(param,y).
Parse sk=(x).
Choose a random r∈Zq and compute a = g^r.
Compute c = H(y,a,m) and t = r + c·x.
Output s=(a,t).
Ver(pk,m,s):
Parse pk=(param,y).
Parse s=(a,t).
Compute c=H(y,a,m).
Output 1 if g^t = a·y^c, and 0 otherwise.
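Schnorr's signature scheme can be sketched with the same toy subgroup as before (q = 101, p = 607 and the SHA-256-based hash encoding are illustrative assumptions for the sketch):

```python
import hashlib, random

# Toy Schnorr signatures in the order-q subgroup of Z_p^* (q | p - 1 = 606).
q, p = 101, 607
g = pow(3, (p - 1) // q, p)           # generator of the order-q subgroup
x = random.randrange(1, q)            # secret key
y = pow(g, x, p)                      # public key

def H(*parts) -> int:
    # ad-hoc encoding of (y, a, m) into bytes before hashing into Zq
    data = b"|".join(str(v).encode() for v in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(m):
    r = random.randrange(0, q)
    a = pow(g, r, p)
    c = H(y, a, m)                    # Fiat-Shamir: the hash plays the challenge
    return a, (r + c * x) % q

def verify(m, a, t):
    c = H(y, a, m)
    return pow(g, t, p) == a * pow(y, c, p) % p

a, t = sign("hello")
assert verify("hello", a, t)
```

The signature is exactly a non-interactive sigma-protocol conversation (a, c, t) in which the verifier's challenge has been replaced by the hash value.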
As mentioned before, to avoid impersonation attacks like the man-in-the-middle attack, the public keys must be unambiguously linked to the corresponding identities. The typical way to do this in practice is to concentrate the trust in a single certification authority (CA) that signs the public key/identity pairs, along with some extra information about the validity period and the intended use of the public keys.
A public key certificate consists of a structure containing the owner's identity, the associated public key, some issuance information (like the issuer's identity and the issuing time), the public key expiry date, and the intended usage of the public key (the public key encryption or signature algorithms for which the key is valid). This structure is signed with the secret key of the CA, so that any user who trusts the CA (and knows its public key) can verify the integrity of the certificate and the validity of the public key contained in it.
A hierarchy of certification authorities can be based on the same model. A root CA can issue certificates for the intermediate CAs, by signing their signing public keys. The intermediate CAs can certify other CAs' public keys, or they can directly certify the public keys of the final users. Then, to verify the validity of a particular public key, the chain of trust from the root CA to the final user is used, and every certificate in the chain is verified with the public key of the issuer.
In some cases, a CA can enforce security by asking a user to prove in zero knowledge that she knows the corresponding secret key. For instance, Schnorr's identification scheme can be used to prove knowledge of x∈Zq such that the public key ((G,g,q),y) satisfies y = g^x, without revealing x.