Cryptology - FME (UPC)
Jorge L. Villar, 2024
In modern cryptography, security of cryptosystems is addressed in a formal way under what is known as provable security. This formalization uses precise definitions of security notions as well as a detailed specification of the algorithms and protocols that make up a cryptosystem.
Security is analyzed as an asymptotic property, as a function of a complexity parameter λ (also known as “security parameter”), that is usually related to the size of the cryptographic keys involved in the system.
The first notion that needs to be formalized is what it means for a problem or a task to be easy or hard. The tasks to be carried out by honest parties must be easy enough, normally understood as tasks that can be solved in polynomial time by probabilistic algorithms, with overwhelming probability (typically, 1-negl(λ)).
The tasks that are considered as attacks must be infeasible for an adversary with a reasonable amount of resources. The complexity class BPP is used as the class of problems that are feasible to an attacker.
A problem family is called decisional if the solution to any of its instances is “yes” or “no” (or simply, a single bit). If the problem instance is specified as a binary string, then the problem family can be modelled as a subset L of {0,1}*, called a language, consisting of all the encodings of the yes-instances of the problem family P. For simplicity, we assume that any binary string not in the language is considered as a no-instance.
Definition 1. A decision problem family P is in BPP if there exists a probabilistic polynomial time algorithm B such that
The probability that B outputs 1 on any yes-instance is greater than 2/3.
The probability that B outputs 1 on any no-instance is less than 1/3.
The two thresholds can be widely modified without changing the meaning of the definition. For instance, instead of 2/3 and 1/3, they could be 1/2+1/Q(λ) and 1/2-1/Q(λ) for an arbitrary polynomial Q. Similarly, they can also be replaced by 1-2^{-λ} and 2^{-λ}.
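The robustness of these thresholds can be illustrated with a short simulation (a toy sketch, not part of the formal definitions): a decider that is correct with probability only 2/3 is amplified by independent repetition and majority voting, which is the standard argument behind the equivalence of all these threshold choices.

```python
import random

def noisy_decider(truth, rng):
    # Hypothetical BPP-style decider: returns the right bit with probability 2/3.
    return truth if rng.random() < 2 / 3 else 1 - truth

def amplified(truth, reps, rng):
    # Run the decider independently `reps` times and take the majority vote.
    votes = sum(noisy_decider(truth, rng) for _ in range(reps))
    return 1 if 2 * votes > reps else 0

rng = random.Random(1)
trials = 200
correct = sum(amplified(1, 101, rng) == 1 for _ in range(trials))
print(correct, "/", trials)  # the amplified decider is almost always right
```

By a Chernoff-type bound, 101 repetitions push the error probability of a 2/3-correct decider below 1%, so essentially all 200 amplified trials succeed.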
We say that a problem is infeasible for an adversary if it is not in the class BPP.
The most commonly used complexity classes (P, NP, BPP, ...) are based on a worst-case analysis: the complexity of a problem family is defined by its hardest instances, no matter how few hard instances there are. However, in cryptography we are interested in the average-case complexity, because secret keys and other elements in a cryptosystem are chosen at random. Therefore, the possible easy instances in a problem family must be bounded to a negligible fraction of the family. Then, the probability distribution of the yes-instances and the no-instances of a problem must be specified.
Definition 2. A decisional problem family is considered hard with respect to given probability distributions of yes- and no-instances, D1,λ and D0,λ, if for any probabilistic polynomial time algorithm B
|Prob(B(x)=1 | x←$ D1,λ) - Prob(B(x)=1 | x←$ D0,λ)| ∈negl(λ)
In this case, we say that the two distributions D1,λ and D0,λ are polynomially indistinguishable, and the above probability difference is called the advantage of B in distinguishing both distributions.
Example 1 (The Decisional Diffie-Hellman Problem (DDH)). Given an instance generator algorithm I that on input λ generates a cyclic group G of prime order q, where q is λ bits long, together with a generator g of the group, the DDH problem associated to I is the following:
Given (G,q,g) obtained from I(λ), choose a random bit b∈{0,1} and sample a tuple x from the probability distribution Db, where
D1,λ = (G,q,g,g^u,g^v,g^{uv}) for uniformly distributed u,v∈Zq,
D0,λ = (G,q,g,g^u,g^v,g^w) for uniformly distributed u,v,w∈Zq.
The problem is to guess the bit b only from the tuple x.
The DDH Assumption associated to the instance generator I says that the two probability distributions D0,λ, D1,λ are polynomially indistinguishable.
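To make the shapes of the two distributions concrete, here is a toy (and cryptographically useless) instantiation in Python; the group parameters p = 2q+1 = 1019, q = 509, g = 4 are illustrative stand-ins for a real instance generator, far too small for security:

```python
import random

# Toy group: the order-509 subgroup of Z_1019^*, generated by g = 4.
p, q, g = 1019, 509, 4

def sample(b, rng):
    # Sample from D1,λ (b = 1: a DH triple) or D0,λ (b = 0: independent W).
    u, v = rng.randrange(q), rng.randrange(q)
    w = (u * v) % q if b == 1 else rng.randrange(q)
    return pow(g, u, p), pow(g, v, p), pow(g, w, p)

def dlog(h):
    # Brute-force discrete log; feasible only in this toy group.
    return next(x for x in range(q) if pow(g, x, p) == h)

rng = random.Random(0)
U, V, W = sample(1, rng)
assert dlog(W) == dlog(U) * dlog(V) % q  # D1 tuples satisfy the DH relation
```

In a group of this size the brute-force `dlog` makes the two distributions trivially distinguishable; the DDH Assumption only makes sense for λ-bit groups where no such shortcut is known.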
Some attack goals can be modelled with decisional problem families that fit in the previous definition. However, in some other tasks the attacker is supposed to find a large object instead of producing just one bit. In these cases, the associated problem family is called a search problem family (instead of decisional).
Definition 3. A search problem family is considered hard with respect to a given probability distribution Dλ of the instances if for any probabilistic polynomial time algorithm B
Prob(B(x)∈Sol(x) | x←$ Dλ) ∈negl(λ)
where Sol(x) is the set of solutions of the problem instance x.
The previous probability is called the success probability of B, and it is sometimes also called the advantage of B.
Example 2 (The Computational Diffie-Hellman Problem (CDH)). Using the same instance generator I as in DDH, the CDH problem associated to I is the following:
Given (G,q,g) obtained from I(λ), and the group elements g^u, g^v, for uniformly distributed u,v∈Zq, compute g^{uv}.
The CDH Assumption associated to the instance generator I says that the success probability of any probabilistic polynomial time algorithm solving the CDH problem is a negligible function of λ.
If Sol(x) has more than one element (for most problem instances) the search problem family is sometimes called flexible.
Example 3 (A flexible version of CDH). Using the same instance generator I as in DDH, given (G,q,g) obtained from I(λ), and the group elements g^u, g^v, for uniformly distributed u,v∈Zq, compute a pair (h,h^{uv}) for any generator h of G.
In complexity theory the hardness of two different problem families can often be related via what is called a reduction. A reduction R from the problem family P1 to another problem family P2 is a transformation that, when applied to a probabilistic polynomial time algorithm B2 successfully solving P2, gives another probabilistic polynomial time algorithm B1 = R[B2] that successfully solves P1.
When a reduction R from P1 to P2 exists we say that P1 is reducible to P2, and we denote this as P1⇒P2. Notice that the previous reducibility assertion proves the hardness implication “P1 is hard implies that P2 is also hard”. Suppose, to the contrary, that P1 is hard but P2 is not. Then, there would exist a probabilistic polynomial time algorithm B2 successfully solving P2. Therefore, R[B2] would also be a probabilistic polynomial time algorithm successfully solving P1, which would be a contradiction.
The statement P1⇒P2 reads both as “the problem P1 is not harder than P2” and “‘P1 is hard’ is not a weaker assumption than ‘P2 is hard’”.
In a typical reduction, R is an oracle algorithm that receives an instance of P1, then generates one or more instances of P2 that are submitted to one or several copies of B2, and finally uses the answers given by these copies to build its own answer to the received P1 instance.
The reduction works if:
For any B2 running in polynomial time, R[B2] also runs in polynomial time.
For any B2 with a non-negligible advantage in solving P2, R[B2] also has a non-negligible advantage in solving P1.
A reduction is called Black-Box if it only uses B2 as an oracle, that is, it has no access to the “internals” of B2 and it only uses its outputs for conveniently selected inputs. We usually write R[B2] = R^{B2} to denote this oracle access.
Example 4 (DDH⇒CDH). Let us show an explicit black-box reduction R that uses an algorithm B solving the CDH problem to solve the DDH problem.
R receives as input the tuple x=(G,q,g,U,V,W). Then it calls B on the input of (G,q,g,U,V) obtaining some group element W'. If W=W' then R outputs 1. Otherwise, R outputs 0.
The running times of B and R^B are nearly the same (the time associated to a comparison or to the handling of the input and output parameters is usually negligible compared to performing the nontrivial task of solving CDH).
To analyze the advantage of R^B, first consider the case b=1 (that is, there are uniformly distributed u,v∈Zq such that U=g^u, V=g^v and W=g^{uv}). Whenever B gives the right answer, W'=g^{uv}, so does R^B, because W=W' and then the output given by R^B is exactly b. Therefore,
Prob(R^B(x)=1 | x←$ D1,λ) ≥ SuccCDH(B,λ),
where SuccCDH(B,λ) is the success probability of B solving CDH.
On the other hand, for b=0 there are uniformly distributed u,v,w∈Zq such that U=g^u, V=g^v and W=g^w. Therefore, W is a uniformly distributed group element that is independent of U,V, and hence it is also independent of the answer W' given by B. Thus,
Prob(R^B(x)=1 | x←$ D0,λ) = 1/q,
and
AdvDDH(R^B,λ) ≥ SuccCDH(B,λ) - 1/q
where AdvDDH(R^B,λ) is the advantage of R^B solving DDH.
Observe that 1/q is of the order of 2^{-λ} and hence a negligible function. Therefore, a non-negligible SuccCDH(B,λ) also implies a non-negligible AdvDDH(R^B,λ).
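The reduction of Example 4 can be exercised in a toy group where CDH is solvable by brute-force discrete log, so that we can play the role of a perfect solver B; the group parameters below are illustrative, not from a real instance generator:

```python
import random

# Toy group: the order-509 subgroup of Z_1019^* (p = 2q+1), generated by g = 4.
p, q, g = 1019, 509, 4

def dlog(h):
    # Brute-force discrete log; feasible only in this toy group.
    return next(x for x in range(q) if pow(g, x, p) == h)

def B(U, V):
    # Toy "perfect CDH solver": recover u by brute force and return V^u = g^{uv}.
    return pow(V, dlog(U), p)

def R(U, V, W):
    # The black-box reduction: call B on (U,V) and compare its answer with W.
    return 1 if B(U, V) == W else 0

rng = random.Random(2)
u, v = rng.randrange(1, q), rng.randrange(1, q)
w = rng.randrange(1, q)
while w == u * v % q:          # force a genuine no-instance (w != uv mod q)
    w = rng.randrange(1, q)
assert R(pow(g, u, p), pow(g, v, p), pow(g, u * v % q, p)) == 1  # case b = 1
assert R(pow(g, u, p), pow(g, v, p), pow(g, w, p)) == 0          # case b = 0
```

With this perfect B, the reduction outputs 1 on every D1 tuple and 0 on all but a 1/q fraction of D0 tuples, matching the bound SuccCDH(B,λ) - 1/q.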
The previous reduction is called tight because R^B performs as well as B in terms of running time and advantage.
Example 5 (flex-CDH⇒CDH). In this case, there is a trivial black-box tight reduction because both CDH and flex-CDH instances have identical descriptions, and the solution to CDH contains a solution to flex-CDH. Namely, if W is a valid solution to the CDH instance x=(G,q,g,U,V), then (g,W) is a valid solution to the corresponding flex-CDH instance.
No other reductions relating the three problems DDH, CDH and flex-CDH are known. Notice that DDH gives only a single bit, and then a reduction from CDH to DDH would require solving nearly λ DDH instances to find a solution to a CDH instance. On the other hand, an algorithm solving flex-CDH has total freedom to choose the generator h. Therefore, it seems very hard to translate the pair (h,h^{uv}) into g^{uv} without knowing the discrete logarithm of h with respect to g.
In the provable security paradigm, the claim “a cryptosystem is secure against some specific attack” is encoded as the hardness of certain problem family capturing the meaning of the attack. Since absolute security proofs are not known and they are not expected to be found, a security proof is always relative to some extra assumptions. Thus, a security proof is a reduction from a problem family P conjectured to be hard to a problem family associated to a specific attack to a specific cryptosystem.
Example 6 (“Discrete log ⇒ ElGamal key recovery”). Let us assume that a key recovery attack consists of giving an honestly generated public key to an attacker who then obtains the secret key from it.
Formally, the attacker B receives an instance (G,q,g,y), where y is a random group element, and then it outputs x∈Zq such that y=g^x.
But this is exactly the description of the discrete logarithm problem on (G,q,g). Therefore, the reduction is trivial, since the two problem families are exactly the same.
A formal definition of a security notion usually involves the detailed description of a specific attack, including the goal of the attacker and the resources it has access to. These resources can include the interaction of the attacker with other parties running the cryptographic protocol, which consists of different communication rounds. As a consequence, the problem families associated to security notions have a description far more complex than the simple examples given above (CDH, DDH, ...).
The usual way to define a security notion as a problem family is to specify the interaction of the attacker with the rest of the world via a random experiment performed by the attacker itself and a challenger (simulating the rest of the world). The challenger starts the conversation by giving some input to the attacker. Then, the attacker sequentially queries the challenger to obtain some extra information, and eventually the attacker gathers all the information received and ends the experiment by giving some output to the challenger. Finally, the challenger processes all the information, including the attacker's output, and decides its own output, “yes” or “no” (or 1 or 0), indicating the success of the attacker.
This experiment is usually called a security game.
Example 7 (Key Recovery security game).
The game is played between a challenger Ch and an attacker A for a public key encryption scheme PKE=(KeyGen(),Enc(),Dec())
Ch runs (pk,sk) = KeyGen(λ)
Ch sends pk to A
A outputs sk'
Ch outputs 1 if sk'=sk, or 0, otherwise.
We will describe the security games formally as pseudocode run by the challenger, which sends and receives information from the attacker A.
Game KeyRec(PKE,A,λ) {
(pk,sk) ← PKE.KeyGen(λ)
sk' ← A(pk)
if sk'=sk output 1
else output 0
endif
}
AdvKeyRec(PKE,A,λ) = Prob[KeyRec(PKE,A,λ)=1]
Formally, we say that PKE is secure against KeyRec attacks if for any probabilistic polynomial time attacker A, AdvKeyRec(PKE,A,λ)∈negl(λ). We will denote this assertion as KeyRec(PKE).
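As a sanity check, the KeyRec game can be run as actual code against a toy ElGamal-style key generation; the small parameters and the `keygen`/`brute_force_attacker` helpers below are illustrative stand-ins, not part of a real scheme:

```python
import random

# Toy group: the order-509 subgroup of Z_1019^* (p = 2q+1), generated by g = 4.
p, q, g = 1019, 509, 4

def keygen(rng):
    # ElGamal-style key pair: sk is a random exponent, pk = g^sk.
    sk = rng.randrange(1, q)
    return pow(g, sk, p), sk

def keyrec_game(attacker, rng):
    # The challenger's code from Game KeyRec: generate keys, hand pk to A,
    # and output 1 iff A returns the secret key.
    pk, sk = keygen(rng)
    return 1 if attacker(pk) == sk else 0

def brute_force_attacker(pk):
    # Wins with probability 1 here, but only because the group is tiny.
    return next(x for x in range(1, q) if pow(g, x, p) == pk)

rng = random.Random(3)
assert keyrec_game(brute_force_attacker, rng) == 1
```

In a λ-bit group the same brute-force attacker takes exponential time, which is exactly why AdvKeyRec is required to be negligible only for polynomial time attackers.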
The previous security game example captures a very basic attack against a public key encryption scheme. In this section, more realistic attacks are formally defined.
One-wayness of a public key encryption scheme refers to the infeasibility of inverting the encryption function without knowing the secret key. It is formalized in the following game:
Game OW-CPA(PKE,A,λ) {
(pk,sk) ← PKE.KeyGen(λ)
m ←$ PKE.Mpk
c* ← PKE.Enc(pk,m)
m' ← A(pk,c*)
if m'=m output 1
else output 0
endif
}
AdvOW-CPA(PKE,A,λ) = Prob[OW-CPA(PKE,A,λ)=1]
In this game the attacker receives just the public key and the encryption of a random message that it tries to guess. Obviously, if the attacker is able to launch a successful key recovery attack, then it easily wins the game by decrypting c* with the learned secret key. However, there could exist other ways to decrypt a particular ciphertext without learning the secret key.
The name ‘OW-CPA’ comes from “One-Wayness under a Chosen Plaintext Attack”, meaning that the attacker has only access to the encryption function and then it can observe ciphertexts corresponding to messages of its choice. This is the minimal set of capabilities an attacker can have in a public key encryption scheme, in contrast to what happens with a symmetric key encryption scheme, where the encryption functionality depends on the secret key and it is not accessible to the attacker.
In the previous security notion an attacker still could recover half of the message without breaking the one-wayness of the public key encryption scheme. Semantic security is a stronger security notion that requires the attacker to be unable to learn any useful information about the message from the ciphertext, other than the information that it could directly get without it (e.g., the length of the message).
This notion is quite hard to formalize. However, it has been shown to be equivalent to the notion of indistinguishability of ciphertexts, formally described below:
Game IND-CPA(PKE,(A0,A1),λ) {
(pk,sk) ← PKE.KeyGen(λ)
(m0,m1,st) ← A0(pk)
if m0∉PKE.Mpk or m1∉PKE.Mpk output 0 endif
b ←$ {0,1}
c* ← PKE.Enc(pk,mb)
b' ← A1(c*,st)
if b'=b output 1
else output 0
endif
}
AdvIND-CPA(PKE,(A0,A1),λ) = |2Prob[IND-CPA(PKE,(A0,A1),λ)=1]-1|
Here, the attacker A is split into two different stages (A0,A1). The first stage receives and processes the public key and outputs two carefully selected (valid) messages in the message space for the given public key. The first attacker stage communicates with the second one by means of a binary string st, whose format and length depend entirely on the attacker. The string st can be seen as a representation of the internal state of the first stage (e.g., including the public key) that will be used to initialize the second stage. This communication using states is the common way to formalize a stateful entity that sequentially interacts with other entities.
The second stage of the attacker receives (in addition to the state information st) a ciphertext of one of the two selected messages. Then it tries to guess which one has been encrypted.
The indistinguishability game guarantees that even when the attacker knows everything about the message except for a single bit of information, it has no noticeable advantage over a lazy attacker that just flips a coin to decide the missing message bit.
Observe that if an attacker is able to successfully launch a OW-CPA attack, then it will win the IND-CPA game. Indeed, the attacker can use the OW-CPA attack to learn the message mb from the ciphertext c*, and then infer b by comparing mb to the two messages selected in the first stage.
On the other hand, the IND-CPA security notion is unachievable by any deterministic encryption scheme like RSA. Indeed, the attacker can trivially win the game by just comparing the received ciphertext to the unique encryption of m0 or m1.
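The attack on deterministic schemes is easy to make concrete; the sketch below instantiates it with textbook RSA and tiny illustrative parameters (n = 3233 = 61·53, e = 17):

```python
import random

# Textbook (deterministic) RSA encryption with toy parameters.
n, e = 3233, 17

def enc(m):
    return pow(m, e, n)  # deterministic: same message, same ciphertext

def A0(pk):
    # Any two distinct valid messages work; st is left empty.
    return 42, 1337, None

def A1(c_star, st):
    # Re-encrypt m0 and compare with the challenge: no secret key needed.
    return 0 if c_star == enc(42) else 1

# One run of the IND-CPA game: the attacker recovers b with certainty.
rng = random.Random(4)
b = rng.randrange(2)
m0, m1, st = A0((n, e))
c_star = enc(m0 if b == 0 else m1)
assert A1(c_star, st) == b
```

The same two-line attack works against any deterministic scheme, which is why IND-CPA forces encryption to be randomized.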
In all previous attack models the attacker only eavesdrops at the insecure communication channel between the sender and the receiver, learning public keys and ciphertexts. However, some practical attacks are based on the injection of malicious messages or ciphertexts so that the attacker can get crucial information from the reactions of the communicating entities.
Active attacks consider the possibility of injection, replacement or removal of messages from the insecure communication channel, while adaptive attacks allow to split the attack into several rounds in a way that the attacker can decide its strategy according to the information received in previous steps.
In the case of public key encryption schemes, the most important active and adaptive attack is the Chosen Ciphertext Attack (CCA). Namely, the adversary can sequentially submit ciphertexts (well-formed or not) to the decryption function and learn the messages contained in them (or the failure or rejection of some badly-formed ciphertexts). This ability is formally modelled as a decryption oracle given to the attacker.
The access to the decryption oracle can happen before or after receiving the challenge (that is, the encryption of a random message, in the case of one-wayness, or the encryption of one of the two selected messages, in the case of indistinguishability). To define meaningful security notions, the decryption oracle access after receiving the challenge must be limited. Otherwise, if the attacker is allowed to query the oracle with the challenge ciphertext, it will surely win the security game.
We give below the security games defining the chosen ciphertext security versions of one-wayness and ciphertext indistinguishability:
Game OW-CCA1(PKE,(A0,A1),λ) {
(pk,sk) ← PKE.KeyGen(λ)
st ← A0^D0(pk)
m ←$ PKE.Mpk
c* ← PKE.Enc(pk,m)
m' ← A1(c*,st)
if m'=m output 1
else output 0
endif
}
Oracle D0(c) {
output PKE.Dec(pk,sk,c)
}
AdvOW-CCA1(PKE,(A0,A1),λ) = Prob[OW-CCA1(PKE,(A0,A1),λ)=1]
Game OW-CCA2(PKE,(A0,A1),λ) {
(pk,sk) ← PKE.KeyGen(λ)
st ← A0^D0(pk)
m ←$ PKE.Mpk
c* ← PKE.Enc(pk,m)
m' ← A1^D1(c*,st)
if m'=m output 1
else output 0
endif
}
Oracle D0(c) {
output PKE.Dec(pk,sk,c)
}
Oracle D1(c) {
if c=c* reject
else output PKE.Dec(pk,sk,c)
endif
}
AdvOW-CCA2(PKE,(A0,A1),λ) = Prob[OW-CCA2(PKE,(A0,A1),λ)=1]
Game IND-CCA1(PKE,(A0,A1),λ) {
(pk,sk) ← PKE.KeyGen(λ)
(m0,m1,st) ← A0^D0(pk)
if m0∉PKE.Mpk or m1∉PKE.Mpk output 0 endif
b ←$ {0,1}
c* ← PKE.Enc(pk,mb)
b' ← A1(c*,st)
if b'=b output 1
else output 0
endif
}
Oracle D0(c) {
output PKE.Dec(pk,sk,c)
}
AdvIND-CCA1(PKE,(A0,A1),λ) = |2Prob[IND-CCA1(PKE,(A0,A1),λ)=1]-1|
Game IND-CCA2(PKE,(A0,A1),λ) {
(pk,sk) ← PKE.KeyGen(λ)
(m0,m1,st) ← A0^D0(pk)
if m0∉PKE.Mpk or m1∉PKE.Mpk output 0 endif
b ←$ {0,1}
c* ← PKE.Enc(pk,mb)
b' ← A1^D1(c*,st)
if b'=b output 1
else output 0
endif
}
Oracle D0(c) {
output PKE.Dec(pk,sk,c)
}
Oracle D1(c) {
if c=c* reject
else output PKE.Dec(pk,sk,c)
endif
}
AdvIND-CCA2(PKE,(A0,A1),λ) = |2Prob[IND-CCA2(PKE,(A0,A1),λ)=1]-1|
IND-CCA1 is informally called “lunchtime attack” because the attacker eventually gains access to the decryption functionality (but not to the secret key) at some point before receiving the challenge ciphertext. IND-CCA2 is normally referred to as IND-CCA, and it is considered nowadays as the standard security level for a general purpose public key encryption scheme.
We summarize here some of the generic implications between security notions mentioned before and also add some new ones.
OW-CCA2(PKE) ⇒ OW-CCA1(PKE) ⇒ OW-CPA(PKE),
where OW-CCA2(PKE) means the assertion “PKE is secure against a OW-CCA2(PKE) attack”, or equivalently, AdvOW-CCA2(PKE,(A0,A1),λ)∈negl(λ) for all probabilistic polynomial time algorithms A0 and A1.
The leftmost implication comes from the fact that an attacker that wins the OW-CCA1 game also wins the OW-CCA2 by just ignoring the decryption oracle in the second stage. Similarly, for the rightmost implication, an attacker A capable of winning the OW-CPA game can be transformed into a trivial first stage A0 that ignores the decryption oracle and sets st=pk, and the second stage is just A1=A.
IND-CCA2(PKE) ⇒ IND-CCA1(PKE) ⇒ IND-CPA(PKE).
These implications are proved in the same way, letting the attackers ignore the extra decryption oracle provided in the game.
IND-atk(PKE) ⇒ OW-atk(PKE) for any atk∈{CPA,CCA1,CCA2}.
Here we are assuming that the message space has a super-polynomial size (otherwise, OW security cannot be achieved). In any of these implications, an attacker (A0,A1) winning the OW-atk game can be transformed (in a black-box way) into another attacker (B0,B1) winning the IND-atk game. Indeed, B0 runs stA ← A0^D0(pk) (or just sets stA to the empty string if atk=CPA), then chooses two random different messages m0,m1∈PKE.Mpk, and outputs (m0,m1,st), where st=(stA,m0,m1). In the second stage, B1 runs A1 (or A, if atk=CPA) with the same oracles it received, using as inputs the received stA and the challenge c*. Then it compares A's output m' to m0 and m1: if m'=m1 then B1 outputs 1; if m'=m0 it outputs 0; otherwise it outputs a random bit.
This reduction can be expressed more precisely by writing it as pseudocode. For instance, for atk=CCA2:
B0^D0(pk) {
stA ← A0^simD0(pk)
m0 ←$ PKE.Mpk
m1 ←$ PKE.Mpk∖{m0}
st ← (m0,m1,stA)
output (m0,m1,st)
}
Oracle simD0(c) {
output D0(c)
}
B1^D1(c*,st) {
parse st = (m0,m1,stA)
m' ← A1^simD1(c*,stA)
if m'=m1 output 1
else if m'=m0 output 0
else output a random bit
endif
}
Oracle simD1(c) {
output D1(c)
}
Observe that the reduction (B0,B1) has to answer all oracle queries made by (A0,A1). But in this case, the simulated oracles given to (A0,A1) behave identically as the real oracles given to (B0,B1). Hence, the simulated oracles simply forward the queries and answers to and from the real oracles.
No deterministic encryption can achieve IND-CPA security (and then, neither IND-CCA1 nor IND-CCA2), but it is still possible that some of them achieve OW-CCA2 security.
None of the basic encryption schemes described in the last chapter can resist IND-CCA2 attacks, or even OW-CCA2 attacks. For instance, an attacker against OW-CCA2(RSA) can compute c = 2^e·c* mod n and submit c to the decryption oracle D1. Since c≠c*, the oracle will return 2m mod n, thus revealing the message m to the attacker.
Another example is ElGamal encryption. From the challenge ciphertext c* = (g^r, y^r·m), the attacker computes a different ciphertext of the same message, c = c*·(g,y) = (g^{r+1}, y^{r+1}·m), where the product is componentwise. Then, the decryption oracle will directly give the message m to the adversary.
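The RSA malleability attack described above can be checked with tiny illustrative parameters (n = 3233 = 61·53 is a classic toy modulus; the decryption oracle below models D1):

```python
# Textbook RSA with toy parameters; d = e^{-1} mod φ(n), φ(n) = 60·52 = 3120.
n, e = 3233, 17
d = pow(e, -1, 3120)

def dec_oracle(c, c_star):
    # The D1 oracle: decrypt anything except the challenge ciphertext itself.
    if c == c_star:
        raise ValueError("reject: challenge ciphertext")
    return pow(c, d, n)

m = 123                            # the secret random message
c_star = pow(m, e, n)              # challenge ciphertext
c = (pow(2, e, n) * c_star) % n    # malleated ciphertext: (2m)^e, and c != c*
m2 = dec_oracle(c, c_star)         # the oracle happily returns 2m mod n
recovered = (m2 * pow(2, -1, n)) % n
assert recovered == m
```

The attack exploits the multiplicative homomorphism of textbook RSA: Enc(2)·Enc(m) = Enc(2m), so one oracle query on a ciphertext different from c* suffices.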
Achieving IND-CCA2 security was a very challenging task that required decades of research to come up with some realistic solutions (especially for security proofs in the Standard Model, that is, without assuming the existence of Random Oracles).
Security models for digital signatures are defined in a similar way as it is done for public key encryption schemes. However, the security goals in digital signatures are very different, compared to encryption schemes: Here, the attacker must be unable to forge any signature that has not been generated by the legitimate signer. More precisely, the attacker must be unable to find new solutions to the signature verification equation without the knowledge of the signing key.
In a universal unforgeability (UF) scenario the attacker must be unable to find a valid signature for a random message not chosen by itself. The minimal attacker capabilities are a key-only attack (KOA), in which the attacker only knows the public (verification) key. In a more realistic attack, the attacker has also access to a list of valid pairs message/signature for randomly selected messages (RMA).
More formally, assuming a digital signature scheme DS = (KeyGen(),Sig(),Ver()):
Game UF-KOA(DS,A,λ) {
(pk,sk) ← DS.KeyGen(λ)
m* ←$ DS.Mpk
s* ← A(pk,m*)
output DS.Ver(pk,m*,s*)
}
AdvUF-KOA(DS,A,λ) = Prob[UF-KOA(DS,A,λ)=1]
Game UF-RMA(DS,A,λ) {
(pk,sk) ← DS.KeyGen(λ)
m* ←$ DS.Mpk
s* ← A^S(pk,m*)
output DS.Ver(pk,m*,s*)
}
Oracle S() {
m ←$ DS.Mpk∖{m*}
output (m,DS.Sig(sk,m))
}
AdvUF-RMA(DS,A,λ) = Prob[UF-RMA(DS,A,λ)=1]
In the existential unforgeability (EF) scenario the attacker wins the game if it is able to find any valid pair message/signature for a message of its choice, but not signed before.
Game EF-KOA(DS,A,λ) {
(pk,sk) ← DS.KeyGen(λ)
(m*,s*) ← A(pk)
output DS.Ver(pk,m*,s*)
}
AdvEF-KOA(DS,A,λ) = Prob[EF-KOA(DS,A,λ)=1]
Game EF-RMA(DS,A,λ) {
lm ← List.Empty
(pk,sk) ← DS.KeyGen(λ)
(m*,s*) ← A^S(pk)
if DS.Ver(pk,m*,s*) and m*∉lm
output 1
else output 0
endif
}
Oracle S() {
m ←$ DS.Mpk
lm.AddToList(m)
output (m,DS.Sig(sk,m))
}
AdvEF-RMA(DS,A,λ) = Prob[EF-RMA(DS,A,λ)=1]
In the last game, the oracle is “stateful”, that is, it maintains an internal state consisting of the list lm of messages queried to the oracle. This is necessary to allow the challenger to check the freshness of the attacker's forgery in the last step of the game.
There is a stronger version of EF-RMA in which a forgery (m*,s*) for m*∈lm is still accepted as valid unless it is exactly one of the pairs returned by the oracle. Namely, forging a second signature for a message for which a signature is known is considered a successful attack.
Notice that if a signature scheme is deterministic (i.e., there exists a unique signature for a given message and a given public key) there is no difference between the stronger and the plain versions of EF-RMA.
In order to write a formal game for the strong EF-RMA, the list lm is replaced by a list lms, storing the message/signature pairs returned by the oracle.
As in the case of encryption schemes, in a realistic scenario an attacker could inject some messages and persuade the signer to produce valid signatures on them. This model is called a Chosen Message Attack (CMA). See below the corresponding security games for the two goals UF and EF:
Game UF-CMA(DS,A,λ) {
lm ← List.Empty
(pk,sk) ← DS.KeyGen(λ)
m* ←$ DS.Mpk
s* ← A^S(pk,m*)
output DS.Ver(pk,m*,s*) and m*∉lm
}
Oracle S(m) {
lm.AddToList(m)
output DS.Sig(sk,m)
}
AdvUF-CMA(DS,A,λ) = Prob[UF-CMA(DS,A,λ)=1]
Game EF-CMA(DS,A,λ) {
lm ← List.Empty
(pk,sk) ← DS.KeyGen(λ)
(m*,s*) ← A^S(pk)
if DS.Ver(pk,m*,s*) and m*∉lm
output 1
else output 0
endif
}
Oracle S(m) {
lm.AddToList(m)
output DS.Sig(sk,m)
}
AdvEF-CMA(DS,A,λ) = Prob[EF-CMA(DS,A,λ)=1]
As in the case of EF-RMA, there is a strong version of EF-CMA defined in the same way.
EF-CMA (or the strong version of it) is the commonly used standard level of security for a general purpose digital signature scheme.
There exist generic implications between security notions for signatures, similar to those for encryption schemes. Namely, CMA security implies RMA security, which in turn implies KOA security, for both the UF and EF goals.
In the first implication, the RMA oracle can be simulated by calling the CMA oracle on randomly selected messages. For the second implication, the adversary simply ignores the oracle.
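The simulation of the RMA oracle from a CMA oracle can be sketched as follows, using toy textbook RSA signatures (illustrative, insecure parameters; `cma_sign_oracle` stands in for the signing oracle S that the CMA game provides):

```python
import random

# Toy textbook RSA signatures: n = 3233 = 61*53, d = e^{-1} mod φ(n) = 2753.
n, e, d = 3233, 17, 2753

def cma_sign_oracle(m):
    # The CMA oracle signs any message chosen by the caller.
    return pow(m, d, n)

def simulated_rma_oracle(rng):
    # RMA behavior rebuilt on top of CMA: pick the random message ourselves
    # and forward it to the CMA oracle.
    m = rng.randrange(1, n)
    return m, cma_sign_oracle(m)

rng = random.Random(6)
m, s = simulated_rma_oracle(rng)
assert pow(s, e, n) == m  # the simulated pair verifies under pk = (n, e)
```

The simulated oracle's output distribution is exactly that of a real RMA oracle, which is all the reduction needs.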
On the other hand, EF-atk security implies UF-atk security for any atk∈{KOA,RMA,CMA}. Indeed, any valid UF forgery (that is, a valid message/signature pair for the message selected by the challenger) is also a valid EF forgery. In particular, the EF adversary simulates the UF challenger by choosing a random message, running the UF adversary on it, and outputting the resulting forgery.
Moreover, the strong EF-atk security also implies the plain one, because any plain EF-atk forgery is also considered as a valid strong EF-atk forgery.
In this section we describe some useful techniques for building security proofs. A security proof claiming that a particular cryptosystem achieves certain security level is often based on a reduction from a well-known problem family that is believed to be hard to the problem family associated to the formal game defining this security level.
A very easy example is proving that IND-CPA(ElGamal) is implied by the DDH Assumption. The proof consists of giving an explicit reduction from the DDH problem to the IND-CPA(ElGamal) problem. The particular proof described below is generic, meaning that it is valid for any instance generator (i.e., for any way to choose the group G and its generator g). This does not mean that IND-CPA(ElGamal) holds for any group family, but at least it holds whenever the DDH Assumption is true.
Let I() be the instance generator used in ElGamal encryption, that on the input of λ∈Z+ it outputs a tuple (G,q,g) where G is a cyclic group of prime order q, g is a generator of G and the bitlength of q is λ. The reduction is a black-box one, and it works as follows:
BBReduction B^{A0,A1}(G,q,g,U,V,W) {
param ← (G,q,g)
pk ← (param,U)
(m0,m1,st) ← A0(pk)
if m0∉G or m1∉G output 0 endif
b ←$ {0,1}
c* ← (V,W·mb)
b' ← A1(c*,st)
if b'=b output 1
else output 0
endif
}
Observe that B^{A0,A1} is seen by the attacker (A0,A1) as its own challenger in the game IND-CPA(ElGamal), while it is also the attacker in the following game formally defining DDH:
Game DDH(I,B,λ) {
(G,q,g) ← I(λ)
b ←$ {0,1}
u,v ←$ Zq
U ← g^u
V ← g^v
if b=1
W ← g^{uv}
else
W ←$ G
endif
b' ← B(G,q,g,U,V,W)
if b'=b output 1
else output 0
}
AdvDDH(I,B,λ) = |2Prob[DDH(I,B,λ)=1]-1|
Sometimes it is useful to split Game DDH into two separate games: DDH-0 and DDH-1, where DDH-b means that b is preset to the given value instead of choosing it at random. Clearly,
AdvDDH(I,B,λ) = |Prob[DDH-0(I,B,λ)=1] + Prob[DDH-1(I,B,λ)=1] - 1|
To proceed with the security proof by reduction, we start by considering the case b=1, that is, we compute Prob[DDH-1(I,B,λ)=1].
Let us define the view of an attacker A in a security game G, denoted as ViewA(G(A,λ)), as the sequence of random variables consisting of the inputs received from the challenger and the outputs sent by the attacker. When some extra oracles are given to the attacker, ViewA(G(A,λ)) also includes the corresponding oracle queries and responses. We say that two games G1 and G2 are identical for an attacker A if ViewA(G1(A,λ)) and ViewA(G2(A,λ)) are identically distributed random variables.
In our case, the two views ViewA0,A1(IND-CPA(ElGamal,(A0,A1),λ)) = (G,q,g,y,m0,m1,st,c*,b') and ViewA0,A1(DDH-1(I,B^{A0,A1},λ)) = (G,q,g,U,m0,m1,st,c*,b'), that is, the view of A in the composition of the DDH-1 challenger and the reduction B, are identically distributed. Indeed, (G,q,g) is drawn from I(λ) in both games, and y and U are both uniformly distributed group elements. Therefore, A0 receives identically distributed inputs in both games, and hence its outputs (m0,m1,st) are identically distributed as well.
Moving forward in the games, the challenge ciphertexts are generated identically, since the pair (V,W) in DDH-1 can be identified with (g^r,y^r), and the message encrypted in the ciphertext is selected in both games at random from the two messages produced by A0.
As a consequence, A1 also receives identically distributed inputs in both games (even conditioned to the previous components in the view), and then its output b' is also identically distributed in both games.
From the above considerations, we conclude that the event b'=1 in Game DDH-1(I,B^{A0,A1},λ) and the event b'=b in Game IND-CPA(ElGamal,(A0,A1),λ) occur with the same probability, and then
Prob[DDH-1(I,B^{A0,A1},λ)=1] = Prob[IND-CPA(ElGamal,(A0,A1),λ)=1].
The previous analysis is often summarized as the assertion “the reduction B in Game DDH-1 perfectly simulates the Game IND-CPA(ElGamal) for A”.
It remains to analyze the game DDH-0, in which the behavior is dramatically different: the two views ViewA0,A1(IND-CPA(ElGamal,(A0,A1),λ)) and ViewA0,A1(DDH-0(I,B^{A0,A1},λ)) have different probability distributions. Indeed, in the second game W is independent of (G,q,g,U,m0,m1,st,V), and therefore the second component of c* perfectly hides the encrypted message. Hence, it also perfectly hides the bit b used to select that message. This means that A1 receives no information about this bit, and then
Prob[DDH-0(I,B^{A0,A1},λ)=1] = 1/2.
Summarizing the two results:
AdvDDH(I,BA0,A1,λ) = |Prob[DDH-0(I,BA0,A1,λ)=1] + Prob[DDH-1(I,BA0,A1,λ)=1] - 1| =
=|Prob[IND-CPA(ElGamal,(A0,A1),λ)=1]-1/2| = AdvIND-CPA(ElGamal,(A0,A1),λ)/2.
This concludes the proof by reduction: if the DDH Assumption holds, then the above advantage must be negligible for any (A0,A1), which means that ElGamal encryption achieves IND-CPA security.
This last security proof is actually a very simple case: there are three games involved, DDH-0, DDH-1 and IND-CPA(ElGamal), and the proof consists of plugging DDH-1 into IND-CPA(ElGamal) in a way that shows that they are indeed the same game. Next, DDH-0 can be seen as a slight modification of DDH-1 (the two games differ only in the way W is generated), but when the copy of DDH-1 embedded into IND-CPA(ElGamal) is replaced by a copy of DDH-0, the resulting game gives advantage exactly 0 to any attacker (that is, perfect secrecy). As a consequence, any attack against IND-CPA(ElGamal) also serves to tell apart games DDH-0 and DDH-1, which would mean that the DDH problem is easy.
This line of reasoning can be rewritten as a sequence of games, (Game 0,...,Game n), with the following properties:
The first game in the sequence captures the security notion to be proved.
The last game is a trivial game, in which any attacker has zero advantage.
The “distance” between any two consecutive games in the sequence can be bounded.
From the triangle inequality, the distance between the first and the last games in the sequence is upper bounded by the sum of all intermediate distances, which is also an upper bound of the advantage of an attacker in the first game, since the last game in the sequence is trivial.
More formally,
AdvGame n(A,λ) = 0
AdvGame 0(A,λ) ≤ |AdvGame 1(A,λ)-AdvGame 0(A,λ)| + ... + |AdvGame n(A,λ)-AdvGame n-1(A,λ)|
In the IND-CPA(ElGamal) example, the sequence has two games: the first one is IND-CPA(ElGamal) and the second one is a trivial game. The distance between the two games is upper bounded by twice the advantage of an attacker against DDH, which we assume to be a negligible function.
We can consider three types of game hops in a sequence of games:
A trivial hop: Games i and i+1 are identical random experiments, that is, the views of A in both games are identically distributed. Thus,
AdvGame i(A,λ) = AdvGame i+1(A,λ).
This step usually corresponds to a way to rewrite a game in a more convenient way for subsequent steps.
A statistical hop: Games i and i+1 are almost identical. More precisely, there exist events Fi in Game i and Fi+1 in Game i+1 with identical and negligible probability pF, such that the view of A in Game i conditioned on “Fi does not occur” is identically distributed to its view in Game i+1 conditioned on “Fi+1 does not occur”. In this case,
|AdvGame i+1(A,λ) - AdvGame i(A,λ)| ≤ pF ∈negl(λ).
A computational hop: Any attacker A can be transformed into another one B solving a computational problem P, such that |AdvGame i+1(A,λ) - AdvGame i(A,λ)| is bounded by a polynomial function of the advantage of B. In this case, we need the assumption that the computational problem P is hard in order to conclude the security proof.
The result in the second type of game hop is based on a basic fact about random experiments, often called the Difference Lemma:
Lemma 1 (Difference Lemma). Let us consider two random experiments Exp1 and Exp2 producing two probability distributions with a common support X. If there exists an event F (defined as a predicate on X) that occurs with probability pF in both experiments, such that Exp1 conditioned on “F does not occur” is identical to Exp2 conditioned on “F does not occur”, then for any event A (also defined as a predicate on X), |ProbExp1[A] - ProbExp2[A]| ≤ pF.
Proof.
|ProbExp1[A] - ProbExp2[A]| =
= |ProbExp1[A | F]ProbExp1[F] + ProbExp1[A | ¬F]ProbExp1[¬F] - ProbExp2[A | F]ProbExp2[F] - ProbExp2[A | ¬F]ProbExp2[¬F]| =
= |ProbExp1[A | F] - ProbExp2[A | F]| pF ≤ pF,
where the terms conditioned on “F does not occur” cancel out, since ProbExp1[A | ¬F] = ProbExp2[A | ¬F] and ProbExp1[¬F] = ProbExp2[¬F] = 1-pF.
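The bound can be illustrated with a small Monte Carlo sketch (an illustration, not part of the proof): two coupled experiments share all their randomness, and the second one changes the outcome of A only when the rare event F occurs.

```python
import random

def run_experiment(modified: bool, n_trials: int = 200_000,
                   p_f: float = 0.05, seed: int = 1) -> float:
    """Estimate Prob[A] in one of two coupled experiments that behave
    identically whenever the rare event F does not occur."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        f = rng.random() < p_f      # the "failure" event F, with Prob[F] = p_f
        a = rng.random() < 0.5      # event A in the common (no-F) branch
        if f and modified:          # Exp2 differs from Exp1 only when F occurs
            a = True
        hits += a
    return hits / n_trials

p1 = run_experiment(modified=False)
p2 = run_experiment(modified=True)
# Difference Lemma: |Prob_Exp1[A] - Prob_Exp2[A]| <= Prob[F] = 0.05
assert abs(p1 - p2) <= 0.05
```

With a common seed both experiments draw the same randomness, so conditioned on “F does not occur” they are literally the same experiment, exactly as in the lemma.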
In the literature, more sophisticated structures than a sequence of games are considered, but the key idea is to modify the initial security game in small steps, so that every change can be analyzed in isolation from the others, even if dozens of steps are needed to complete the proof.
The length of a sequence of games used in a security proof is not necessarily a fixed constant. Indeed, it can depend on some parameters of the cryptosystem analyzed.
A classical example is a generic reduction from the IND-CPA security of an encryption scheme to the IND-CPA security of the same encryption scheme used to encrypt vectors of n messages. For instance, one can use ElGamal to encrypt a vector (m1,...,mn) as (g^r1, y^r1·m1, ..., g^rn, y^rn·mn). The question is how secure encrypting vectors is if we only know that ElGamal is secure for single messages.
There is a generic proof that works for any IND-CPA secure encryption scheme for single messages. In the IND-CPA security definition for the n-fold version of the basic encryption scheme (with a single key for all the n components)
Encn(pk,(m1,...,mn)) = (Enc(pk,m1),...,Enc(pk,mn))
the attacker's goal is to tell apart the encryptions of two vectors of n messages chosen by itself.
Consider the sequence of games (Game 0,...,Game n) such that Game i is like the IND-CPA security game for the n-fold encryption scheme but the first i components of the vector of messages encrypted in the challenge ciphertext are chosen from the first vector given by the adversary, and the last n-i are chosen from the second. In particular, the original IND-CPA game measures the difference between Games 0 and n.
The proof consists of a reduction from the IND-CPA security of the basic scheme to the difference of any two consecutive games in the sequence. As a consequence, the advantage of an attacker against IND-CPA of the n-fold encryption is at most n times the advantage of another attacker against the IND-CPA of the basic encryption scheme. This result gives an upper bound to the security degradation when the encryption scheme is used to encrypt vectors instead of single elements.
Let us see how the reduction works. The attacker (Bn,i,0,Bn,i,1) against IND-CPA of the basic scheme uses the attacker (A0,A1) that tries to tell apart Game i-1 and Game i. The former embeds its own challenge ciphertext as the i-th component of the ciphertext vector, so that the message encrypted in that component determines which of the two games the latter is playing.
BBReduction Bn,i,0(A0,A1)(pk) {
(m0,m1,stA) ← A0(pk)
for each j in {1,...,i-1}
cj ← Enc(pk,m0,j)
endfor
ci ← ⊥
for each j in {i+1,...,n}
cj ← Enc(pk,m1,j)
endfor
st ← (stA,c)
output (m0,i,m1,i,st)
}
BBReduction Bn,i,1(A0,A1)(c*,st) {
parse st = (stA,c)
ci ← c*
b' ← A1(c,stA)
output b'
}
The simulation of Game i is perfect when b=0 in the IND-CPA game, while Game i-1 is perfectly simulated when b=1.
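The two stages of Bn,i can be sketched as follows, using a toy randomized encryption function as a stand-in for Enc (the scheme and all parameters below are illustrative placeholders, not part of the text):

```python
import secrets

def enc(pk: int, m: int) -> tuple[int, int]:
    """Toy randomized encryption standing in for Enc: a fresh 32-bit mask r,
    ciphertext (r, m XOR r XOR pk). Illustrative only."""
    r = secrets.randbits(32)
    return (r, m ^ r ^ pk)

def dec(pk: int, ct: tuple[int, int]) -> int:
    r, s = ct
    return s ^ r ^ pk

def stage_one(pk, m0, m1, i):
    """B_{n,i,0}: encrypt slots 1..i-1 from m0 and slots i+1..n from m1;
    slot i is left empty for the external challenge ciphertext."""
    n = len(m0)
    c = [None] * n
    for j in range(0, i - 1):          # slots 1..i-1 (0-based 0..i-2)
        c[j] = enc(pk, m0[j])
    for j in range(i, n):              # slots i+1..n (0-based i..n-1)
        c[j] = enc(pk, m1[j])
    return (m0[i - 1], m1[i - 1]), c   # challenge pair for slot i, plus state

def stage_two(c, i, c_star):
    """B_{n,i,1}: plug the external challenge ciphertext into slot i."""
    c = list(c)
    c[i - 1] = c_star
    return c

pk = 7
m0, m1 = [1, 2, 3, 4], [5, 6, 7, 8]
(x0, x1), c = stage_one(pk, m0, m1, i=2)
c = stage_two(c, 2, enc(pk, x0))       # the external challenger encrypts x0 (b=0)

# with b=0 the vector decrypts to (m0[1..2], m1[3..4]), i.e., hybrid index 2
assert [dec(pk, cj) for cj in c] == [1, 2, 7, 8]
```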
The intermediate games are called hybrid games because they “interpolate” between the first and the last games in the sequence.
In some cases, the last result given in the previous section can be improved, and the proof works without the need of hybrid games. This is the case of ElGamal encryption. The trick is related to a property called random self-reducibility, which refers to the ability of transforming any arbitrary instance of a problem family into a uniformly distributed one with a probabilistic polynomial time algorithm. Diffie-Hellman problems (both the decisional and the computational ones) enjoy this property. For instance, in the decisional case, the following transformation:
(G,q,g,U,V,W) ↦ (G,q,g,U',V',W') = (G,q,g, U^α·g^β, V^γ·g^δ, W^(αγ)·U^(αδ)·V^(βγ)·g^(βδ)),
for α,γ ←$ Zq× and β,δ ←$ Zq,
maps any instance in the support of the distribution Db,λ (see Example 1 in Section 1.2) into a random instance following the distribution Db,λ, for both b=0 and b=1. Indeed, if U=g^u, V=g^v, W=g^w, U'=g^u', V'=g^v' and W'=g^w', we can write
u' = αu+β
v' = γv+δ
w' = u'v'+αγ(w-uv)
and then it is straightforward to show that (u',v',w') follows the right distribution for both values of b. Interestingly, (u',v',w') and (u,v,w), conditioned on b=0 or on b=1, are independent random vectors.
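The rerandomization map can be sanity-checked in a toy group, here the order-11 subgroup of Z23* generated by g=2 (illustrative parameters; a brute-force discrete logarithm is used only because the group is tiny):

```python
import random

# Sanity check of the DDH rerandomization map in the order-11
# subgroup of Z_23^* generated by g = 2 (toy parameters only).
p, q, g = 23, 11, 2

def dlog(h: int) -> int:
    """Brute-force discrete log base g (fine in a tiny group)."""
    x, acc = 0, 1
    while acc != h:
        acc = acc * g % p
        x += 1
    return x

rng = random.Random(0)
u, v = rng.randrange(q), rng.randrange(q)
U, V = pow(g, u, p), pow(g, v, p)
W = pow(g, u * v % q, p)                                  # a yes-instance: w = uv

alpha, gamma = rng.randrange(1, q), rng.randrange(1, q)   # drawn from Zq^*
beta, delta = rng.randrange(q), rng.randrange(q)          # drawn from Zq
U2 = pow(U, alpha, p) * pow(g, beta, p) % p
V2 = pow(V, gamma, p) * pow(g, delta, p) % p
W2 = (pow(W, alpha * gamma, p) * pow(U, alpha * delta, p) *
      pow(V, beta * gamma, p) * pow(g, beta * delta, p)) % p

# The output tuple is again a yes-instance: w' = u'v' mod q.
assert dlog(W2) == dlog(U2) * dlog(V2) % q
```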
This independence makes it possible to “amplify” the difference between the distributions D0,λ and D1,λ to their n-fold version with the same kind of transformation:
(G,q,g,U,V,W) ↦ (G,q,g,U1,V1,W1,...,Un,Vn,Wn)
where (Ui,Vi,Wi) = (U^αi·g^βi, V^γi·g^δi, W^(αiγi)·U^(αiδi)·V^(βiγi)·g^(βiδi)), for αi,γi ←$ Zq× and βi,δi ←$ Zq.
This trick makes it possible to prove the IND-CPA security of the n-fold version of ElGamal encryption with a constant number of games (i.e., a number of games independent of n). The idea is to transform a DDH challenge (U,V,W) into a challenge vector (U1,V1,W1,...,Un,Vn,Wn) and to use the latter to produce a valid encryption of a message vector m given by the adversary (when the DDH tuple follows the distribution D1), or a valid encryption of a random message vector independent of m (when the DDH tuple follows the distribution D0). We actually need here a special case of the transformation in which all Ui are equal to U, that is, all αi=1 and all βi=0. Therefore, the reduction works exactly in the same way as for plain ElGamal encryption.
BBReduction B(A0,A1)(G,q,g,U,V,W) {
param ← (G,q,g)
pk ← (param,U)
(m0,m1,st) ← A0(pk)
if m0∉G^n or m1∉G^n output 0
endif
b ←$ {0,1}
for each i in {1,...,n}
γi←$Zq×
δi←$Zq
Vi ← V^γi·g^δi
Wi ← W^γi·U^δi
c*,i ← (Vi, Wi·mb,i)
endfor
b' ← A1(c*,st)
if b'=b output 1
else output 0
endif
}
With this reduction, we obtain
AdvDDH(I,BA0,A1,λ) = AdvIND-CPA(ElGamaln,(A0,A1),λ)/2.
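The key fact used by the reduction, namely that each pair (Vi,Wi) derived from a D1 tuple behaves as fresh valid encryption randomness under pk=U, can be checked numerically in a toy group (illustrative parameters only):

```python
import random

# Toy check: when (U,V,W) comes from D1 (i.e., W = V^x with U = g^x), each
# rerandomized pair (Vi, Wi) satisfies Wi = Vi^x, so (Vi, Wi*m) is a fresh,
# correctly distributed ElGamal encryption under pk = U.
p, q, g = 23, 11, 2            # order-11 subgroup of Z_23^*, toy parameters
rng = random.Random(3)
x = rng.randrange(1, q)        # the (unknown to the reduction) secret key
U = pow(g, x, p)
V = pow(g, rng.randrange(q), p)
W = pow(V, x, p)               # D1 tuple: W = V^x

for _ in range(4):
    gi = rng.randrange(1, q)   # gamma_i drawn from Zq^*
    di = rng.randrange(q)      # delta_i drawn from Zq
    Vi = pow(V, gi, p) * pow(g, di, p) % p
    Wi = pow(W, gi, p) * pow(U, di, p) % p
    assert Wi == pow(Vi, x, p) # a valid encryption randomness pair under pk = U
```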
When formally defining a problem family there is some freedom to choose the notion of hardness. For instance, in the definition of DDH, a DDH instance is a tuple (G,q,g,U,V,W) drawn from one of two probability distributions that the attacker wants to tell apart. The freedom comes from the way we compute the attacker's advantage.
In the definition given in previous sections, this advantage is averaged over all problem instances that correspond to a particular value of the parameter λ, and the instance probability distribution comes from the instance generator I(). Thus, even when DDH achieves this notion of hardness, there could exist a few weak problem instances (provided they occur with a small enough probability). We call this kind of definition a high granularity hardness notion.
Finer levels of granularity are achieved when we ask the advantage of the attacker to be negligible for any choice of the group G in the range of the instance generator (medium granularity), or even for every particular generator of the group (low granularity). More precisely, in the medium granularity setting the attacker's advantage function is defined for every value of λ as the maximum of the advantages for each possible choice of the group G in the range of I(λ), but it is still averaged using the randomness contained in the generator and the components (U,V,W).
In the low granularity setting, the advantage is maximized over every possible choice of (G,q,g), and then the averaging is only made with respect to the randomness contained in the components (U,V,W).
Note that the finer the granularity, the stronger the hardness assumption. If a problem family is believed to be hard in the low granularity setting, then the values of (G,q,g) can be standardized for each value of λ, and then they can be shared among many users without any security degradation (this is actually the case of ElGamal encryption). However, some known reductions between problems related to DDH do not preserve the generator g, and then they cannot be directly applied in this setting.
Consider the following variant (actually, a particularization) of the CDH Problem, called Square in the Exponent (SE) Problem:
Example 8 (The Square in the Exponent Problem (SE)). Using the same instance generator I() as in DDH, the SE problem associated to I() is the following:
Given (G,q,g) obtained from I(λ), and the group element g^u, for a uniformly distributed u∈Zq, compute g^(u²).
The SE Assumption associated to the instance generator I() says that the success probability of any probabilistic polynomial time algorithm solving the SE problem is a negligible function in λ.
There exist reductions between CDH and SE in both directions, so that we can say that the two associated hardness assumptions are equivalent. We show below the two explicit reductions. The first one is a black-box reduction from SE to CDH:
BBReduction BACDH(G,q,g,U) { α ←$ Zq× W ← ACDH(G,q,g,U,U^α) output W^(1/α) }
Observe that simply taking α=1 does not work, because then the CDH instance received by ACDH is not properly distributed and hence, we do not know any bound on the attacker's advantage (it can be non-negligible on a negligible fraction of the instances without contradicting the CDH Assumption).
The other reduction is a bit more sophisticated:
BBReduction BASE(G,q,g,U,V) { W+ ← ASE(G,q,g,UV) W- ← ASE(G,q,g,U/V) output (W+/W-)^(1/4) }
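The algebra behind both reductions can be verified numerically in a toy group, with brute-force discrete logarithms playing the role of perfect attackers (an illustration, not part of the text; note that the answer of ACDH is corrected by the exponent α^(-1) mod q):

```python
# Numerical check of the algebra behind both reductions, in the order-11
# subgroup of Z_23^* (toy parameters; brute-force dlog plays the attacker).
p, q, g = 23, 11, 2

def dlog(h: int) -> int:
    x, acc = 0, 1
    while acc != h:
        acc = acc * g % p
        x += 1
    return x

def cdh(X: int, Y: int) -> int:         # a perfect CDH solver
    return pow(g, dlog(X) * dlog(Y) % q, p)

def se(X: int) -> int:                  # a perfect SE solver
    x = dlog(X)
    return pow(g, x * x % q, p)

u, v = 3, 7
U, V = pow(g, u, p), pow(g, v, p)

# SE from CDH: query (U, U^alpha) and correct the answer by 1/alpha mod q.
alpha = 5
W = cdh(U, pow(U, alpha, p))            # equals g^(alpha * u^2)
assert pow(W, pow(alpha, -1, q), p) == pow(g, u * u % q, p)

# CDH from SE: (W+ / W-)^(1/4) = g^(((u+v)^2 - (u-v)^2)/4) = g^(uv).
W_plus = se(U * V % p)
W_minus = se(U * pow(V, -1, p) % p)
result = pow(W_plus * pow(W_minus, -1, p) % p, pow(4, -1, q), p)
assert result == pow(g, u * v % q, p)
```

Modular inverses are computed with Python's three-argument pow (available since Python 3.8).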
The analysis of this reduction is far more subtle, because the reduction calls the attacker ASE twice, and it expects both calls to result in right answers. Therefore, we need two facts:
The instance received in each call is properly distributed.
We can find a lower bound of the probability that in the two calls, both answers are correct.
The first condition clearly holds, because UV and U/V are uniformly distributed group elements, conditioned on the instance (G,q,g). The problem arises in the second requirement. In the low granularity setting, (G,q,g) is fixed and then we can show that UV and U/V are independent random variables. Therefore, the events that each of the two calls gives the right answer are independent, and we are done. However, this approach does not directly work at the higher granularity levels, because both SE instances are correlated (they share the group and the generator). Hence we cannot claim independence in this case.
There are two ways to solve the problem. The first one is to randomize the second instance, replacing it by (G,q,g^α,(U/V)^α) and raising the second answer to the inverse of α mod q. This recovers the independence in the medium granularity setting. The second solution is finding a lower bound of the probability that the two answers are correct. It can be shown that, in the worst case, this probability equals that of the independent case:
Lemma 2. Let Exp be a random experiment producing a random vector (X,Y), and let A be an event related to (X,Y), that is, a predicate on the support of (X,Y). Consider now two independent realizations of the same experiment, producing vectors (X1,Y1) and (X2,Y2), and let A1 and A2 be the events that A occurs in the first and in the second realization of the experiment, respectively. Then
Prob[A1 and A2 | X1=X2] ≥ (Prob[A])^2.
Proof. Define px = Prob[X=x] and ax = Prob[A | X=x], for any x in the support of X. Clearly, Prob[A] = ∑x∈X ax·px and Prob[A1 and A2 | X1=X2] = ∑x∈X (ax)^2·px.
Consider now the dot product defined by the weight vector (px)x∈X, that is, 〈u,v〉 = ∑x∈X px·ux·vx.
Then, defining the vectors a=(ax)x∈X and 1=(1)x∈X, we can write
Prob[A] = 〈a,1〉 and
Prob[A1 and A2 | X1=X2] = 〈a,a〉.
Thus, by the Cauchy-Schwarz inequality, and since 〈1,1〉 = ∑x∈X px = 1,
(Prob[A])^2 = (〈a,1〉)^2 ≤ 〈a,a〉〈1,1〉 = 〈a,a〉 = Prob[A1 and A2 | X1=X2].
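The inequality at the heart of the proof, ∑x (ax)^2·px ≥ (∑x ax·px)^2, can be checked numerically for a random choice of weights and conditional probabilities (a quick illustration):

```python
import random

# Numerical illustration of Lemma 2: for any distribution p_x and conditional
# probabilities a_x, sum(a_x^2 * p_x) >= (sum(a_x * p_x))^2.
rng = random.Random(42)
n = 8
w = [rng.random() for _ in range(n)]
p = [wi / sum(w) for wi in w]            # a random distribution (p_x)
a = [rng.random() for _ in range(n)]     # a_x = Prob[A | X = x]

prob_A = sum(ai * pi for ai, pi in zip(a, p))            # <a, 1>
prob_A1A2 = sum(ai * ai * pi for ai, pi in zip(a, p))    # <a, a>
assert prob_A ** 2 <= prob_A1A2 + 1e-12                  # Cauchy-Schwarz
```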
This lemma solves the previous issues at all granularity levels, and it is a very useful tool when dealing with several calls to an attacker whose inputs can be split into a common part and an independent part.
There is still a problem with reductions calling the attacker more than once: the advantage of the reduction typically depends on a power of the attacker's advantage. This means that the advantage of the reduction can be much smaller than that of the attacker, to the point that it can make the reduction meaningless in practice. This is the reason why people are mainly interested in tight reductions, in which the two advantages and the two execution times are similar (e.g., they only differ in a small multiplicative constant).
The Random Oracle Model (ROM) was created as a formal model for secure hash functions, identifying them with truly random functions. This model is very convenient to build security proofs of very efficient and versatile cryptosystems. However, it is only an idealized model, because the description of a realization of a random function needs exponential space, and it cannot be handled by polynomial time algorithms. As a consequence, the so-called “security proofs” in the ROM are nothing more than heuristic security arguments, but they are usually interpreted as evidence that any attack against the cryptosystem must exploit a weakness of the corresponding hash function. In such a case, the scheme could be fixed by just replacing the hash function with a more robust one. To be fair, for most encryption or signature schemes used in real applications only security proofs in the ROM are known.
Even when truly random functions cannot be used by efficient algorithms, they can be efficiently simulated in a security proof. In fact, the view that a polynomial time algorithm (e.g., an attacker) has of a random function consists only of polynomially many evaluations. Therefore, a reduction can easily maintain a table of random values, giving a perfect simulation of the random function. Namely, a table containing pairs (x,H(x)) for a random function H : {0,1}* → Y, where Y is a finite set, can be simulated in the following way:
Start the simulation with an empty table.
For any new point x, check whether it is stored in the table.
If true, retrieve the known value of H(x) and output it.
Otherwise, choose a random element y∈Y and store the pair (x,y) in the table. Then, output y.
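The steps above correspond to a simple dictionary-based implementation; the class below is an illustrative sketch (names are placeholders):

```python
import secrets

class LazyRandomOracle:
    """Lazy simulation of a random function H : {0,1}* -> {0,1}^lam,
    as a reduction would maintain it."""

    def __init__(self, lam: int):
        self.lam = lam
        self.table: dict[bytes, int] = {}    # the pairs (x, H(x)) fixed so far

    def query(self, x: bytes) -> int:
        if x not in self.table:              # new point: sample H(x) fresh
            self.table[x] = secrets.randbits(self.lam)
        return self.table[x]                 # repeated queries stay consistent

H = LazyRandomOracle(lam=128)
assert H.query(b"hello") == H.query(b"hello")   # consistency on repeats
assert 0 <= H.query(b"world") < 2 ** 128        # outputs lie in {0,1}^128
```

Only the queried points are ever sampled, so the simulation runs in polynomial time even though a full description of H would need exponential space.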
As an example, we show below how the ROM can be used to prove the IND-CPA security of Hashed ElGamal, IND-CPA(HElGamal), under the CDH Assumption, which is believed to be strictly weaker than DDH. We start a sequence of games with Game 0, which is identical to the IND-CPA game except that the hash function of Hashed ElGamal encryption is now modelled as a random oracle, simulated by the challenger.
Game 0((A0,A1),λ) {
t ← EmptyTable()
(G,q,g) ← I(λ)
x ←$ Zq
y ← g^x
pk ← ((G,q,g),y)
(m0,m1,st) ← A0simH(pk)
r ←$ Zq
b ←$ {0,1}
c* ← (g^r, simH(y^r)⊕mb)
b' ← A1simH(c*,st)
if b'=b output 1
else output 0
endif
}
Oracle simH(w) {
if t.IsKeyInTable(w)
k ← t.GetValue(w)
else
k ←${0,1}λ
t.AddToTable(w,k)
endif
output k
}
Observe that the challenger also uses the simulated random oracle simH to build the challenge ciphertext. Otherwise, the simulation could be incorrect.
The next game, Game 1, is identical to the previous one from the point of view of the attacker. The main difference is the way the challenge ciphertext is generated.
Game 1((A0,A1),λ) {
t ← EmptyTable()
(G,q,g) ← I(λ)
x ←$ Zq
U ← g^x
V ←$ G
W ← V^x
pk ← ((G,q,g),U)
(m0,m1,st) ← A0simH(pk)
b ←$ {0,1}
c* ← (V,simH(W)⊕mb)
b' ← A1simH(c*,st)
if b'=b output 1
else output 0
endif
}
Oracle simH(w) {
if t.IsKeyInTable(w)
k ← t.GetValue(w)
else
k ←${0,1}λ
t.AddToTable(w,k)
endif
output k
}
Then,
|AdvGame 1((A0,A1),λ) - AdvGame 0((A0,A1),λ)| = 0.
The next game uses a random binary string in the challenge ciphertext, not extracted from simH.
Game 2((A0,A1),λ) {
t ← EmptyTable()
(G,q,g) ← I(λ)
x ←$ Zq
U ← gx
V ←$ G
k* ←${0,1}λ
pk ← ((G,q,g),U)
(m0,m1,st) ← A0simH(pk)
b ←$ {0,1}
c* ← (V,k*⊕mb)
b' ← A1simH(c*,st)
if b'=b output 1
else output 0
endif
}
Oracle simH(w) {
if t.IsKeyInTable(w)
k ← t.GetValue(w)
else
k ←${0,1}λ
t.AddToTable(w,k)
endif
output k
}
Consider the event F1,2 that V^x is in the table t when the game ends, which is equivalent to saying that at some point the attacker queried V^x to simH. Both games are identical if F1,2 does not occur (since in that case there is no contradiction in the simulation of simH). Then we can apply the Difference Lemma to show that
|AdvGame 2((A0,A1),λ) - AdvGame 1((A0,A1),λ)| ≤ Prob[F1,2].
However, in Game 2 the bit b is completely hidden from the attacker, because it is only used in the expression k*⊕mb, and b is completely masked by k*, which is used nowhere else. Indeed, the views of the attacker conditioned on b=0 and on b=1 are identically distributed. As a consequence, Game 2 is trivial, and AdvGame 2((A0,A1),λ) = 0.
Summing up,
AdvIND-CPA(HElGamalROM,(A0,A1),λ) = AdvGame 0((A0,A1),λ) = AdvGame 1((A0,A1),λ) ≤ Prob[F1,2].
To finish the proof, we show a reduction that transforms the attacker (A0,A1) into another attacker B that solves a CDH problem instance with advantage at least a polynomial fraction of Prob[F1,2]:
BBReduction B(A0,A1)(G,q,g,U,V) {
t ← EmptyTable()
k* ←${0,1}λ
pk ← ((G,q,g),U)
(m0,m1,st) ← A0simH(pk)
b ←$ {0,1}
c* ← (V,k*⊕mb)
b' ← A1simH(c*,st)
output t.GetARandomEntryValue()
}
Oracle simH(w) {
if t.IsKeyInTable(w)
k ← t.GetValue(w)
else
k ←${0,1}λ
t.AddToTable(w,k)
endif
output k
}
The method t.GetARandomEntryValue() selects at random one of the entries of the table t and outputs the stored value. The composition of the CDH game and the reduction B perfectly simulates Game 2 from the attacker's point of view. Notice that in Game 2 the secret key is only needed to generate the random group element U, which is now given as part of the CDH Problem instance.
Thus, conditioned on F1,2, the solution to the CDH Problem instance is one of the entries in the table, and then the reduction succeeds with probability at least Prob[F1,2]/QH, where QH is the number of queries to simH made by the attacker. In conclusion, we have shown that
AdvIND-CPA(HElGamalROM,(A0,A1),λ) ≤ Prob[F1,2] ≤ QH·AdvCDH(BA0,A1,λ).
This inequality ends the proof, because if the CDH Assumption holds then AdvCDH(BA0,A1,λ) is negligible, and QH is bounded by a polynomial. Then, AdvIND-CPA(HElGamalROM,(A0,A1),λ) is also negligible for any attacker.
Notice that the proof is far from being tight, because in practice a real attacker can perform a considerable amount of hash computations (e.g., QH could be as high as 2^30), and QH is precisely the loss factor of the reduction.
Another idealized analysis tool is the so-called Generic Group Model, in which the notion of group is abstracted to the point that an attacker can access the group functionality (e.g., neutral element, group operation and inverses) via some oracles. The goal is to analyze cryptosystems based on groups (like all cryptosystems related to the hardness of the Discrete Logarithm Problem) in a generic way, discarding any specific attack related to any particular representation of the group elements.
The abstract model behind the Generic Group Model is a map from a group G, say of prime order q (used in the real algorithm), to the polynomial ring Zq[X1,...,Xn], where Xi is a formal variable that replaces one of the group elements in the input of the attacker's algorithm. For instance, if two group elements, assigned to polynomials P1 and P2, are combined with the group operation, then the result is mapped to the polynomial P1+P2. The formal variables X1,...,Xn could also be considered as polynomials in other formal parameters T1,...,Td, to make it possible to encode in the polynomial ring some existing relations between the inputs of the attacker (e.g., if the inputs are taken from a DDH tuple).
This model is very limiting, but most of the known attacks on group-based cryptosystems are either avoidable by choosing a suitable group, or fit well into the Generic Group Model. Therefore, this model can be seen as a formal tool that rules out a large class of possible design mistakes affecting the security of a group-based cryptosystem.
The main benefit of the model is that it translates a problem from complexity theory into a purely algebraic one, and security proofs in the model often reduce to upper bounding the probability of finding polynomials of bounded degree vanishing at certain secret values.
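A minimal sketch of this bookkeeping, assuming two formal input variables and handles represented by coefficient vectors over Zq (all names and parameters below are illustrative):

```python
# Minimal bookkeeping sketch for the Generic Group Model: each element handle
# carries the linear polynomial over Z_q (coefficients for 1, X1, X2) that the
# attacker has computed, starting from g -> 1 and the inputs g^(X_i) -> X_i.
q = 11

def op(P1, P2):
    """Group operation on handles: multiplying elements adds polynomials."""
    return tuple((c1 + c2) % q for c1, c2 in zip(P1, P2))

def power(P, k):
    """Exponentiation by a known scalar k: scale the polynomial by k."""
    return tuple(c * k % q for c in P)

gen = (1, 0, 0)      # handle of g       -> polynomial 1
X1 = (0, 1, 0)       # handle of g^(X1)  -> polynomial X1
X2 = (0, 0, 1)       # handle of g^(X2)  -> polynomial X2

# computing g * (g^(X1))^2 * g^(X2) corresponds to the polynomial 1 + 2*X1 + X2
P = op(op(power(X1, 2), X2), gen)
assert P == (1, 2, 1)
```

A full generic-group oracle would additionally hand the attacker opaque labels and answer equality queries by comparing the underlying polynomials.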
In the previous sections we only addressed (positive) security results, where we show things like “If a problem family P is hard, then the cryptosystem S is secure in the security model M”. Other useful results in cryptography are the negative ones, which try to rule out the existence of some classes of reductions, like “there does not exist any black-box reduction in the Generic Group Model from problem A to problem B”.
The usual technique to obtain negative results is to build an explicit meta-reduction, that is, a transformation that converts any reduction (fulfilling some additional constraints) between two given problems into a successful attacker that solves a third problem (that can be the same as one of the other two).
Sometimes negative results are used to find lower bounds on the sizes of objects, the number of necessary queries or the loss factor of a reduction, for a restricted but still meaningful class of reductions (like black-box reductions, or key-preserving reductions). Typically, most known reductions fit into these restricted classes.