07-CryptoMath

“Few false ideas have more firmly gripped the minds of so many intelligent [people] than the one that,
if they just tried, they could invent a [convenient] cipher that no one could break.”
- David Kahn, in “The Codebreakers”

Discussion question:
What do you think underpins this sentiment?
Would you guess that in theory, a perfect, convenient cipher is a catch-22?
https://en.wikipedia.org/wiki/Catch-22_(logic)
E.g.,
“If all jobs require experience,
how can one get any experience,
until one gets a job that provides experience?”

1.1 Screencasts

1.2 Reading

1.3 Random numbers

Keys for asymmetric public-key algorithms (not covered yet, up next).
A key to kick-start a pseudo-random stream key for symmetric stream cipher.
Symmetric key for use as a temporary session key.
Creating a digital envelope with as asymmetric keys.
Handshaking in crypto-systems to prevent replay attacks.
Session keys.

1.3.1 Random Number Requirements

1.3.1.1 Randomness criteria

Uniform distribution:
Frequency of occurrence of each of the numbers should be approximately the same (in the limit).

Independence:
Implies no one value in the sequence should be able to be inferred from the others.

1.3.1.2 Predictability

Each number should be statistically independent of other numbers in the sequence.
An opponent should not be able to predict future elements of the sequence,
on the basis of earlier elements,
event if a partner can, or should.

1.3.1.3 Random versus Pseudorandom

Cryptographic applications typically make use of algorithmic techniques for random number generation.
Pseudorandom algorithms are deterministic.
Therefore, they produce sequences of numbers that are not truly statistically random.

Pseudorandom numbers:
Satisfy statistical randomness tests.
Are possibly predictable (certainly given the seed and the algorithm).

True random number generator (TRNG):
Uses a non-deterministic (physical) source to produce randomness.
Most operate by measuring unpredictable natural physical processes,
e.g. radiation, gas discharge, leaky capacitors.
Increasingly provided on modern processors.

1.4 Logarithm and exponentiation review

https://en.wikipedia.org/wiki/Logarithm
Subtraction is the inverse of addition.
Division is the inverse of multiplication.
Logarithm is the inverse of exponentiation.
CS often variations of \(\log_2 x\) directly or indirectly.

The function:
\(\log_2 x\)
intersects x-axis at 1, and
passes through the points with coordinates (2, 1), (4, 2), and (8, 3), e.g.,
\(\log_2 8 = 3\)
\(2^3=8\)

1.4.1 Log cheatsheet

\(\log (nm) = \log n + \log m\)
\(\log (\frac{n}{m}) = \log n - \log m\)
\(\log(n^r) = r \log n\)
\(\log_a n = \frac{\log_b n}{\log_b a}\)
(base switch)

1.4.2 Modular exponentiation review

\(n^x \cdot n^y = n^{x+y}\)
\((n \cdot b)^x = n^x \cdot b^x\)
\(n^1 = n\)
\((n^x)^y = n^{x \cdot y}\)
Spend time on this last one!
How can we share a number publicly and produce a secret???
Demo an example in class.

All of the above equivalence statements are also true, mod p.
In particular, consider the last formula above,
where the below statements are all equivalent:
\((g^a \mod p)^b \mod p =\)
\(g^{ab} \mod p =\)
\(g^{ba} \mod p =\)
\((g^b \mod p)^a \mod p\)

Unlike modular inverses,
computing discrete logarithms is generally computationally hard.
It is similar to integer factorization.
There is no formal proof that computing discrete logarithms is intrinsically complex.
We just haven’t found any efficient algorithms to do it.
Given the ease going forward,
and the difficulty going “backward”,
we sometimes call this a one way function.

1.4.2.1 Recursive exponentiation

// Fully recursively in C++

double simpleExpRecur(int b, int n){
    double result = 1;
    cout << "Calling simpleExpRecur with b = "
         << b << " and n = " << n << endl;
    if(n == 0){
        return 1;
    }
    else if(n > 0){
        return b * simpleExpRecur(b, n - 1);
    }
    else{
        return 1 / simpleExpRecur(b, -n);
    }
    // can have return statements above or 1 here
}

1.4.2.2 Efficient exponentiation

b⁶² = (b³¹)²
b³¹ = b(b¹⁵)²
b¹⁵ = b(b⁷)²
b⁷ = b(b³)²
b³ = b(b¹)²
b¹ = b

b⁶¹ = b(b³⁰)²
b³⁰ = (b¹⁵)²
b¹⁵ = b(b⁷)²
b⁷ = b(b³)²
b³ = b(b¹)²
b¹ = b

// Efficiently in C++

long long expEff(const long long b, const int n){
    cout << "Calling expEff with b = "
         << b << " and n = " << n << endl;
    // this is a faster alternative to base case n==0 above, how much?
    if(n == 1){
        return b;
    }
    // bitwise faster than modulus to check even here
    else if((n & 1) == 0){
        return expEff(b * b, n / 2);
    }
    // odd
    else{
        return expEff(b * b, n / 2) * b;
    }
    // can have return statements above or 1 here
}

1.5 Basic Number Theories

We will review:
Greatest common divisor of integers.
Group and group operations.
Euler’s Theorem.

1.5.0.1 The division theorem among integers

If a and b are integers such that b > 0,
then there are unique integers q and r such that:
\(a = bq + r\)
where:
\(0 \le r < b\)

1.5.0.2 Linear combination

If a and b are integers,
the linear combination of a and b is a sum of the form ax + by,
where both x and y are integers.

What are some the linear combinations of 9x + 15y?
Among these combinations are:

It can be shown that the all linear combinations of 9 and 15 include:
{…, -12, -9, -6, -3, 0, 3, 6, 9, 12, …}

1.5.0.3 The Greatest Common Divisor

The greatest common divisor (gcd) of two integers a and b, not both zero,
is the largest of the common divisors of a and b.
For example (note linear combination above):
gcd(9, 15) = 3

The greatest common divisor of the integers a and b, not both 0,
also happens to be the least positive integer that is a linear combination of a and b.
For example (note linear combination above):
gcd(9, 15) = 3

Basic facts related to GCD and divisibility.
These may aid in understanding some of the proofs to come:

Where:
x | y means that x divides y
The order of this | symbol the opposite of what you might think.

1.5.0.4 The Euclidean Algorithm

For any non-negative integer a,
and any positive integer b,
gcd(a, b) = gcd(b, a mod b).

#!/usr/bin/python3
# -*- coding: utf-8 -*-

def gcd(a: int, b: int) -> int:
    """
    Euclid's Algorithm
    Return the Greatest Common Divisor of a and b
    """
    while a != 0:
        a, b = b % a, a
    return b

1.5.0.5 The Extended Euclidean Algorithm

Suppose:
d = gcd(a, b).
The extended Euclidean algorithm finds the linear combinations of d,
in terms of a and b.
It is also used to find the modular multiplicative inverse of a number,
in a group mod n.
The algorithm is often used to generate RSA keys.

#!/usr/bin/python3
# -*- coding: utf-8 -*-

def findModInverse(a: int, m: int) -> Optional[int]:
    # Return the modular inverse of a % m, which is
    # the number x such that a*x % m = 1

    if gcd(a, m) != 1:
        # Ah dynamic typing!
        return None
        # No single mod inverse exists if a & m aren't relatively prime.

    # Calculate using the Extended Euclidean Algorithm:
    u1, u2, u3 = 1, 0, a
    v1, v2, v3 = 0, 1, m
    while v3 != 0:
        # Note that // is the integer division operator
        q = u3 // v3
        v1, v2, v3, u1, u2, u3 = (u1 - q * v1), (u2 - q * v2), (u3 - q * v3), v1, v2, v3
    return u1 % m

1.5.1 Groups

1.5.1.1 Basic Operations

A group G is a set of elements,
together with a binary operation (e.g., +),
such that:

1.5.1.2 Additive groups

\(Z^+_n\) = {0, 1, 3, …, n-1}
generally denotes an additive group,
with addition modulo n as the group operator.

Inverse example
In \(Z^+_{35}\) the inverse of 17 is 18.
(17 + 18) % 35 = 0
(18 + 17) % 35 = 0

1.5.1.3 Multiplicative groups

Group under multiplication of the invertible elements of a field, ring,
or other structure for which one of its operations is referred to as multiplication.

The multiplicative group of integers modulo n,
is the group under multiplication of the invertible elements Z/nZ
(those numbers lesser than,
n that are relatively prime to n,
with modular multiplicative inverses mod n).

When n is not prime,
there are elements other than zero that are not invertible.
The following generally denotes a multiplicative group,
with multiplication modulo n as the group operator:
\(Z^*_n = \{a| 0< a < n \land \gcd(a,n) = 1\}\)
The set of positive numbers,
less than n,
which have no shared factors with n.

Example:
Suppose n = 35, then
\(Z^*_{35} = {1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 16, 17, 18, 19, 22, 23, 24, 26, 27, 29, 31, 32, 33, 34}\)

Ask:
How many numbers are there?
Why are some missing?
What does this remind us of?

Identity example
In \(Z^*_{35}\) the identity element is 1
9 * 1 = 9
3 * 1 = 3

Inverse example
In \(Z^*_{35}\) the inverse of 17 is 33.
(17 * 33) % 35 = 1
(33 * 17) % 35 = 1

Recall:
How to efficiently find the inverse of an element in a large multiplicative group?

1.5.1.4 Fermat’s little theorem

If p is a prime number, then for any integer a:
\(a^p \equiv a \pmod p\)
and also:
the number a^p - a is an integer multiple of p,
(a^p - a is equal to p times something)

2⁷ = 128
128 // 7 = 18 with remainder 2
128 mod 7 = 2
7 × 18 = 126
which is an integer multiple of 7.

If a is not divisible by p,
then it is equivalent to stating that:
\(a^{p-1} \equiv 1 \pmod p\)
and a^p-1 - 1 is an integer multiple of p,
(a^p-1 - 1 is equal to p times something).

For example, if a = 2 and p = 7,
then 2⁶ = 64,
and 7 × 9 = 64 - 1 = 63,
which is an integer multiple of 7.

1.5.2 Prime Numbers

“Mathematicians have tried in vain to this day to discover some order in the sequence of prime numbers,
and we have reason to believe that it is a mystery into which the human mind will never penetrate.”
- Leonhard Euler, 18th-century mathematician
https://en.wikipedia.org/wiki/Leonhard_Euler

Discussion question:
Do you have an intuition on this problem?
Must there be a pattern?
Might the sequence be fundamentally un-predictable?
What kind of computational methods might work to discover such a sequence?

1.5.2.1 Basic Definitions

A prime number is an integer greater than 1,
and is divisible only by 1 and itself.

For positive real numbers x,
let \(\pi(x)\) be defined as the number of prime numbers less than or equal to x.
What is this similar to above?

1.5.2.2 Important Theorems

1.5.2.2.1 Fundamental theorem of arithmetic

Every integer greater than 1 can be written as a product of primes,
and if the primes are written in non-decreasing order,
then this product of primes is unique.

1.5.2.2.2 The prime number theorem

There are infinitely many primes,
as demonstrated by Euclid around 300 BC.
No known simple formula separates prime numbers from composite numbers.
However, the distribution of primes within the natural numbers,
in the large limit, can be statistically modeled.
The probability of a randomly chosen number being prime,
is inversely proportional to its number of digits,
that is, to its logarithm.

The ratio of \(\pi(x)\) to \(x/ln(x)\) tends toward 1,
as x goes to infinity:
\(\lim_{x\rightarrow \infty} \frac{\pi(x)}{x/\ln x} = 1\)

1.5.2.3 Identifying Primes - Trial Division

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import sys
from typing import List

def trial_division(n: int) -> List[int]:
    """Return a list of the prime factors for a natural number."""
    # Prepare an empty list.
    factors: List[int] = []
    # The first possible factor.
    possible_factor: int = 2
    # While n still has remaining factors...
    while n > 1:
        # The remainder of n divided by f might be zero.
        if n % possible_factor == 0:
            # If so, it divides n. Add f to the list.
            factors.append(possible_factor)
            # Divide that factor out of n.
            n //= possible_factor
        # But if f is not a factor of n,
        else:
            # Add one to f and try again.
            possible_factor += 1
    # Prime factors may be repeated: 12 factors to 2, 2, 3.
    return factors

if _name_ == "_main_":
    if len(sys.argv) == 2:
        print(trial_division(int(sys.argv[1])))
    else:
        print("Example trial division of 17: ", trial_division(17))

1.5.2.4 Identifying primes - Sieve of Eratosthenes

https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
The sieve of Eratosthenes is a simple, ancient algorithm,
for finding all prime numbers up to any given limit.
Mark as composite (i.e., not prime) the multiples of each prime,
starting with the first prime number, 2.

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Prime Number Sieve
# https://www.nostarch.com/crackingcodes/ (BSD Licensed)

import math, random
from typing import List

def isPrimeTrialDiv(num: int) -> bool:
    # Returns True if num is a prime number, otherwise False.
    # Uses the trial division algorithm for testing primality.
    # All numbers less than 2 are not prime:
    if num < 2:
        return False
    # Is num divisible by any number up to the square root of num?
    for i in range(2, int(math.sqrt(num)) + 1):
        if num % i == 0:
            return False
    return True

def primeSieve(sieveSize: int) -> List[int]:
    # Returns a list of prime numbers calculated using
    # the Sieve of Eratosthenes algorithm.
    sieve = [True] * sieveSize
    # Zero and one are not prime numbers.
    sieve[0] = False
    sieve[1] = False
    # Create the sieve:
    for i in range(2, int(math.sqrt(sieveSize)) + 1):
        pointer = i * 2
        while pointer < sieveSize:
            sieve[pointer] = False
            pointer += i
    # Compile the list of primes:
    primes = []
    for i in range(sieveSize):
        if sieve[i] == True:
            primes.append(i)
    return primes

def rabinMiller(num: int) -> bool:
    # Returns True if num is a prime number.
    if num % 2 == 0 or num < 2:
        # Rabin-Miller doesn't work on even integers.
        return False
    if num == 3:
        return True
    s = num - 1
    t = 0
    while s % 2 == 0:
        # Keep halving s until it is odd (and use t
        # to count how many times we halve s):
        s = s // 2
        t += 1
    # Try to falsify num's primality 5 times.
    for trials in range(5):
        a = random.randrange(2, num - 1)
        v = pow(a, s, num)
        # (This test does not apply if v is 1.)
        if v != 1:
            i = 0
            while v != (num - 1):
                if i == t - 1:
                    return False
                else:
                    i = i + 1
                    v = (v**2) % num
    return True

# Most of the time we can quickly determine if num is not prime
# by dividing by the first few dozen prime numbers. This is quicker
# than rabinMiller(), but does not detect all composites.
LOW_PRIMES = primeSieve(100)

def isPrime(num: int) -> bool:
    # Return True if num is a prime number. This function does a quicker
    # prime number check before calling rabinMiller().
    if num < 2:
        # 0, 1, and negative numbers are not prime.
        return False
    # See if any of the low prime numbers can divide num:
    for prime in LOW_PRIMES:
        if num == prime:
            return True
        if num % prime == 0:
            return False
    # If all else fails, call rabinMiller() to determine if num is a prime:
    return rabinMiller(num)

def generateLargePrime(keysize: int = 1024) -> int:
    # Return a random prime number that is keysize bits in size:
    while True:
        num = random.randrange(2 ** (keysize - 1), 2 ** (keysize))
        if isPrime(num):
            return num

1.5.2.5 Euler’s theorem

Background and reminder:
Not every element of a complete residue system, modulo m,
has a modular multiplicative inverse,
for instance, zero never does.

After removing the elements of a complete residue system that are not relatively prime to m,
what is left is called a reduced residue system,
all of whose elements have modular multiplicative inverses.

The number of elements in a reduced residue system is \(\phi(m)\),
where \(\phi\) is Euler’s totient function,
i.e., the number of positive integers less than m,
that are relatively prime to m.

Euler’s totient function, \(\phi(n)\),
is the count of the positive integers up to a given integer n,
that are relatively prime to n.

Further, there is a a special case of the totient.
Suppose we have two primes, p and q,
we take their product, n = p * q,
then is proven true that:
\(\phi(n) = (p-1)(q-1)\)

Suppose
n = 35
a = 17
\(\phi(35) = (5-1)(7-1) = 24\)
\(a^{\phi(n)} \mod n = 17^{\phi(35)} \mod 35 = 1\)

Ask:
This is a generalization of what above function?
Show them side by side.
Such a generalization is a big deal in math.

1.5.2.6 Fast and randomized primality testing

If a and n are relative prime, and
\(a^{\phi(n)} \mod n = 1\), then:
n is a probably a prime…

Miller-Rabin probabilistic primality test probably gives you the right answer:
https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test
If it says not-prime,
then it’s not prime.
If it says prime,
then there is a very, very, very small chance of it being composite.

1.5.3 Integer factorization

https://en.wikipedia.org/wiki/Integer_factorization
Integer factorization is the decomposition of a composite number,
into a product of smaller integers.
If these integers are further restricted to prime numbers,
then the process is called prime factorization.
When the numbers are sufficiently large, no efficient,
non-quantum integer factorization algorithm is known.
Related to discrete logarithm, how?

Briefly mention:
Memristive and quantum work on this upcoming in the next lecture!

1 07-CryptoMath