1 07-CryptoMath


“Few false ideas have more firmly gripped the minds of so many intelligent [people] than the one that,
if they just tried, they could invent a [convenient] cipher that no one could break.”

- David Kahn, in “The Codebreakers”

Discussion question:
What do you think underpins this sentiment?
Would you guess that in theory, a perfect, convenient cipher is a catch-22?
https://en.wikipedia.org/wiki/Catch-22_(logic)
E.g.,
“If all jobs require experience,
how can one get any experience,
until one gets a job that provides experience?”

1.1 Screencasts

1.2 Reading

Chapters 22-23
http://inventwithpython.com/cracking/

Read this brief review of crypto-related math:
https://en.wikibooks.org/wiki/Cryptography/Mathematical_Background

Read Appendix B on Modular arithmetic:
https://github.com/crypto101/crypto101.github.io/raw/master/Crypto101.pdf

1.3 Random numbers

Applied uses include generation of:

Keys for asymmetric public-key algorithms (not covered yet, up next).
A key to kick-start a pseudo-random stream key for symmetric stream cipher.
Symmetric key for use as a temporary session key.
Creating a digital envelope with as asymmetric keys.
Handshaking in crypto-systems to prevent replay attacks.
Session keys.

1.3.1 Random Number Requirements

https://www.2uo.de/myths-about-urandom/
(a good article)

1.3.1.1 Randomness criteria

Uniform distribution:
Frequency of occurrence of each of the numbers should be approximately the same (in the limit).

Independence:
Implies no one value in the sequence should be able to be inferred from the others.

1.3.1.2 Predictability

Each number should be statistically independent of other numbers in the sequence.
An opponent should not be able to predict future elements of the sequence,
on the basis of earlier elements,
event if a partner can, or should.

1.3.1.3 Random versus Pseudorandom

Cryptographic applications typically make use of algorithmic techniques for random number generation.
Pseudorandom algorithms are deterministic.
Therefore, they produce sequences of numbers that are not truly statistically random.

Pseudorandom numbers:
Satisfy statistical randomness tests.
Are possibly predictable (certainly given the seed and the algorithm).

True random number generator (TRNG):
Uses a non-deterministic (physical) source to produce randomness.
Most operate by measuring unpredictable natural physical processes,
e.g. radiation, gas discharge, leaky capacitors.
Increasingly provided on modern processors.

1.4 Logarithm and exponentiation review

https://en.wikipedia.org/wiki/Logarithm
Subtraction is the inverse of addition.
Division is the inverse of multiplication.
Logarithm is the inverse of exponentiation.
CS often variations of \(\log_2 x\) directly or indirectly.

Why?
CryptoMath/Logarithm_plots.png
We like to cut things in half!
We like to double things!

Generally:
\(\log_b x = y\)
\(b^y=x\)
\(b^{\log_b x}=x\)

Example:
\(\log_2 64 = 6\)
\(2^6=64\)
\(2^{\log_{2} 64}=64\)

The function:
\(\log_2 x\)
intersects x-axis at 1, and
passes through the points with coordinates (2, 1), (4, 2), and (8, 3), e.g.,
\(\log_2 8 = 3\)
\(2^3=8\)

1.4.1 Log cheatsheet

CryptoMath/logs.png
\(\log (nm) = \log n + \log m\)
\(\log (\frac{n}{m}) = \log n - \log m\)
\(\log(n^r) = r \log n\)
\(\log_a n = \frac{\log_b n}{\log_b a}\)
(base switch)

+++++++++++++++++++++
Cahoot-07.1

1.4.2 Modular exponentiation review

\(n^x \cdot n^y = n^{x+y}\)
\((n \cdot b)^x = n^x \cdot b^x\)
\(n^1 = n\)
\((n^x)^y = n^{x \cdot y}\)
Spend time on this last one!
How can we share a number publicly and produce a secret???
Demo an example in class.

All of the above equivalence statements are also true, mod p.
In particular, consider the last formula above,
where the below statements are all equivalent:
\((g^a \mod p)^b \mod p =\)
\(g^{ab} \mod p =\)
\(g^{ba} \mod p =\)
\((g^b \mod p)^a \mod p\)

https://en.wikipedia.org/wiki/Discrete_logarithm
Log’s equivalent operation for integers alone is the “discrete logarithm”.
We do this with modular arithmetic.

For example,
\(3^6 \equiv 9 \mod 15\)
\(\log_3 9 \equiv 6 \mod 15\)

Unlike modular inverses,
computing discrete logarithms is generally computationally hard.
It is similar to integer factorization.
There is no formal proof that computing discrete logarithms is intrinsically complex.
We just haven’t found any efficient algorithms to do it.
Given the ease going forward,
and the difficulty going “backward”,
we sometimes call this a one way function.

1.4.2.1 Recursive exponentiation

Recursive case:
bn = b * bn-1

Base case (termination):
b0 = 1
(or slightly faster: b1 = b)

Condition/test, which checks for the base:
if(n==0):

For example:

b4 =
b * b3 =
b * (b * b2) =
b * (b * (b * b1)) =
b * (b * (b * (b*b0)))

// Fully recursively in C++

double simpleExpRecur(int b, int n){
    double result = 1;
    cout << "Calling simpleExpRecur with b = "
         << b << " and n = " << n << endl;
    if(n == 0){
        return 1;
    }
    else if(n > 0){
        return b * simpleExpRecur(b, n - 1);
    }
    else{
        return 1 / simpleExpRecur(b, -n);
    }
    // can have return statements above or 1 here
}

1.4.2.2 Efficient exponentiation

https://en.wikipedia.org/wiki/Exponentiation_by_squaring

To obtain bn, do recursively:

if n is even,
    do b^(n//2) * b^(n//2)
if n is odd,
    do b * b^(n//2) * b^(n//2)
with base case
    b1 = b

Note:
n//2 is integer division

What is b62 ?

b62 = (b31)2
b31 = b(b15)2
b15 = b(b7)2
b7 = b(b3)2
b3 = b(b1)2
b1 = b

What is b61 ?

b61 = b(b30)2
b30 = (b15)2
b15 = b(b7)2
b7 = b(b3)2
b3 = b(b1)2
b1 = b

How many multiplications when counting from the bottom up?

// Efficiently in C++

long long expEff(const long long b, const int n){
    cout << "Calling expEff with b = "
         << b << " and n = " << n << endl;
    // this is a faster alternative to base case n==0 above, how much?
    if(n == 1){
        return b;
    }
    // bitwise faster than modulus to check even here
    else if((n & 1) == 0){
        return expEff(b * b, n / 2);
    }
    // odd
    else{
        return expEff(b * b, n / 2) * b;
    }
    // can have return statements above or 1 here
}

The algorithm is often used to generate RSA keys.

1.5 Basic Number Theories

We will review:
Greatest common divisor of integers.
Group and group operations.
Euler’s Theorem.

1.5.0.1 The division theorem among integers

If a and b are integers such that b > 0,
then there are unique integers q and r such that:
\(a = bq + r\)
where:
\(0 \le r < b\)

For example:
a = 17
b = 5
with q = 3 and r = 2
a = (3 * b) + 2

1.5.0.2 Linear combination

If a and b are integers,
the linear combination of a and b is a sum of the form ax + by,
where both x and y are integers.

What are some the linear combinations of 9x + 15y?
Among these combinations are:

-6 = 9 * 1    + 15 * (-1)
-3 = 9 * (-2) + 15 * 1
 0 = 9 * 0    + 15 * 0
 3 = 9 * 2    + 15 * (-1)
-6 = 9 * (-1) + 15 * 1

It can be shown that the all linear combinations of 9 and 15 include:
{…, -12, -9, -6, -3, 0, 3, 6, 9, 12, …}

1.5.0.3 The Greatest Common Divisor

The greatest common divisor (gcd) of two integers a and b, not both zero,
is the largest of the common divisors of a and b.
For example (note linear combination above):
gcd(9, 15) = 3

The greatest common divisor of the integers a and b, not both 0,
also happens to be the least positive integer that is a linear combination of a and b.
For example (note linear combination above):
gcd(9, 15) = 3

Basic facts related to GCD and divisibility.
These may aid in understanding some of the proofs to come:

Where:
x | y means that x divides y
The order of this | symbol the opposite of what you might think.

Fact 1:
d | a and d | b implies:
d | (am + bn)

Fact 2:
d | a and d | b implies:
d | gcd(a, b)

Fact 3:
gcd(0, 0) = 0 and
gcd(a, 0) = |a|

1.5.0.4 The Euclidean Algorithm

https://en.wikipedia.org/wiki/Euclidean_algorithm

For any non-negative integer a,
and any positive integer b,
gcd(a, b) = gcd(b, a mod b).

// Pseudocode:
Euclid(a, b)
    a and b are non-negative integers
    if b = 0 then
        return a
    else
        return Euclid(b, a mod b)
#!/usr/bin/python3
# -*- coding: utf-8 -*-

def gcd(a: int, b: int) -> int:
    """
    Euclid's Algorithm
    Return the Greatest Common Divisor of a and b
    """
    while a != 0:
        a, b = b % a, a
    return b

1.5.0.5 The Extended Euclidean Algorithm

https://en.wikipedia.org/wiki/Extended_Euclidean_algorithm

Suppose:
d = gcd(a, b).
The extended Euclidean algorithm finds the linear combinations of d,
in terms of a and b.
It is also used to find the modular multiplicative inverse of a number,
in a group mod n.
The algorithm is often used to generate RSA keys.

// Pseudocode

Extended_Euclid(a, b)
    a and b are non-negative integers, and / is floor division
    if b = 0 then
        return (a, 1, 0)
    else
        (g',  x',  y') = Extended_Euclid(b, a mod b)
        (g, x, y) = (g', y', x' - (a/b) y')
        return (g, x, y)
#!/usr/bin/python3
# -*- coding: utf-8 -*-

def findModInverse(a: int, m: int) -> Optional[int]:
    # Return the modular inverse of a % m, which is
    # the number x such that a*x % m = 1

    if gcd(a, m) != 1:
        # Ah dynamic typing!
        return None
        # No single mod inverse exists if a & m aren't relatively prime.

    # Calculate using the Extended Euclidean Algorithm:
    u1, u2, u3 = 1, 0, a
    v1, v2, v3 = 0, 1, m
    while v3 != 0:
        # Note that // is the integer division operator
        q = u3 // v3
        v1, v2, v3, u1, u2, u3 = (u1 - q * v1), (u2 - q * v2), (u3 - q * v3), v1, v2, v3
    return u1 % m

1.5.1 Groups

1.5.1.1 Basic Operations

A group G is a set of elements,
together with a binary operation (e.g., +),
such that:

  1. Closure under +:
    For every a and b in G,
    a + b is a unique element of G.

  2. Associative:
    For all a, b, and c in G,
    a + (b + c) = (a + b) + c

  3. Identity element:
    The set has a unique identity element e,
    for every element a of G,
    such that, a + e = e + a = a

  4. Inverse:
    Every a of G has a unique inverse a-1 in G,
    such that, a + a-1 = a-1 + a = e

1.5.1.2 Additive groups

\(Z^+_n\) = {0, 1, 3, …, n-1}
generally denotes an additive group,
with addition modulo n as the group operator.

Example:
Suppose n = 35
then \(Z^+_{35}\) = {0, 1, 2, …, 34}

Closure examples
If a = 7 and b = 13,
then a + b = 7 + 13 mod 35 = 20

If a = 17 and b = 33,
then a + b = 17 + 33 mod 35 = 15

Associative example
(3 + 11) + 27 % 35 = 6
3 + (11 + 27) % 35 = 6

Identity example
In \(Z^+_{35}\) the additive identity element is 0

Inverse example
In \(Z^+_{35}\) the inverse of 17 is 18.
(17 + 18) % 35 = 0
(18 + 17) % 35 = 0

1.5.1.3 Multiplicative groups

Group under multiplication of the invertible elements of a field, ring,
or other structure for which one of its operations is referred to as multiplication.

The multiplicative group of integers modulo n,
is the group under multiplication of the invertible elements Z/nZ
(those numbers lesser than,
n that are relatively prime to n,
with modular multiplicative inverses mod n).

When n is not prime,
there are elements other than zero that are not invertible.
The following generally denotes a multiplicative group,
with multiplication modulo n as the group operator:
\(Z^*_n = \{a| 0< a < n \land \gcd(a,n) = 1\}\)
The set of positive numbers,
less than n,
which have no shared factors with n.

Example:
Suppose n = 35, then
\(Z^*_{35} = {1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 16, 17, 18, 19, 22, 23, 24, 26, 27, 29, 31, 32, 33, 34}\)

Ask:
How many numbers are there?
Why are some missing?
What does this remind us of?

35 is NOT a prime number,
but n generally could be one…

Closure examples
If a = 8 and b = 13,
then a * b = 8 * 13 mod 35 = 34

If a = 17 and b = 33,
then a * b = 17 * 33 mod 35 = 1

Associative example
(3 * 11) * 27 % 35 = 16
3 * (11 * 27) % 35 = 16

Identity example
In \(Z^*_{35}\) the identity element is 1
9 * 1 = 9
3 * 1 = 3

Inverse example
In \(Z^*_{35}\) the inverse of 17 is 33.
(17 * 33) % 35 = 1
(33 * 17) % 35 = 1

Recall:
How to efficiently find the inverse of an element in a large multiplicative group?

1.5.1.4 Fermat’s little theorem

https://en.wikipedia.org/wiki/Fermat%27s_little_theorem

If p is a prime number, then for any integer a:
\(a^p \equiv a \pmod p\)
and also:
the number ap - a is an integer multiple of p,
(ap - a is equal to p times something)

For example, if a = 2 with modulus p = 7, then:
27 -2 = 126

27 = 128
128 // 7 = 18 with remainder 2
128 mod 7 = 2
7 × 18 = 126
which is an integer multiple of 7.

If a is not divisible by p,
then it is equivalent to stating that:
\(a^{p-1} \equiv 1 \pmod p\)
and ap-1 - 1 is an integer multiple of p,
(ap-1 - 1 is equal to p times something).

For example, if a = 2 and p = 7,
then 26 = 64,
and 7 × 9 = 64 - 1 = 63,
which is an integer multiple of 7.

Remember our trick with 1 in the multiplicative cipher?
Remember this for RSA!

+++++++++++++++++++++
Cahoot-07.2

1.5.2 Prime Numbers

“Mathematicians have tried in vain to this day to discover some order in the sequence of prime numbers,
and we have reason to believe that it is a mystery into which the human mind will never penetrate.”

- Leonhard Euler, 18th-century mathematician
https://en.wikipedia.org/wiki/Leonhard_Euler

Discussion question:
Do you have an intuition on this problem?
Must there be a pattern?
Might the sequence be fundamentally un-predictable?
What kind of computational methods might work to discover such a sequence?

Mention:
Student explorations into this.

1.5.2.1 Basic Definitions

A prime number is an integer greater than 1,
and is divisible only by 1 and itself.

A composite number is an integer greater than 1,
and is not a prime.

If gcd(a, b) = 1,
then two integers a and b are relatively prime.

For positive real numbers x,
let \(\pi(x)\) be defined as the number of prime numbers less than or equal to x.
What is this similar to above?

1.5.2.2 Important Theorems

1.5.2.2.1 Fundamental theorem of arithmetic

Every integer greater than 1 can be written as a product of primes,
and if the primes are written in non-decreasing order,
then this product of primes is unique.

1.5.2.2.2 The prime number theorem

There are infinitely many primes,
as demonstrated by Euclid around 300 BC.
No known simple formula separates prime numbers from composite numbers.
However, the distribution of primes within the natural numbers,
in the large limit, can be statistically modeled.
The probability of a randomly chosen number being prime,
is inversely proportional to its number of digits,
that is, to its logarithm.

The ratio of \(\pi(x)\) to \(x/ln(x)\) tends toward 1,
as x goes to infinity:
\(\lim_{x\rightarrow \infty} \frac{\pi(x)}{x/\ln x} = 1\)

1.5.2.3 Identifying Primes - Trial Division

https://en.wikipedia.org/wiki/Trial_division
Brute force from the bottom up:
CryptoMath/trial_div.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

import sys
from typing import List

def trial_division(n: int) -> List[int]:
    """Return a list of the prime factors for a natural number."""
    # Prepare an empty list.
    factors: List[int] = []
    # The first possible factor.
    possible_factor: int = 2
    # While n still has remaining factors...
    while n > 1:
        # The remainder of n divided by f might be zero.
        if n % possible_factor == 0:
            # If so, it divides n. Add f to the list.
            factors.append(possible_factor)
            # Divide that factor out of n.
            n //= possible_factor
        # But if f is not a factor of n,
        else:
            # Add one to f and try again.
            possible_factor += 1
    # Prime factors may be repeated: 12 factors to 2, 2, 3.
    return factors

if _name_ == "_main_":
    if len(sys.argv) == 2:
        print(trial_division(int(sys.argv[1])))
    else:
        print("Example trial division of 17: ", trial_division(17))

1.5.2.4 Identifying primes - Sieve of Eratosthenes

https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
The sieve of Eratosthenes is a simple, ancient algorithm,
for finding all prime numbers up to any given limit.
Mark as composite (i.e., not prime) the multiples of each prime,
starting with the first prime number, 2.

CryptoMath/siv2.jpeg

multiples of 2:
CryptoMath/siv0.jpeg

multiples of the next available number (prime 3)
CryptoMath/siv1.jpeg

CryptoMath/primeNum.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Prime Number Sieve
# https://www.nostarch.com/crackingcodes/ (BSD Licensed)

import math, random
from typing import List

def isPrimeTrialDiv(num: int) -> bool:
    # Returns True if num is a prime number, otherwise False.
    # Uses the trial division algorithm for testing primality.
    # All numbers less than 2 are not prime:
    if num < 2:
        return False
    # Is num divisible by any number up to the square root of num?
    for i in range(2, int(math.sqrt(num)) + 1):
        if num % i == 0:
            return False
    return True

def primeSieve(sieveSize: int) -> List[int]:
    # Returns a list of prime numbers calculated using
    # the Sieve of Eratosthenes algorithm.
    sieve = [True] * sieveSize
    # Zero and one are not prime numbers.
    sieve[0] = False
    sieve[1] = False
    # Create the sieve:
    for i in range(2, int(math.sqrt(sieveSize)) + 1):
        pointer = i * 2
        while pointer < sieveSize:
            sieve[pointer] = False
            pointer += i
    # Compile the list of primes:
    primes = []
    for i in range(sieveSize):
        if sieve[i] == True:
            primes.append(i)
    return primes

def rabinMiller(num: int) -> bool:
    # Returns True if num is a prime number.
    if num % 2 == 0 or num < 2:
        # Rabin-Miller doesn't work on even integers.
        return False
    if num == 3:
        return True
    s = num - 1
    t = 0
    while s % 2 == 0:
        # Keep halving s until it is odd (and use t
        # to count how many times we halve s):
        s = s // 2
        t += 1
    # Try to falsify num's primality 5 times.
    for trials in range(5):
        a = random.randrange(2, num - 1)
        v = pow(a, s, num)
        # (This test does not apply if v is 1.)
        if v != 1:
            i = 0
            while v != (num - 1):
                if i == t - 1:
                    return False
                else:
                    i = i + 1
                    v = (v**2) % num
    return True

# Most of the time we can quickly determine if num is not prime
# by dividing by the first few dozen prime numbers. This is quicker
# than rabinMiller(), but does not detect all composites.
LOW_PRIMES = primeSieve(100)

def isPrime(num: int) -> bool:
    # Return True if num is a prime number. This function does a quicker
    # prime number check before calling rabinMiller().
    if num < 2:
        # 0, 1, and negative numbers are not prime.
        return False
    # See if any of the low prime numbers can divide num:
    for prime in LOW_PRIMES:
        if num == prime:
            return True
        if num % prime == 0:
            return False
    # If all else fails, call rabinMiller() to determine if num is a prime:
    return rabinMiller(num)

def generateLargePrime(keysize: int = 1024) -> int:
    # Return a random prime number that is keysize bits in size:
    while True:
        num = random.randrange(2 ** (keysize - 1), 2 ** (keysize))
        if isPrime(num):
            return num

1.5.2.5 Euler’s theorem

https://en.wikipedia.org/wiki/Euler%27s_totient_function

Background and reminder:
Not every element of a complete residue system, modulo m,
has a modular multiplicative inverse,
for instance, zero never does.

After removing the elements of a complete residue system that are not relatively prime to m,
what is left is called a reduced residue system,
all of whose elements have modular multiplicative inverses.

The number of elements in a reduced residue system is \(\phi(m)\),
where \(\phi\) is Euler’s totient function,
i.e., the number of positive integers less than m,
that are relatively prime to m.

Euler’s totient function, \(\phi(n)\),
is the count of the positive integers up to a given integer n,
that are relatively prime to n.

https://en.wikipedia.org/wiki/Euler%27s_theorem
Let a and n be positive integers and relative primes.
and \(\phi(n)\) is Euler’s totient function, then:
\(a^{\phi(n)} \equiv 1 \pmod n\)

Further, there is a a special case of the totient.
Suppose we have two primes, p and q,
we take their product, n = p * q,
then is proven true that:
\(\phi(n) = (p-1)(q-1)\)

Suppose
n = 35
a = 17
\(\phi(35) = (5-1)(7-1) = 24\)
\(a^{\phi(n)} \mod n = 17^{\phi(35)} \mod 35 = 1\)

Ask:
This is a generalization of what above function?
Show them side by side.
Such a generalization is a big deal in math.

https://en.wikipedia.org/wiki/Fermat%27s_little_theorem#Generalizations

Remember our trick with 1 in the multiplicative cipher?
Remember this for RSA!

+++++++++++++++++++++
Cahoot-07.3

1.5.2.6 Fast and randomized primality testing

https://en.wikipedia.org/wiki/Primality_test

Euler’s theorem forms a foundation for identifying primes:

If a and n are relative prime, and
\(a^{\phi(n)} \mod n = 1\), then:
n is a probably a prime…

Miller-Rabin probabilistic primality test probably gives you the right answer:
https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test
If it says not-prime,
then it’s not prime.
If it says prime,
then there is a very, very, very small chance of it being composite.

1.5.3 Integer factorization

https://en.wikipedia.org/wiki/Integer_factorization
Integer factorization is the decomposition of a composite number,
into a product of smaller integers.
If these integers are further restricted to prime numbers,
then the process is called prime factorization.
When the numbers are sufficiently large, no efficient,
non-quantum integer factorization algorithm is known.
Related to discrete logarithm, how?

Briefly mention:
Memristive and quantum work on this upcoming in the next lecture!

1.5.4 Some algorithms are quantum resistant

https://www.ntru.org/f/hps98.pdf