1 03-TranspositionCiphers


Previous: 02-IntroCryptoCaesar.html

Q: How can you tell the difference between an encrypted joke and a random string of characters?
A: You can’t. They’re indistinguishable without the key.

1.1 Screencasts

1.2 Reading

http://inventwithpython.com/cracking/ Chapters 7-12

1.3 Transposition cipher basics

https://en.wikipedia.org/wiki/Transposition_cipher
A transposition cipher is a method of encryption,
by which the positions held by units of plaintext are shifted according to a regular system,
so that the ciphertext constitutes a permutation of the plaintext.
Plaintext units being shifted are commonly characters or groups of characters.
The order of the units is changed (the plaintext is reordered).
Some variations of this algorithm can actually be quite strong and secure.
It comes in many variations:
Rail Fence cipher (zig-zag)
Route cipher
Columnar transposition (today)
Double transposition
Myszkowski transposition
Disrupted transposition
Scytale (this wood and leather thing below)
03-TranspositionCiphers/398px-Skytale.png
Grilles (masking, like the matrix below)
03-TranspositionCiphers/Tangiers1.png03-TranspositionCiphers/Tangiers2.png
In “the old days” a spy could embed a message in a newspaper,
and apply a mask to it!

Important note:
This, like the other basic algorithms we are covering,
might seem like archaic toys.
However, the principles and direct matrix operations resemble those employed in AES,
and other strong, valid, modern encryption algorithms,
which employ substitution and transformation in similar ways to this algorithm!
Further, variations on this masking paradigm could be exceptionally strong,
because the set of masks in the human world is so huge.

1.3.1 Columnar encryption

A key is any secret used as input to an encryption algorithm,
with the plaintext message, to produce the ciphertext.

In this case, a key of 8 defines a row length of 8:
03-TranspositionCiphers/00066.jpeg
Write in the message, with spaces, wrapping into a matrix that is K wide (8 here):
03-TranspositionCiphers/00067.jpeg
Common sense is not so common.
03-TranspositionCiphers/00068.jpeg
… and ignore the empty boxes at the end.

To formalize for computation, number the positions, starting at 0.
Notice a theme in formalizing?
03-TranspositionCiphers/00069.jpeg
To encrypt, read column-wise, rather than row-wise:
03-TranspositionCiphers/00070.jpeg
Cenoonommstmme oo snnio. s s c
03-TranspositionCiphers/00071.jpeg
How to structure this transformation with a loop?
Which loop, out or inner, correspond to each of the pictures below?

Inner or outer loop?
03-TranspositionCiphers/00073.jpeg
Inner our outer loop?
03-TranspositionCiphers/00074.jpeg
Could this have been coded differently?

1.3.1.1 Transposition encryption code

Trace transpositionEncrypt.py
03-TranspositionCiphers/transpositionEncrypt.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Encryption
# https://www.nostarch.com/crackingcodes (BSD Licensed)

def main() -> None:
    myMessage = "Common sense is not so common."
    myKey = 8
    ciphertext = encryptMessage(myKey, myMessage)
    # Print the encrypted string in ciphertext to the screen,
    # with a | ("pipe" character) after it,
    # in case there are spaces at # the end of the encrypted message.
    print(ciphertext + "|")

def encryptMessage(key: int, message: str) -> str:
    # Each string in ciphertext represents a column in the grid.
    ciphertext = [""] * key
    # Loop through each column in ciphertext.
    for column in range(key):
        currentIndex = column
        # Keep looping until currentIndex goes past the message length.
        while currentIndex < len(message):
            # Place the character at currentIndex in message at the
            # end of the current column in the ciphertext list.
            ciphertext[column] += message[currentIndex]
            # move currentIndex over
            currentIndex += key
    # Convert the ciphertext list into a single string value and return it.
    return "".join(ciphertext)

# If transpositionEncrypt.py is run (instead of imported as a module) call
# the main() function.
if _name_ == "_main_":
    main()
03-TranspositionCiphers/transpositionEncrypt_cfg.svg

++++++++++++++++++++
Cahoot-03.1

1.3.2 Columnar decryption

To find the number of columns for a particular message,
divide the length of the ciphertext message by the key,
and if the result isn’t already a whole number,
then round up to the nearest whole number.
The length of the ciphertext is 30 characters
(the same as the plaintext) and the key is 8,
so 30 divided by 8 is 3.75, rounded up to 4.
03-TranspositionCiphers/00000.jpeg
How to loop through the below structure?
Does the operation in code resemble encryption?
Is it identical?
03-TranspositionCiphers/00075.jpeg

1.3.2.1 Decryption code

Trace transpositionDecrypt.py
03-TranspositionCiphers/transpositionDecrypt.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Decryption
# https://www.nostarch.com/crackingcodes (BSD Licensed)

import math

def main() -> None:
    myMessage = "Cenoonommstmme oo snnio. s s c"
    myKey = 8
    plaintext = decryptMessage(myKey, myMessage)
    # Print with a | ("pipe" character) after it in case
    # there are spaces at the end of the decrypted message.
    print(plaintext + "|")

def decryptMessage(key: int, message: str) -> str:
    # The transposition decrypt function will simulate the "columns" and
    # "rows" of the grid that the plaintext is written on by using a list
    # of strings. First, we need to calculate a few values.
    # The number of "columns" in our transposition grid:
    numOfColumns = int(math.ceil(len(message) / float(key)))
    # The number of "rows" in our grid will need:
    numOfRows = key
    # The number of "shaded boxes" in the last "column" of the grid:
    numOfShadedBoxes = (numOfColumns * numOfRows) - len(message)
    # Each string in plaintext represents a column in the grid.
    plaintext = [""] * numOfColumns
    # The column and row variables point to where in the grid the next
    # character in the encrypted message will go.
    column = 0
    row = 0
    for symbol in message:
        plaintext[column] += symbol
        column += 1  # Point to next column.
        # If there are no more columns OR we're at a shaded box, go back to
        # the first column and the next row:
        if (column == numOfColumns) or (
            column == numOfColumns - 1 and row >= numOfRows - numOfShadedBoxes
        ):
            column = 0
            row += 1
    return "".join(plaintext)

# If transpositionDecrypt.py is run (instead of imported as a module) call
# the main() function.
if _name_ == "_main_":
    main()
03-TranspositionCiphers/transpositionDecrypt_cfg.svg

++++++++++++++++++++
Cahoot-03.2

1.4 Formal specs

X: 1, 2, 3… 26 (the symbol set)
Y: 1, 2, 3… 26 (the symbol set)
K: ~ message length / 2
E: ?
D: ?

Recall:
In general, a key is just a secret, which can take many forms, as input to the encryption function.

1.5 Code to test code

Meta-code is actually quite fun to write, for example:
unit tests, autograders, and crypto-cracks.

Trace transpositionTest.py
03-TranspositionCiphers/transpositionTest.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Test
# https://www.nostarch.com/crackingcodes (BSD Licensed)

import random, sys, transpositionEncrypt, transpositionDecrypt

def main() -> None:
    random.seed(42)  # set the random "seed" to a static value
    for i in range(20):  # run 20 tests
        # Generate random messages to test.
        # The message will have a random length:
        message = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" * random.randint(4, 40)
        # Convert the message string to a list to shuffle it.
        message_array = list(message)
        random.shuffle(message_array)
        message = "".join(message_array)  # convert list to string
        print('Test #%s: "%s..."' % (i + 1, message[:50]))
        # Check all possible keys for each message.
        for key in range(1, int(len(message) / 2)):
            encrypted = transpositionEncrypt.encryptMessage(key, message)
            print(encrypted)
            decrypted = transpositionDecrypt.decryptMessage(key, encrypted)
            # If the decryption doesn't match the original message, display
            # an error message and quit.
            if message != decrypted:
                print("Mismatch with key %s and message %s." % (key, message))
                print("Decrypted as: " + decrypted)
                sys.exit()
    print("Transposition cipher test passed.")

# If transpositionTest.py is run (instead of imported as a module) call
# the main() function.
if _name_ == "_main_":
    main()

03-TranspositionCiphers/transpositionTest_cfg.svg
What did we randomize, and parametrically test across?
Message content?
Message length?
What about strange characters?
Any other strange edge cases we should include?
How would you write a unit test for the encryption and decryption functions?
Unit testing on messy steroids: define fuzzing!

1.6 Detecting English

Check out detectEnglish.py and dictionary.txt
03-TranspositionCiphers/detectEnglish.py
03-TranspositionCiphers/dictionary.txt

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Detect English module
"""
https://www.nostarch.com/crackingcodes (BSD Licensed)
To use, type this code:
>>> import detectEnglish
>>> detectEnglish.isEnglish(someString) # returns True or False
(There must be a "dictionary.txt" file in this directory with all English
words in it, one word per line. You can download this from
https://invpy.com/dictionary.txt)
"""

from typing import Dict

UPPERLETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LETTERS_AND_SPACE = UPPERLETTERS + UPPERLETTERS.lower() + " \t\n"

# Why use a dictionary here instead of a list?
# What would have been a better data structure, and less of a hack?
def loadDictionary() -> Dict[str, None]:
    dictionaryFile = open("dictionary.txt")
    # should have been a set...
    englishWords: Dict[str, None] = {}
    for word in dictionaryFile.read().split("\n"):
        englishWords[word] = None
    dictionaryFile.close()
    return englishWords

ENGLISH_WORDS = loadDictionary()

def getEnglishCount(message: str) -> float:
    message = message.upper()
    message = removeNonLetters(message)
    possibleWords = message.split()
    if possibleWords == []:
        return 0.0  # No words at all, so return 0.0.
    matches = 0
    for word in possibleWords:
        if word in ENGLISH_WORDS:
            matches += 1
    return float(matches) / len(possibleWords)

def removeNonLetters(message: str) -> str:
    lettersOnly = []
    for symbol in message:
        if symbol in LETTERS_AND_SPACE:
            lettersOnly.append(symbol)
    return "".join(lettersOnly)

def isEnglish(
    message: str, wordPercentage: int = 20, letterPercentage: int = 85
) -> bool:
    # By default, 20% of the words must exist in the dictionary file, and
    # 85% of all the characters in the message must be letters or spaces
    # (not punctuation or numbers).
    wordsMatch = getEnglishCount(message) * 100 >= wordPercentage
    numLetters = len(removeNonLetters(message))
    messageLettersPercentage = float(numLetters) / len(message) * 100
    lettersMatch = messageLettersPercentage >= letterPercentage
    return wordsMatch and lettersMatch

03-TranspositionCiphers/detectEnglish_cfg.svg
Mention:
Which function do we actually call when we import this?
Are the others potentially independent,
such that we could call them too?
Does any simple newline separated dictionary work?
What kind of encoding(s) does Python3 require in it’s text files?
Are there better ways to pick the best English candidate?
Does this always work?
What if nothing matches?

++++++++++++++++++++
Cahoot-03.3

What if the key was 1-long?

++++++++++++++++++++
Discussion question:
Can you think of any reasons why merely detecting English sufficient to know when you have cracked a ciphertext?
How might you hide a message, despite having the cryptography having been cracked?

1.7 Cracking the transposition cipher

Check out transpositionHacker.py
03-TranspositionCiphers/transpositionHacker.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Hacker
# https://www.nostarch.com/crackingcodes (BSD Licensed)

import detectEnglish, transpositionDecrypt
from typing import Optional

def main() -> None:
    # You might want to copy & paste this text from the source code at
    # https://invpy.com/transpositionHacker.py
    myMessage = """AaKoosoeDe5 b5sn ma reno ora'lhlrrceey e  enlh na  \
indeit n uhoretrm au ieu v er Ne2 gmanw,forwnlbsya apor tE.no euarisfa\
tt  e mealefedhsppmgAnlnoe(c -or)alat r lw o eb  nglom,Ain one dtes il\
hetcdba. t tg eturmudg,tfl1e1 v  nitiaicynhrCsaemie-sp ncgHt nie cetrg\
mnoa yc r,ieaa  toesa- e a0m82e1w shcnth  ekh gaecnpeutaaieetgn iodhso\
 d ro hAe snrsfcegrt NCsLc b17m8aEheideikfr aBercaeu thllnrshicwsg etr\
iebruaisss  d iorr."""
    hackedMessage = hackTransposition(myMessage)
    if hackedMessage == None:
        print("Failed to hack encryption.")
    else:
        print("Copying hacked message to clipboard:")
        print(hackedMessage)

def hackTransposition(message: str) -> Optional[str]:
    print("Hacking...")
    # Python programs can be stopped at any time by pressing Ctrl-C (on
    # Windows) or Ctrl-D (on Mac and Linux)
    print("(Press Ctrl-C or Ctrl-D to quit at any time.)")
    # Brute-force by looping through every possible key.
    for key in range(1, len(message)):
        print("Trying key #%s..." % (key))
        decryptedText = transpositionDecrypt.decryptMessage(key, message)
        if detectEnglish.isEnglish(decryptedText):
            # Ask user if this is the correct decryption.
            print()
            print("Possible encryption hack:")
            print("Key %s: %s" % (key, decryptedText[:100]))
            print()
            print("Enter D if done, anything else to continue hacking:")
            response = input("> ")
            if response.strip().upper().startswith("D"):
                return decryptedText
    return None

if _name_ == "_main_":
    main()
03-TranspositionCiphers/transpositionHacker_cfg.svg

1.8 Dealing with files

On your own time, it’d be worth tracing these, for the experience of dealing with larger files to read in and out when encrypting/decrypting/cracking:
03-TranspositionCiphers/transpositionFileCipher.py
03-TranspositionCiphers/transpositionFileHacker.py
03-TranspositionCiphers/frankenstein.txt
03-TranspositionCiphers/frankenstein.encrypted.txt
03-TranspositionCiphers/frankenstein.decrypted.txt

Next: 04-AffineCipher.html