1 TranspositionCiphers


Q: How can you tell the difference between an encrypted joke and a random string of characters?
A: You can’t. They’re indistinguishable without the key.

1.1 Screencasts

1.2 Reading

http://inventwithpython.com/cracking/ Chapters 7-12

1.3 Transposition cipher basics

https://en.wikipedia.org/wiki/Transposition_cipher
A transposition cipher is a method of encryption,
by which the positions held by units of plaintext are shifted according to a regular system,
so that the ciphertext constitutes a permutation of the plaintext.
Plaintext units being shifted are commonly characters or groups of characters.
The order of the units is changed (the plaintext is reordered).
Some variations of this algorithm can actually be quite strong and secure.

It comes in many variations:
Rail Fence cipher (zig-zag)
Route cipher
Columnar transposition (today)
Double transposition
Myszkowski transposition
Disrupted transposition
Scytale (this wood and leather thing below)
TranspositionCiphers/398px-Skytale.png

Grilles (masking, like the matrix below)
TranspositionCiphers/Tangiers1.pngTranspositionCiphers/Tangiers2.png
In “the old days” a spy could embed a message in a newspaper,
and apply a mask to it!

Important note:
This, like the other basic algorithms we are covering,
might seem like archaic toys.
However, the principles and direct matrix operations resemble those employed in AES,
and other strong, valid, modern encryption algorithms,
which employ substitution and transformation in similar ways to this algorithm!
Further, variations on this masking paradigm could be exceptionally strong,
because the set of masks in the human world is so huge.

1.3.1 Columnar encryption

A key is any secret used as input to an encryption algorithm,
with the plaintext message, to produce the ciphertext.

In this case, a key of 8 defines a row length of 8:
TranspositionCiphers/00066.jpeg
Write in the message, with spaces, wrapping into a matrix that is K wide (8 here):
TranspositionCiphers/00067.jpeg
Common sense is not so common.
TranspositionCiphers/00068.jpeg
… and ignore the empty boxes at the end.

To formalize for computation, number the positions, starting at 0.
Notice a theme in formalizing?
TranspositionCiphers/00069.jpeg
To encrypt, read column-wise, rather than row-wise:
TranspositionCiphers/00070.jpeg
Cenoonommstmme oo snnio. s s c
TranspositionCiphers/00071.jpeg
How to structure this transformation with a loop?
Which loop, out or inner, correspond to each of the pictures below?

Inner or outer loop?
TranspositionCiphers/00073.jpeg
Inner our outer loop?
TranspositionCiphers/00074.jpeg
Could this have been coded differently?

1.3.1.1 Transposition encryption code

Trace transpositionEncrypt.py
TranspositionCiphers/transpositionEncrypt.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Encryption
# https://www.nostarch.com/crackingcodes (BSD Licensed)


def main() -> None:
    myMessage = "Common sense is not so common."
    myKey = 8
    ciphertext = encryptMessage(myKey, myMessage)
    # Print the encrypted string in ciphertext to the screen,
    # with a | ("pipe" character) after it,
    # in case there are spaces at # the end of the encrypted message.
    print(ciphertext)


def encryptMessage(key: int, message: str) -> str:
    # Each string in ciphertext represents a column in the grid.
    ciphertext = [""] * key
    # Loop through each column in ciphertext.
    for column in range(key):
        currentIndex = column
        # Keep looping until currentIndex goes past the message length.
        while currentIndex < len(message):
            # Place the character at currentIndex in message at the
            # end of the current column in the ciphertext list.
            ciphertext[column] += message[currentIndex]
            # move currentIndex over
            currentIndex += key
    # Convert the ciphertext list into a single string value and return it.
    return "".join(ciphertext)


# If transpositionEncrypt.py is run (instead of imported as a module) call
# the main() function.
if __name__ == "__main__":
    main()
TranspositionCiphers/transpositionEncrypt_cfg.svg

A version in Haskell:
TranspositionCiphers/TranspositionEncrypt.hs

#!/usr/bin/runghc

module TranspositionEncrypt (encryptString, messagePad, numOfRows) where

import Data.List
import Data.List.Split

-- | Demonstrate use of chunksOf.
-- >>> chunksOf 8 "Common sense is not so common."
-- ["Common s","ense is ","not so c","ommon."]

-- |
-- >>> numOfRows 8 "Common sense is not so common."
-- 4
numOfRows :: Int -> String -> Int
numOfRows key message = ceiling (genericLength message / fromIntegral key)

-- |
-- >>> lengthPad 8 "Common sense is not so common."
-- 2
lengthPad :: Int -> String -> Int
lengthPad key message = (numOfRows key message * key) - length message

-- |
-- >>> messagePad 8 "Common sense is not so common."
-- "Common sense is not so common.  "
messagePad :: Int -> String -> String
messagePad key message =
  let padding = replicate (lengthPad key message) ' '
   in message ++ padding

-- | Enrcyption function.
-- >>> encryptString 8 "Common sense is not so common."
-- "Cenoonommstmme oo snnio. s  s c "
encryptString :: Int -> String -> String
encryptString key = intercalate "" . transpose . chunksOf key . messagePad key

main :: IO ()
main = do
  let message = "Common sense is not so common."
  let key = 8
  putStrLn "Your encrypted message is:"
  putStrLn (encryptString key message)

+++++++++++++++++++++++++++++++++++++
Cahoot_TranspositionCipher-Dimensions

1.3.2 Columnar decryption

To find the number of columns for a particular message,
divide the length of the ciphertext message by the key,
and if the result isn’t already a whole number,
then round up to the nearest whole number.
The length of the ciphertext is 30 characters
(the same as the plaintext) and the key is 8,
so 30 divided by 8 is 3.75, rounded up to 4.
TranspositionCiphers/00000.jpeg
How to loop through the below structure?
Does the operation in code resemble encryption?
Is it identical?
TranspositionCiphers/00075.jpeg

1.3.2.1 Decryption code

Trace transpositionDecrypt.py
TranspositionCiphers/transpositionDecrypt.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Decryption
# https://www.nostarch.com/crackingcodes (BSD Licensed)

import math


def main() -> None:
    myMessage = "Cenoonommstmme oo snnio. s s c"
    myKey = 8
    plaintext = decryptMessage(myKey, myMessage)
    # Print with a | ("pipe" character) after it in case
    # there are spaces at the end of the decrypted message.
    print(plaintext)


def decryptMessage(key: int, message: str) -> str:
    # The transposition decrypt function will simulate the "columns" and
    # "rows" of the grid that the plaintext is written on by using a list
    # of strings. First, we need to calculate a few values.
    # The number of "columns" in our transposition grid:
    numOfColumns = int(math.ceil(len(message) / float(key)))
    # The number of "rows" in our grid will need:
    numOfRows = key
    # The number of "shaded boxes" in the last "column" of the grid:
    numOfShadedBoxes = (numOfColumns * numOfRows) - len(message)
    # Each string in plaintext represents a column in the grid.
    plaintext = [""] * numOfColumns
    # The column and row variables point to where in the grid the next
    # character in the encrypted message will go.
    column = 0
    row = 0
    for symbol in message:
        plaintext[column] += symbol
        column += 1  # Point to next column.
        # If there are no more columns OR we're at a shaded box, go back to
        # the first column and the next row:
        if (column == numOfColumns) or (
            column == numOfColumns - 1 and row >= numOfRows - numOfShadedBoxes
        ):
            column = 0
            row += 1
    return "".join(plaintext)


# If transpositionDecrypt.py is run (instead of imported as a module) call
# the main() function.
if __name__ == "__main__":
    main()
TranspositionCiphers/transpositionDecrypt_cfg.svg

A version in Haskell:
TranspositionCiphers/TranspositionDecrypt.hs

#!/usr/bin/runghc

module TranspositionDecrypt (decryptString, numOfColumns) where

import Data.List
import Data.List.Split
import TranspositionEncrypt

-- |
-- >>> numOfColumns 8 "Cenoonommstmme oo snnio. s  s c "
-- 4
numOfColumns = numOfRows

-- | Actual dercyption function.
-- >>> decryptString 4 "Cenoonommstmme oo snnio. s  s c "
-- "Common sense is not so common.  "
decryptString :: Int -> String -> String
decryptString numCols = intercalate "" . transpose . chunksOf numCols . messagePad numCols

main :: IO ()
main = do
  let message = "Common sense is not so common."
  let key = 8
  putStrLn "Your encrypted message is:"
  putStrLn (encryptString key message)
  let numCols = numOfColumns key message
  putStrLn "Your decrypted message is:"
  putStrLn (decryptString numCols . encryptString key $ message)

1.4 Formal specs

X: 1, 2, 3… 26 (the symbol set)
Y: 1, 2, 3… 26 (the symbol set)
K: ~ message length / 2
E: ?
D: ?

Recall:
In general, a key is just a secret, which can take many forms, as input to the encryption function.

1.5 Code to test code

Meta-code is actually quite fun to write, for example:
unit tests, autograders, and crypto-cracks.

Trace transpositionTest.py
TranspositionCiphers/transpositionTest.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Test
# https://www.nostarch.com/crackingcodes (BSD Licensed)

import random, sys, transpositionEncrypt, transpositionDecrypt


def main() -> None:
    random.seed(42)  # set the random "seed" to a static value
    for i in range(20):  # run 20 tests
        # Generate random messages to test.
        # The message will have a random length:
        message = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" * random.randint(4, 40)
        # Convert the message string to a list to shuffle it.
        message_array = list(message)
        random.shuffle(message_array)
        message = "".join(message_array)  # convert list to string
        print('Test #%s: "%s..."' % (i + 1, message[:50]))
        # Check all possible keys for each message.
        for key in range(1, int(len(message) / 2)):
            encrypted = transpositionEncrypt.encryptMessage(key, message)
            print(encrypted)
            decrypted = transpositionDecrypt.decryptMessage(key, encrypted)
            # If the decryption doesn't match the original message, display
            # an error message and quit.
            if message != decrypted and message != encrypted:
                print("Mismatch with key %s and message %s." % (key, message))
                print("Decrypted as: " + decrypted)
                sys.exit()
    print("Transposition cipher test passed.")


# If transpositionTest.py is run (instead of imported as a module) call
# the main() function.
if __name__ == "__main__":
    main()
TranspositionCiphers/transpositionTest_cfg.svg

Here is a version of the same in Haskell:
TranspositionCiphers/TranspositionTest.hs

#!/usr/bin/runghc

import Test.QuickCheck
import TranspositionDecrypt
import TranspositionEncrypt

-- |
-- >>> quickCheck test
-- +++ OK, passed 100 tests.
test :: String -> Gen Bool
test message = do
  key <- choose (1, length message) :: Gen Int
  return (messagePad key message == (decryptString (numOfColumns key message) . encryptString key) message)

main :: IO ()
main = quickCheck test

What did we randomize, and parametrically test across?
Message content?
Message length?
What about strange characters?
Any other strange edge cases we should include?
How would you write a unit test for the encryption and decryption functions?
Unit testing on messy steroids: define fuzzing!

1.6 Detecting English

Check out dictionary.txt and detectEnglish.py.
TranspositionCiphers/dictionary.txt
TranspositionCiphers/detectEnglish.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Detect English module
"""
https://www.nostarch.com/crackingcodes (BSD Licensed)
To use, type this code:
>>> import detectEnglish
>>> detectEnglish.isEnglish(someString) # returns True or False
(There must be a "dictionary.txt" file in this directory with all English
words in it, one word per line. You can download this from
https://invpy.com/dictionary.txt)
"""

from typing import Dict

UPPERLETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LETTERS_AND_SPACE = UPPERLETTERS + UPPERLETTERS.lower() + " \t\n"


# Why use a dictionary here instead of a list?
# What would have been a better data structure, and less of a hack?
def loadDictionary() -> Dict[str, None]:
    dictionaryFile = open("dictionary.txt")
    # should have been a set...
    englishWords: Dict[str, None] = {}
    for word in dictionaryFile.read().split("\n"):
        englishWords[word] = None
    dictionaryFile.close()
    return englishWords


ENGLISH_WORDS = loadDictionary()


def getEnglishCount(message: str) -> float:
    message = message.upper()
    message = removeNonLetters(message)
    possibleWords = message.split()
    if possibleWords == []:
        return 0.0  # No words at all, so return 0.0.
    matches = 0
    for word in possibleWords:
        if word in ENGLISH_WORDS:
            matches += 1
    return float(matches) / len(possibleWords)


def removeNonLetters(message: str) -> str:
    lettersOnly = []
    for symbol in message:
        if symbol in LETTERS_AND_SPACE:
            lettersOnly.append(symbol)
    return "".join(lettersOnly)


def isEnglish(
    message: str, wordPercentage: int = 20, letterPercentage: int = 85
) -> bool:
    # By default, 20% of the words must exist in the dictionary file, and
    # 85% of all the characters in the message must be letters or spaces
    # (not punctuation or numbers).
    wordsMatch = getEnglishCount(message) * 100 >= wordPercentage
    numLetters = len(removeNonLetters(message))
    messageLettersPercentage = float(numLetters) / len(message) * 100
    lettersMatch = messageLettersPercentage >= letterPercentage
    return wordsMatch and lettersMatch
TranspositionCiphers/detectEnglish_cfg.svg

Mention:
Which function do we actually call when we import this?
Are the others potentially independent,
such that we could call them too?
Does any simple newline separated dictionary work?
What kind of encoding(s) does Python3 require in it’s text files?
Are there better ways to pick the best English candidate?
Does this always work?
What if nothing matches?

TranspositionCiphers/DetectEnglish.hs

module DetectEnglish where

import Data.Char
import Data.HashSet (HashSet)
import qualified Data.HashSet as HashSet
import Data.List
import System.IO

importDictionary :: IO (HashSet String)
importDictionary = do
  wholeFile <- readFile "dictionary.txt"
  return (HashSet.fromList (words wholeFile))

-- |
-- >>> isEnglish (HashSet.fromList ["HI", "THERE"]) 0.5 "hi there English"
-- True

-- |
-- >>> isEnglish (HashSet.fromList ["HI", "THERE"]) 0.2 "hi there English"
-- True
isEnglish :: HashSet String -> Float -> String -> Bool
isEnglish hashDict threshold message =
  let wordsList = words (map toUpper message)
      numEnglishWords = genericLength . filter id $ map (`HashSet.member` hashDict) wordsList
      numTotalWords = genericLength wordsList
   in threshold < numEnglishWords / numTotalWords

+++++++++++++++++++++++++++++++++
Cahoot_TranspositionCipher-Length

What if the key was 1-long?

++++++++++++++++++++
Discussion question:
Can you think of any reasons why merely detecting English sufficient to know when you have cracked a ciphertext?
How might you hide a message, despite having the cryptography having been cracked?

1.7 Cracking the transposition cipher

Check out transpositionHacker.py
TranspositionCiphers/transpositionHacker.py

#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Transposition Cipher Hacker
# https://www.nostarch.com/crackingcodes (BSD Licensed)

import detectEnglish, transpositionDecrypt
from typing import Optional


def main() -> None:
    # You might want to copy & paste this text from the source code at
    # https://invpy.com/transpositionHacker.py
    myMessage = """AaKoosoeDe5 b5sn ma reno ora'lhlrrceey e  enlh na  \
indeit n uhoretrm au ieu v er Ne2 gmanw,forwnlbsya apor tE.no euarisfa\
tt  e mealefedhsppmgAnlnoe(c -or)alat r lw o eb  nglom,Ain one dtes il\
hetcdba. t tg eturmudg,tfl1e1 v  nitiaicynhrCsaemie-sp ncgHt nie cetrg\
mnoa yc r,ieaa  toesa- e a0m82e1w shcnth  ekh gaecnpeutaaieetgn iodhso\
 d ro hAe snrsfcegrt NCsLc b17m8aEheideikfr aBercaeu thllnrshicwsg etr\
iebruaisss  d iorr."""
    hackedMessage = hackTransposition(myMessage)
    if hackedMessage is None:
        print("Failed to hack encryption.")
    else:
        print("Copying hacked message to clipboard:")
        print(hackedMessage)


def hackTransposition(message: str) -> Optional[str]:
    print("Hacking...")
    # Python programs can be stopped at any time by pressing Ctrl-C (on
    # Windows) or Ctrl-D (on Mac and Linux)
    print("(Press Ctrl-C or Ctrl-D to quit at any time.)")
    # Brute-force by looping through every possible key.
    for key in range(1, len(message)):
        print("Trying key #%s..." % (key))
        decryptedText = transpositionDecrypt.decryptMessage(key, message)
        if detectEnglish.isEnglish(decryptedText):
            # Ask user if this is the correct decryption.
            print()
            print("Possible encryption hack:")
            print("Key %s: %s" % (key, decryptedText[:100]))
            print()
            print("Enter D if done, anything else to continue hacking:")
            response = input("> ")
            if response.strip().upper().startswith("D"):
                return decryptedText
    return None


if __name__ == "__main__":
    main()
TranspositionCiphers/transpositionHacker_cfg.svg

A version in Haskell:
TranspositionCiphers/TranspositionHacker.hs

#!/usr/bin/runghc

import Data.HashSet (HashSet)
import Data.List
import Data.Maybe
import Debug.Trace (trace)
import DetectEnglish
import TranspositionDecrypt
import TranspositionEncrypt

-- This is a monolithing recursive way of solving the problem:

-- |
-- >>> hackTransposition ["HI", "THERE"] 0.2 (encryptString 6 "hi there transposition")
-- "hi there transposition  "
hackTransposition :: HashSet String -> Float -> String -> (Int, String)
hackTransposition = hackTranspositionRec 1

-- |
-- >>> hackTranspositionRec 1 ["HI", "THERE"] 0.2 (encryptString 6 "hi there transposition")
-- "hi there transposition  "
hackTranspositionRec :: Int -> HashSet String -> Float -> String -> (Int, String)
hackTranspositionRec key wordList threshold message =
  let decrypted = decryptString (numOfColumns key message) message
      english = isEnglish wordList threshold decrypted
   in if english
        then (key, decrypted)
        else trace (show (key, decrypted)) $ hackTranspositionRec (key + 1) wordList threshold message

-- The above design does not have a "not-found" condition.
-- An alternative design is below:
-- use `find` with decryptSuccess predicate for single message, across [1..]

-- |
-- >>> decryptSuccess 6 ["HI", "THERE"] 0.2 (encryptString 6 "hi there transposition")
-- Just (6,"hi there transposition  ")

-- |
-- >>> decryptSuccess 1 ["HI", "THERE"] 0.2 (encryptString 6 "hi there transposition")
-- Nothing
decryptSuccess :: Int -> HashSet String -> Float -> String -> Bool
decryptSuccess key wordList threshold message =
  let decrypted = decryptString (numOfColumns key message) message
   in isEnglish wordList threshold decrypted

main :: IO ()
main = do
  hashDict <- importDictionary
  let encryptedMessage = encryptString 6 "This is a long message that is in English, and it will be cracked."
  print (hackTransposition hashDict 0.2 encryptedMessage)
  putStrLn "Another method, key was: "
  case find (\x -> decryptSuccess x hashDict 0.2 encryptedMessage) [1 ..] of
    Just crackedKey -> print (crackedKey, decryptString (numOfColumns crackedKey encryptedMessage) encryptedMessage)
    Nothing -> putStrLn "No key found"

1.8 Dealing with files

On your own time, it’d be worth tracing these, for the experience of dealing with larger files to read in and out when encrypting/decrypting/cracking:
TranspositionCiphers/transpositionFileCipher.py
TranspositionCiphers/transpositionFileHacker.py
TranspositionCiphers/frankenstein.txt
TranspositionCiphers/frankenstein.encrypted.txt
TranspositionCiphers/frankenstein.decrypted.txt