Previous: 02-IntroCryptoCaesar.html
Q: How can you tell the difference between an encrypted joke
and a random string of characters?
A: You can’t. They’re indistinguishable without the key.
document.querySelector('video').playbackRate = 1.2
http://inventwithpython.com/cracking/ Chapters 7-12
https://en.wikipedia.org/wiki/Transposition_cipher
A transposition cipher is a method of encryption,
by which the positions held by units of plaintext are shifted according
to a regular system,
so that the ciphertext constitutes a permutation of the plaintext.
Plaintext units being shifted are commonly characters or groups of
characters.
The order of the units is changed (the plaintext is reordered).
Some variations of this algorithm can actually be quite strong and
secure.
It comes in many variations:
Rail Fence cipher (zig-zag)
Route cipher
Columnar transposition (today)
Double transposition
Myszkowski transposition
Disrupted transposition
Scytale (this wood and leather thing below)
Grilles (masking, like the matrix below)
In “the old days” a spy could embed a message in a newspaper,
and apply a mask to it!
Important note:
This, like the other basic algorithms we are covering,
might seem like archaic toys.
However, the principles and direct matrix operations resemble those
employed in AES,
and other strong, valid, modern encryption algorithms,
which employ substitution and transformation in similar ways to this
algorithm!
Further, variations on this masking paradigm could be
exceptionally strong,
because the set of masks in the human world is so huge.
A key is any secret used as input to an encryption algorithm,
with the plaintext message, to produce the ciphertext.
In this case, a key of 8 defines a row length of 8:
Write in the message, with spaces, wrapping into a matrix that is K wide
(8 here):
Common sense is not so common.
… and ignore the empty boxes at the end.
To formalize for computation, number the positions,
starting at 0.
Notice a theme in formalizing?
To encrypt, read column-wise, rather than row-wise:
Cenoonommstmme oo snnio. s s c
How to structure this transformation with a loop?
Which loop, out or inner, correspond to each of the pictures below?
Inner or outer loop?
Inner our outer loop?
Could this have been coded differently?
Trace transpositionEncrypt.py
03-TranspositionCiphers/transpositionEncrypt.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Transposition Cipher Encryption
# https://www.nostarch.com/crackingcodes (BSD Licensed)
def main() -> None:
myMessage = "Common sense is not so common."
myKey = 8
ciphertext = encryptMessage(myKey, myMessage)
# Print the encrypted string in ciphertext to the screen,
# with a | ("pipe" character) after it,
# in case there are spaces at # the end of the encrypted message.
print(ciphertext + "|")
def encryptMessage(key: int, message: str) -> str:
# Each string in ciphertext represents a column in the grid.
ciphertext = [""] * key
# Loop through each column in ciphertext.
for column in range(key):
currentIndex = column
# Keep looping until currentIndex goes past the message length.
while currentIndex < len(message):
# Place the character at currentIndex in message at the
# end of the current column in the ciphertext list.
ciphertext[column] += message[currentIndex]
# move currentIndex over
currentIndex += key
# Convert the ciphertext list into a single string value and return it.
return "".join(ciphertext)
# If transpositionEncrypt.py is run (instead of imported as a module) call
# the main() function.
if _name_ == "_main_":
main()
++++++++++++++++++++
Cahoot-03.1
To find the number of columns for a particular message,
divide the length of the ciphertext message by the key,
and if the result isn’t already a whole number,
then round up to the nearest whole number.
The length of the ciphertext is 30 characters
(the same as the plaintext) and the key is 8,
so 30 divided by 8 is 3.75, rounded up to 4.
How to loop through the below structure?
Does the operation in code resemble encryption?
Is it identical?
Trace transpositionDecrypt.py
03-TranspositionCiphers/transpositionDecrypt.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Transposition Cipher Decryption
# https://www.nostarch.com/crackingcodes (BSD Licensed)
import math
def main() -> None:
myMessage = "Cenoonommstmme oo snnio. s s c"
myKey = 8
plaintext = decryptMessage(myKey, myMessage)
# Print with a | ("pipe" character) after it in case
# there are spaces at the end of the decrypted message.
print(plaintext + "|")
def decryptMessage(key: int, message: str) -> str:
# The transposition decrypt function will simulate the "columns" and
# "rows" of the grid that the plaintext is written on by using a list
# of strings. First, we need to calculate a few values.
# The number of "columns" in our transposition grid:
numOfColumns = int(math.ceil(len(message) / float(key)))
# The number of "rows" in our grid will need:
numOfRows = key
# The number of "shaded boxes" in the last "column" of the grid:
numOfShadedBoxes = (numOfColumns * numOfRows) - len(message)
# Each string in plaintext represents a column in the grid.
plaintext = [""] * numOfColumns
# The column and row variables point to where in the grid the next
# character in the encrypted message will go.
column = 0
row = 0
for symbol in message:
plaintext[column] += symbol
column += 1 # Point to next column.
# If there are no more columns OR we're at a shaded box, go back to
# the first column and the next row:
if (column == numOfColumns) or (
column == numOfColumns - 1 and row >= numOfRows - numOfShadedBoxes
):
column = 0
row += 1
return "".join(plaintext)
# If transpositionDecrypt.py is run (instead of imported as a module) call
# the main() function.
if _name_ == "_main_":
main()
++++++++++++++++++++
Cahoot-03.2
X: 1, 2, 3… 26 (the symbol set)
Y: 1, 2, 3… 26 (the symbol set)
K: ~ message length / 2
E: ?
D: ?
Recall:
In general, a key is just a secret, which can take many forms, as input
to the encryption function.
Meta-code is actually quite fun to write, for example:
unit tests, autograders, and crypto-cracks.
Trace transpositionTest.py
03-TranspositionCiphers/transpositionTest.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Transposition Cipher Test
# https://www.nostarch.com/crackingcodes (BSD Licensed)
import random, sys, transpositionEncrypt, transpositionDecrypt
def main() -> None:
random.seed(42) # set the random "seed" to a static value
for i in range(20): # run 20 tests
# Generate random messages to test.
# The message will have a random length:
message = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" * random.randint(4, 40)
# Convert the message string to a list to shuffle it.
message_array = list(message)
random.shuffle(message_array)
message = "".join(message_array) # convert list to string
print('Test #%s: "%s..."' % (i + 1, message[:50]))
# Check all possible keys for each message.
for key in range(1, int(len(message) / 2)):
encrypted = transpositionEncrypt.encryptMessage(key, message)
print(encrypted)
decrypted = transpositionDecrypt.decryptMessage(key, encrypted)
# If the decryption doesn't match the original message, display
# an error message and quit.
if message != decrypted:
print("Mismatch with key %s and message %s." % (key, message))
print("Decrypted as: " + decrypted)
sys.exit()
print("Transposition cipher test passed.")
# If transpositionTest.py is run (instead of imported as a module) call
# the main() function.
if _name_ == "_main_":
main()
What did we randomize, and parametrically test across?
Message content?
Message length?
What about strange characters?
Any other strange edge cases we should include?
How would you write a unit test for the encryption and decryption
functions?
Unit testing on messy steroids: define fuzzing!
Check out detectEnglish.py and dictionary.txt
03-TranspositionCiphers/detectEnglish.py
03-TranspositionCiphers/dictionary.txt
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Detect English module
"""
https://www.nostarch.com/crackingcodes (BSD Licensed)
To use, type this code:
>>> import detectEnglish
>>> detectEnglish.isEnglish(someString) # returns True or False
(There must be a "dictionary.txt" file in this directory with all English
words in it, one word per line. You can download this from
https://invpy.com/dictionary.txt)
"""
from typing import Dict
UPPERLETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LETTERS_AND_SPACE = UPPERLETTERS + UPPERLETTERS.lower() + " \t\n"
# Why use a dictionary here instead of a list?
# What would have been a better data structure, and less of a hack?
def loadDictionary() -> Dict[str, None]:
dictionaryFile = open("dictionary.txt")
# should have been a set...
englishWords: Dict[str, None] = {}
for word in dictionaryFile.read().split("\n"):
englishWords[word] = None
dictionaryFile.close()
return englishWords
ENGLISH_WORDS = loadDictionary()
def getEnglishCount(message: str) -> float:
message = message.upper()
message = removeNonLetters(message)
possibleWords = message.split()
if possibleWords == []:
return 0.0 # No words at all, so return 0.0.
matches = 0
for word in possibleWords:
if word in ENGLISH_WORDS:
matches += 1
return float(matches) / len(possibleWords)
def removeNonLetters(message: str) -> str:
lettersOnly = []
for symbol in message:
if symbol in LETTERS_AND_SPACE:
lettersOnly.append(symbol)
return "".join(lettersOnly)
def isEnglish(
message: str, wordPercentage: int = 20, letterPercentage: int = 85
) -> bool:
# By default, 20% of the words must exist in the dictionary file, and
# 85% of all the characters in the message must be letters or spaces
# (not punctuation or numbers).
wordsMatch = getEnglishCount(message) * 100 >= wordPercentage
numLetters = len(removeNonLetters(message))
messageLettersPercentage = float(numLetters) / len(message) * 100
lettersMatch = messageLettersPercentage >= letterPercentage
return wordsMatch and lettersMatch
Mention:
Which function do we actually call when we import this?
Are the others potentially independent,
such that we could call them too?
Does any simple newline separated dictionary work?
What kind of encoding(s) does Python3 require in it’s text files?
Are there better ways to pick the best English candidate?
Does this always work?
What if nothing matches?
++++++++++++++++++++
Cahoot-03.3
What if the key was 1-long?
++++++++++++++++++++
Discussion question:
Can you think of any reasons why merely detecting English sufficient to
know when you have cracked a ciphertext?
How might you hide a message, despite having the cryptography having
been cracked?
Check out transpositionHacker.py
03-TranspositionCiphers/transpositionHacker.py
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# Transposition Cipher Hacker
# https://www.nostarch.com/crackingcodes (BSD Licensed)
import detectEnglish, transpositionDecrypt
from typing import Optional
def main() -> None:
# You might want to copy & paste this text from the source code at
# https://invpy.com/transpositionHacker.py
myMessage = """AaKoosoeDe5 b5sn ma reno ora'lhlrrceey e enlh na \
indeit n uhoretrm au ieu v er Ne2 gmanw,forwnlbsya apor tE.no euarisfa\
tt e mealefedhsppmgAnlnoe(c -or)alat r lw o eb nglom,Ain one dtes il\
hetcdba. t tg eturmudg,tfl1e1 v nitiaicynhrCsaemie-sp ncgHt nie cetrg\
mnoa yc r,ieaa toesa- e a0m82e1w shcnth ekh gaecnpeutaaieetgn iodhso\
d ro hAe snrsfcegrt NCsLc b17m8aEheideikfr aBercaeu thllnrshicwsg etr\
iebruaisss d iorr."""
hackedMessage = hackTransposition(myMessage)
if hackedMessage == None:
print("Failed to hack encryption.")
else:
print("Copying hacked message to clipboard:")
print(hackedMessage)
def hackTransposition(message: str) -> Optional[str]:
print("Hacking...")
# Python programs can be stopped at any time by pressing Ctrl-C (on
# Windows) or Ctrl-D (on Mac and Linux)
print("(Press Ctrl-C or Ctrl-D to quit at any time.)")
# Brute-force by looping through every possible key.
for key in range(1, len(message)):
print("Trying key #%s..." % (key))
decryptedText = transpositionDecrypt.decryptMessage(key, message)
if detectEnglish.isEnglish(decryptedText):
# Ask user if this is the correct decryption.
print()
print("Possible encryption hack:")
print("Key %s: %s" % (key, decryptedText[:100]))
print()
print("Enter D if done, anything else to continue hacking:")
response = input("> ")
if response.strip().upper().startswith("D"):
return decryptedText
return None
if _name_ == "_main_":
main()
On your own time, it’d be worth tracing these, for the experience of
dealing with larger files to read in and out when
encrypting/decrypting/cracking:
03-TranspositionCiphers/transpositionFileCipher.py
03-TranspositionCiphers/transpositionFileHacker.py
03-TranspositionCiphers/frankenstein.txt
03-TranspositionCiphers/frankenstein.encrypted.txt
03-TranspositionCiphers/frankenstein.decrypted.txt
Next: 04-AffineCipher.html