« Back to archives :: *06 Oct 2013* :: Categories: **useless**, **encryption**

Another useless encryption scheme devised by yours truly. While my last one (pi
code) was primarily a substitution cipher, this one transcends
all standard classifications; it's almost like a transposition cipher, but not
*really*. The main idea here is the use of certain numbers in a particular
Sudoku grid. The strength (if any) in this method lies in its unexpected nature;
it certainly takes quite a leap of the imagination to correctly deduce the
method from the ciphertext. Of course, once the method has been discovered,
deciphering merely involves solving a Sudoku grid and then figuring out the
substitution cipher used, meaning that the key is easy to determine and so this
method kind of just looks at Kerckhoffs's Principle and then keeps walking. But
that's okay, that's why this is filed under **useless**.

### The setup

The first step is to find a Sudoku puzzle. I happened to have a Sudoku game installed, so I just ran that and started a random puzzle, which looked like this:

5 | 1 | 7 | 8 | |||||

3 | 1 | 9 | 5 | 4 | ||||

3 | 1 | 5 | ||||||

2 | 8 | |||||||

4 | 7 | 5 | 9 | |||||

9 | 4 | |||||||

6 | 5 | 2 | ||||||

2 | 6 | 5 | 4 | 8 | ||||

4 | 3 | 7 | 2 |

The (unique) solution to the above puzzle looks like this:

5 | 2 | 4 | 1 | 7 | 9 | 3 | 8 | 6 |

3 | 1 | 9 | 6 | 5 | 8 | 4 | 2 | 7 |

8 | 6 | 7 | 3 | 2 | 4 | 9 | 1 | 5 |

2 | 9 | 5 | 4 | 8 | 3 | 6 | 7 | 1 |

4 | 8 | 6 | 7 | 1 | 5 | 2 | 3 | 9 |

1 | 7 | 3 | 2 | 9 | 6 | 8 | 5 | 4 |

6 | 5 | 1 | 8 | 4 | 2 | 7 | 9 | 3 |

7 | 3 | 2 | 9 | 6 | 1 | 5 | 4 | 8 |

9 | 4 | 8 | 5 | 3 | 7 | 1 | 6 | 2 |

Clearly, each solved grid can be represented by a sequence of 81 numbers. Furthermore, due to the uniqueness of the solution, the initial grid and the solved grid are equivalent in the sense that the initial grid uniquely determines the solved grid. So we represent the initial grid as follows (with 0s standing for spaces):

`500170080319050400000300015200080000400705009000090004650002000002060548040037002`

This results in the following solution grid:

`524179386319658427867324915295483671486715239173296854651842793732961548948537162`

Now, depending on the length of the plaintext, we can choose somewhere between one and five numbers to use as the "holes" in the grid. For example, if we had a plaintext of 45 characters or fewer, we could choose the numbers 1, 2, 5, 6 and 9. Removing those numbers from the grid would leave us with this:

4 | 7 | 3 | 8 | |||||

3 | 8 | 4 | 7 | |||||

8 | 7 | 3 | 4 | |||||

4 | 8 | 3 | 7 | 1 | ||||

4 | 8 | 7 | 3 | |||||

7 | 3 | 8 | 4 | |||||

8 | 4 | 7 | 3 | |||||

7 | 3 | 6 | 4 | 8 | ||||

4 | 8 | 3 | 7 |

This has an entirely different arrangement of empty spaces than the initial grid. We can represent it using the following string:

`50017008031905040000030001520008000040070500900009000465000200000206054804003700212569`

We can then use the empty spaces to write our message, and add characters in the rest of the spaces to try and flatten frequencies. A simple substitution cipher should be used as well to prevent the ciphertext from looking too much like a transposition cipher. The alphabetic characters can then be interspersed with the numerals above, with the resulting string separated into chunks of 32 characters (with extra alphabetic characters appended as necessary, since only 81 are needed) so as to superficially resemble MD5 (because that's just fun).

Optimally, the solutions to the possible Sudoku grids being used should be stored in a dictionary somewhere, so that each grid only has to be solved once. The point of this encryption scheme is to be diabolically tricky but solvable, although perhaps only if you figure out the Sudoku aspect and know a crib or two.

### Encoding and decoding a sequence

Let's say you want to encode:

`admiral yamamoto to visit solomon islands eight am`

First, remove spaces:

`admiralyamamototovisitsolomonislandseightam`

This is 43 characters, so let's aim for one grid with 5 numbers. This means there will be 2 extra invalid characters and 36 nonsense characters. Since some spaces will be fairly large, it's a good idea to use a simple substitution cipher as well (or even a less simple one like VigenĂ¨re for additional fiendishness). In this case, we'll use the reverse alphabet cipher:

`abcdefghijklm`

`zyxwvutsrqpon`

So the encrypted plaintext becomes

`admiralyamamototovisitsolomonislandseightam`

`zwnrizobznznlglglerhrghlolnlmrhozmwhvrtsgzn`

Now put it in the grid:

z | w | 4 | n | 7 | r | 3 | 8 | i |

3 | z | o | b | z | 8 | 4 | n | 7 |

8 | z | 7 | 3 | n | 4 | l | g | l |

g | l | e | 4 | 8 | 3 | r | 7 | h |

4 | 8 | r | 7 | g | h | l | 3 | o |

l | 7 | 3 | n | l | m | 8 | r | 4 |

h | o | z | 8 | 4 | m | 7 | w | 3 |

7 | 3 | h | v | r | t | s | 4 | 8 |

g | 4 | 8 | z | 3 | 7 | n | c | a |

(The last two characters are the extra invalid characters used to fill up the empty spaces.)

Now replace the remaining numbers with random letters, chosen so as to flatten the letter frequencies as much as possible:

z | w | e | n | a | r | o | t | i |

q | z | o | b | z | a | e | n | s |

f | z | t | r | n | m | l | g | l |

g | l | e | e | a | s | r | h | h |

t | r | r | a | g | h | l | m | o |

l | c | c | n | l | m | z | r | v |

h | o | z | l | p | m | p | w | p |

a | e | h | v | r | t | s | b | d |

g | g | f | z | e | q | n | c | a |

Written as one line, the above grid would look like:

`zwenarotiqzobzaensfztrnmlglgleeasrhhtrraghlmolccnlmzrvhozlpmpwpaehvrtsbdggfzeqnca`

Recall that the original grid looks as follows:

`500170080319050400000300015200080000400705009000090004650002000002060548040037002`

With the numbers used added to the end of the grid, it might look like this:

`50017008031905040000030001520008000040070500900009000465000200000206054804003700215926`

Now, we randomly combine the two: (split across two lines here due to design considerations)

`z50w0ena1r7o008tiq0zo3bz1ae9nsf0z5t0r4n0mlgl0g0le0e0as3rhh00tr0rag1h5lm2ol00ccn0l8mz`

`r0v0ho0z0400lp7mpwp0a500eh9v0r0t0sbd090gg0046f500ze02000q0020n6054c80400a3700215926`

We could even put in extra characters at the end, since only the first 81 characters will be used:

`z50w0ena1r7o008tiq0zo3bz1ae9nsf0z5t0r4n0mlgl0g0le0e0as3rhh00tr0rag1h5lm2ol00ccn0l8mzr0v`

`0ho0z0400lp7mpwp0a500eh9v0r0t0sbd090gg0046f500ze02000q0020n6054c80400a3f70gg02a1e59f26x`

Here's how it would look if we split it up into 32-character chunks:

`z50w0ena1r7o008tiq0zo3bz1ae9nsf0`

`z5t0r4n0mlgl0g0le0e0as3rhh00tr0r`

`ag1h5lm2ol00ccn0l8mzr0v0ho0z0400`

`lp7mpwp0a500eh9v0r0t0sbd090gg004`

`6f500ze02000q0020n6054c80400a3f7`

`0gsssg02ea1e5dllg9famezelr2ezz6x`

And there you go. Not indecipherable but certainly very misleading. To decipher, first separate the numerals from the alphabetic characters, then write out the first 81 characters within a 9x9 grid. Then fill another 9x9 grid with the first 81 numerals (one digit per box), omitting the zeros, and solve the resulting Sudoku puzzle. Take the remaining (non-zero) digits and highlight the positions in the grid corresponding to those digits. The characters corresponding to the non-highlighted positions can be discarded. Now it only remains to reverse the substitution cipher and insert spaces where necessary.

Challenge:

`dszb5c0ygo01ize7c0u0n80s31rwq9m0`

`5f0400ll0r00mw30atdx00r15gzs2000`

`800jq0z0n400bz7j05obo00rtbt90u0v`

`f0u0oo9r0v00pmfvzoxxmg4650f00g20`

`0mnz000o20rgo605ub48ubzxxo040a03`

`70kteo02osz3dfle4eraosme2tn5sddq`

### Implementation in Python

All the relevant code is available in the repository for this website, under code/sudokucode. The 3 main files are shown below.

#### utils.py

```
import json
import re
import random
import string
A_ORD = ord('a')
NOT_LETTERS_RE = re.compile('[^a-z]')
LETTERS_RE = re.compile('[a-z]')
NUMBERS_RE = re.compile('[0-9]')
POSSIBLE_DIGITS = map(str, range(1, 10))
# Does the reverse alphabet cipher thing on a string`
def reverse_cipher(plaintext):
real_letters = []
for letter in plaintext:
real_letters.append(chr(A_ORD + 25 - (ord(letter) - A_ORD)))
# whatever it works don't h8
return ''.join(real_letters)
def contains_only_letters(plaintext):
return NOT_LETTERS_RE.search(plaintext) is None
def sometimes():
return random.choice((True, False))
class SudokuCoder:
def __init__(self):
try:
self.grids = json.load(open('grids.json'))
except IOError:
# No grids in memory - can only encode, not decode.
self.grids = {}
def encode(self, plaintext):
plaintext = plaintext.lower()
num_letters = len(plaintext)
letters = reverse_cipher(plaintext)
if not contains_only_letters(plaintext):
raise CannotEncodeError('Can only contain alphabetic characters.')
if num_letters > 45:
raise CannotEncodeError('Maximum plaintext length: 45 characters.')
# Randomly choose a grid.
initial_grid = random.choice(self.grids.keys())
grid = self.grids[initial_grid]
# Randomly choose some digits to use as the holes.
num_digits = num_letters / 9 + 1
digits = random.sample(POSSIBLE_DIGITS, num_digits)
letter_index = 0
new_grid = []
# Now replace all the hole digits with the plaintext.
for digit in grid:
if digit in digits and letter_index < num_letters:
new_grid.append(letters[letter_index])
letter_index += 1
else:
# Choose a random character
# For both the extra ones and the nonsense ones
new_grid.append(random.choice(string.lowercase))
# Add extra characters depending on the number of digits
for digit in xrange(num_digits):
new_grid.append(random.choice(string.lowercase))
total_digits = initial_grid + ''.join(digits)
# Now randomly combine them.
grid_length = len(new_grid)
total_length = grid_length * 2
ciphertext = []
letter_index = 0
digit_index = 0
while letter_index < grid_length or digit_index < len(total_digits):
if ((sometimes() and letter_index < grid_length) or
digit_index == len(total_digits)):
ciphertext.append(new_grid[letter_index])
letter_index += 1
else:
ciphertext.append(total_digits[digit_index])
digit_index += 1
return ''.join(ciphertext)
def decode(self, ciphertext):
ciphertext = ciphertext.lower()
# Get the grid numbers (the first 81 digits).
all_numbers = NUMBERS_RE.findall(ciphertext)
initial_grid = ''.join(all_numbers[:81])
hole_numbers = all_numbers[81:]
all_letters = LETTERS_RE.findall(ciphertext)
grid_letters = all_letters[:81]
# Check if the solution to this initial grid exists.
if not initial_grid in self.grids:
raise GridNotFoundError
# Get the list indices of the hole numbers.
solution_grid = self.grids[initial_grid]
hole_indices = []
for i in xrange(len(solution_grid)):
if solution_grid[i] in hole_numbers:
hole_indices.append(i)
hole_letters = [grid_letters[index] for index in hole_indices]
plaintext = reverse_cipher(hole_letters)
return plaintext
def add_grid(self, initial, solution):
if not initial in self.grids:
self.grids[initial] = solution
json.dump(self.grids, open('grids.json', 'w'))
else:
raise GridAlreadyExists
class GridNotFoundError(Exception):
pass
class CannotEncodeError(Exception):
pass
class GridAlreadyExists(Exception):
pass
```

#### encode.py

```
import sys
import utils
if len(sys.argv) == 2:
sudoku_coder = utils.SudokuCoder()
print sudoku_coder.encode(sys.argv[1])
else:
print "Usage: python encode.py [plaintext]"
```

#### decode.py

```
import sys
import utils
if len(sys.argv) == 2:
sudoku_coder = utils.SudokuCoder()
print sudoku_coder.decode(sys.argv[1])
else:
print "Usage: python decode.py [ciphertext]"
```

### Alternatives

Instead of encrypting the Sudoku puzzle itself, use a standard one. for example, if you and your imaginary recipient both have access to the New York Times, which we will pretend has a daily Sudoku puzzle if it doesn't actually, then just use that. Then, the only numbers you'd have to send along with the plaintext (and the extra characters) are the hole digits. This means that the key itself (or at least the prelude to the key) does not have to be distributed along with the message, resulting in a somewhat more secure scheme (for a given value of "secure"). This will work as long as you ensure that the Sudoku puzzle that you choose always has a unique solution.