Subsection 3.1 Preliminary Activity

Many of the exercises below ask students to analyze plaintext or ciphertext that is not specified. These texts may be provided by the instructor, based on the cryptographic methods described below. This flexibility is intentional: it allows customization of the specific texts students will play with, including tailoring ciphertexts to a college or university’s context.
It is useful to establish some cryptographic terminology which we will use for the remainder of this text. To that end, consider the definitions below:
Cryptography (from Greek kryptos-, meaning “hidden”) is secret communication by hiding the meaning of a message, not its existence. The process of hiding the meaning of a message is called encryption. The recipient of the message has to decrypt or decipher the message in order to read it.
Often the method of encryption relies on a key, some special number(s) or word(s) that only the sender and recipient know. Cryptanalysis is the study of cryptographic algorithms with the intent of recovering secret messages without knowing the secret key.
There are two basic tools that can be used in encryption algorithms: transposition (rearranging the characters) and substitution (replacing characters with other characters). The emphasis of these notes will be on substitution ciphers.
A (substitution) cipher is an algorithm for encrypting a message into apparently unintelligible text. Each cipher may be viewed as a function
\begin{equation*} f:\mathcal{P}\to\mathcal{C} \end{equation*}
where \(\mathcal{P}\) is the space of plaintext, or readable text in a given alphabet (here the English alphabet consisting of the \(26\) lowercase English letters will be used unless specified otherwise) and \(\mathcal{C}\) is the space of ciphertext or encrypted text (here the uppercase \(26\)-letter English alphabet unless otherwise specified). We also identify each letter in \(\mathcal{P}\) and \(\mathcal{C}\text{,}\) in order, with an element of \(\mathbb{Z}/26\mathbb{Z}\) (so that \(a\leftrightarrow0,b\leftrightarrow1\text{,}\) etc.). It will be clear from context which representation of \(\mathcal{P}\) and \(\mathcal{C}\) we are using. If \(f\) is injective (and hence, since here \(\mathcal{P}\) and \(\mathcal{C}\) are finite sets of the same size, bijective), we say it is a monoalphabetic (substitution) cipher.
The use of uppercase in the ciphertext and lowercase in the plaintext is a cryptographic convention designed to distinguish the two types of text, especially in partially-deciphered messages; we will often ignore the distinction and say, for example, “\(f(p)=p+3\) is the shift cipher with shift \(+3\)” where lowercase \(p\in\mathcal{P},c\in\mathcal{C}\) denote individual letters. We will commonly refer to a particular \(p\in\mathcal{P}\text{,}\) which we will call simply the plaintext, and its image \(c:=f(p)\in\mathcal{C}\) (the ciphertext). Notably, for courses without mathematical prerequisites, such as the one taught by the author using the flow of concepts below, the spaces and notation \(\mathcal{P}\) and \(\mathcal{C}\) are hidden, and ciphers are studied by analyzing particular \(p\) and \(f(p)\text{,}\) as well as by determining \(f^{-1}(c)\) given particular \(c\in\mathcal{C}\text{.}\)
1. One may begin by simply giving students various encrypted messages to decode along with a handful of hints. Ask students to work in groups of 3-4, with members of each group serving as the facilitator, the recorder, and the reporter (who shares the group’s work with the class), respectively. For instructors following the order of topics below, it is important that at least one message be a shift cipher, that is, a cipher of the form \(f(p)=p+k \bmod 26\) for some \(k\in\mathbb{Z}/26\mathbb{Z}\text{.}\) In order for students to be able to systematically decrypt a shift cipher, the message should either be fairly long (students can split the message into pieces when decrypting it) or otherwise rigged so that plaintext ‘e’ is one of the top three most common letters in the plaintext.
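For instructors who want to generate their own shift-cipher messages, the formula \(f(p)=p+k \bmod 26\) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `shift_encrypt` and `shift_decrypt` are our own, and we assume that non-letter characters (spaces, punctuation) are passed through unchanged.

```python
import string

def shift_encrypt(plaintext: str, k: int) -> str:
    """Apply f(p) = p + k mod 26 to each letter; other characters pass through."""
    out = []
    for ch in plaintext.lower():
        if ch in string.ascii_lowercase:
            p = ord(ch) - ord("a")            # identify a <-> 0, b <-> 1, ...
            c = (p + k) % 26                  # the shift itself
            out.append(chr(c + ord("A")))     # ciphertext is written in uppercase
        else:
            out.append(ch)
    return "".join(out)

def shift_decrypt(ciphertext: str, k: int) -> str:
    """Invert the shift: p = c - k mod 26, returning lowercase plaintext."""
    out = []
    for ch in ciphertext.upper():
        if ch in string.ascii_uppercase:
            c = ord(ch) - ord("A")
            out.append(chr((c - k) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

print(shift_encrypt("attack at dawn", 3))   # DWWDFN DW GDZQ
```

Leaving spaces in place makes the message easier for beginners; stripping them first is a simple way to raise the difficulty.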
It is recommended to give students two other messages as well. At this stage students are often not prepared for ciphers \(f\) which are not bijective, so one may use, for example, a keyword cipher
\begin{equation*} f=\begin{pmatrix}a & b & c & d & e & f & g & h & \dots & z \\ K & E & Y & W & O & R & D & A & \dots & Z \end{pmatrix} \end{equation*}
where the keyword is a chosen English word and the plaintext, again, is long enough or otherwise rigged so that the plaintext letters occur with similar relative frequencies to those of English as a whole. Note the following features of the keyword cipher. First, the keyword is spelled out under the first \(n\) letters of the plaintext alphabet, where \(n\) is the number of distinct letters in the keyword. Then the alphabet is spelled out in order, except that when a letter has already been written as part of the keyword, it is skipped and we move on to the next letter of the alphabet.
The above keyword cipher illustrates a cipher alphabet in function form: a representation of the ordered pair \((\mathcal{P},f(\mathcal{P}))\) for a given cipher \(f\text{.}\) Of course, any other representation of \((\mathcal{P},f(\mathcal{P}))\text{,}\) such as a table of values, will do (and tables of values are commonly used to represent cipher alphabets). Students often decrypt such a message without identifying the keyword, but it is fruitful to point out that identifying a pattern will assist in the decryption process at this stage.
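The construction of the keyword cipher alphabet described above can be sketched as follows. The helper names `keyword_alphabet` and `keyword_encrypt` are hypothetical, chosen for illustration; the sketch assumes the keyword consists of English letters and drops repeated letters, keeping the first occurrence.

```python
import string

def keyword_alphabet(keyword: str) -> str:
    """Return the 26-letter cipher alphabet: the keyword's distinct letters,
    then the remaining letters of the alphabet in order."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def keyword_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with the table of values a -> first cipher letter, b -> second, ..."""
    table = dict(zip(string.ascii_lowercase, keyword_alphabet(keyword)))
    return "".join(table.get(ch, ch) for ch in plaintext.lower())

print(string.ascii_lowercase)          # plaintext row of the cipher alphabet
print(keyword_alphabet("KEYWORD"))     # KEYWORDABCFGHIJLMNPQSTUVXZ
```

Printing the two rows one above the other gives a table of values for \(f\) that matches the function form shown above.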
Finally, it is recommended to assign one unsolved cipher such as that used in Part \(4\) of the Kryptos sculpture [bauerJamesSanbornKryptos2016]. Pass on (unattributed) to the students the hints that artist Jim Sanborn gave: the ciphertext letters NYPVTT decrypt to BERLIN, and the word immediately following BERLIN is CLOCK. Then, give students time in groups to attempt to solve all three ciphers. Usually many students solve the shift cipher, a good handful (with my assistance) solve the keyword cipher, and generally these students move on to Kryptos, notice it’s not monoalphabetic since (by the hints) \(i,n\mapsto T\text{,}\) get frustrated, wonder about spacing, and find themselves at a loss. At this stage one may tell them something like, “this is an unsolved cipher known as Kryptos Part \(4\text{;}\) solve it and you’ll be famous!”
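The observation that the hint rules out a monoalphabetic cipher can also be checked mechanically. The sketch below, using a hypothetical helper `monoalphabetic_conflicts`, aligns a ciphertext fragment with its known plaintext and reports every ciphertext letter that would need more than one preimage, which an injective \(f\) forbids.

```python
def monoalphabetic_conflicts(ciphertext: str, plaintext: str) -> dict:
    """Map each ciphertext letter to the plaintext letters aligned with it,
    keeping only letters with more than one distinct preimage."""
    preimages = {}
    for c, p in zip(ciphertext.upper(), plaintext.lower()):
        preimages.setdefault(c, set()).add(p)
    return {c: sorted(ps) for c, ps in preimages.items() if len(ps) > 1}

print(monoalphabetic_conflicts("NYPVTT", "BERLIN"))   # {'T': ['i', 'n']}
```

An empty result would mean only that the fragment is consistent with some monoalphabetic cipher, not that the cipher is one.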
Follow up by letting them know that we are not trying to be cruel, then ask them why we would give them an unsolved code. Based on our conversation, one may follow up with questions like, “What do you think mathematicians spend most of their time doing? How do you define ‘success’ when working on an unsolved problem?” We emphasize that making sense of a problem, increasing their depth of understanding (what methods DON’T work?), describing their process of engagement, and even failing are all completely normal aspects of doing mathematics. Even problems that are solved often took years or even millennia to solve; therefore, there is no shame or cause for alarm if they don’t understand something immediately. This helps ground students as we move on to more and more abstract and difficult mathematics.
Now, ask students what strategies they used to decrypt the messages above. For the first message, students may use brute force, simply trying various shifts until one works. However, a more interesting strategy involves counting which ciphertext letter is the most frequent and guessing that letter corresponds to plaintext ‘e’.
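Both strategies are easy to sketch in Python. The function names below are our own, and the frequency guess assumes, as in the setup above, that the most common ciphertext letter really does correspond to plaintext ‘e’; on short or unusual messages that assumption can fail, in which case brute force is the fallback.

```python
import string
from collections import Counter

def shift_decrypt(ciphertext: str, k: int) -> str:
    """Undo a shift by k: p = c - k mod 26; non-letters pass through."""
    out = []
    for ch in ciphertext.upper():
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") - k) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext: str) -> list:
    """Return all 26 candidate decryptions as (shift, text) pairs."""
    return [(k, shift_decrypt(ciphertext, k)) for k in range(26)]

def guess_shift_by_frequency(ciphertext: str) -> int:
    """Guess k by assuming the most frequent ciphertext letter is plaintext 'e'."""
    counts = Counter(ch for ch in ciphertext.upper() if ch in string.ascii_uppercase)
    if not counts:
        raise ValueError("ciphertext contains no letters")
    most_common_letter, _ = counts.most_common(1)[0]
    return (ord(most_common_letter) - ord("E")) % 26

message = "PHHW PH KHUH DW VHYHQ"
k = guess_shift_by_frequency(message)
print(k, shift_decrypt(message, k))   # 3 meet me here at seven
```

In class, the brute-force list is useful for showing that only one of the 26 candidates reads as English, while the frequency guess shows how a single count can replace up to 25 trials.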