Math 4100 Lecture Notes on Cryptography

1 Some old cryptographic techniques
2 RSA

1 Some old cryptographic techniques

1.1 Caesar ciphers

Shift all the letters of the alphabet a fixed amount, wrapping around if necessary.
Do this to each letter in the message.
E.g., shifting by 3 means that A is replaced by D, B is replaced by E, C is replaced by F, and so on. Towards the end of the alphabet we have that V is replaced by Y, and W is replaced by Z. For the remaining letters we wrap back around to the front of the alphabet so that X is replaced by A, Y is replaced by B, and Z is replaced by C.
Notice this is really just arithmetic mod 26.
As an example, if we encrypt the string ALAS POOR YORICK by shifting each letter by 3 we have
```
ALAS POOR YORICK
DODV SRRU BRULFN
```
To decrypt, we simply shift backwards three spaces. So D is replaced by A, E is replaced by B, and so forth.
We can then easily decrypt L NQHZ KLP KRUDWLR:
```
L NQHZ KLP KRUDWLR
I KNEW HIM HORATIO
```
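Since this is just arithmetic mod 26, the shifting is easy to write down in code. Here is a minimal sketch, assuming the message uses only uppercase letters A-Z plus spaces; the function name `caesar` is our own choice and not part of any library.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z'


def caesar(text: str, shift: int) -> str:
    """Shift each uppercase letter by `shift` places, wrapping mod 26."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)


print(caesar("ALAS POOR YORICK", 3))     # encrypt by shifting forward 3
print(caesar("L NQHZ KLP KRUDWLR", -3))  # decrypt by shifting back 3
```

Decryption is the same function with the shift negated, which is why knowing the shift amount is all it takes to read the message.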
Supposedly this cipher was used by Julius Caesar, and for this reason it is called a Caesar cipher.
It's extremely easy to break messages encrypted using a Caesar cipher, even if we do not know the key (i.e., even if we don't know how much each letter has been shifted by). If we have enough ciphertext to look at, we can perform a frequency analysis. Frequencies of English letters are well known: E occurs about 12% of the time, A about 8% of the time, O about 7.5% of the time, etc.
If you give me a long piece of text and I see that 'H' occurs 12% of the time, it's a fair bet that your cipher replaced E with H, and so F with I, G with J, and so on, and I can decrypt your message.
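The frequency guess can be sketched as follows. This assumes uppercase text in which E really is the most common letter, which is likely for long English passages but not guaranteed for short ones; the sample sentence and the helper names `guess_shift` and `shift_text` are made up for illustration.

```python
from collections import Counter


def shift_text(text: str, shift: int) -> str:
    """Shift uppercase letters by `shift` places mod 26; leave other characters alone."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def guess_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most common ciphertext letter stands for E."""
    letters = [ch for ch in ciphertext if "A" <= ch <= "Z"]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ord(most_common) - ord("E")) % 26


ciphertext = shift_text("MEET ME BEFORE THE SEVEN BELLS", 3)
shift = guess_shift(ciphertext)
print(shift, shift_text(ciphertext, -shift))
```

A more careful attack would compare the whole letter distribution against English rather than only the single most common letter.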
Another possibility is just to try all possible Caesar ciphers. There are only 26 of them, and so it's not that difficult, given a piece of ciphertext, to look at all possible plaintexts it could decode to. If 25 of these are gibberish and one is an English sentence, it's a good bet that the English sentence is the intended plaintext.
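Trying every shift takes only a few lines. The sketch below, again assuming uppercase text and using helper names of our own, lists all 26 candidate plaintexts so a reader can pick out the one that is English.

```python
def shift_text(text: str, shift: int) -> str:
    """Shift uppercase letters by `shift` places mod 26; leave other characters alone."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def all_shifts(ciphertext: str) -> list:
    """Return the 26 candidate plaintexts; entry s undoes a shift of s."""
    return [shift_text(ciphertext, -s) for s in range(26)]


for s, candidate in enumerate(all_shifts("L NQHZ KLP KRUDWLR")):
    print(f"{s:2d}  {candidate}")
```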

1.2 Vigenère ciphers

Vigenère ciphers are basically several Caesar ciphers put together.
Pick a secret word, say HAMLET. Beneath each letter of your message, write the letters of HAMLET, repeating as necessary:
```
TO SLEEP PERCHANCE TO DREAM
HA MLETH AMLETHAML ET HAMLE
```
Now encrypt each letter using the Caesar cipher determined by the letter of HAMLET directly below that letter. So, here we'd encode T by the Caesar cipher sending A -> H, we encode O by the Caesar cipher sending A -> A, we encode S by the Caesar cipher sending A -> M, and so on. This results in
```
AO EWIXW PQCGAHNOP XH KRQLQ
```
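The procedure above can be sketched in code. We assume uppercase text and, matching the layout of the example, that spaces are kept and do not use up a key letter; the function name `vigenere` is our own.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Apply a Vigenere cipher to uppercase text; the key advances only on letters."""
    out = []
    i = 0  # index of the next key letter to use
    for ch in text:
        if "A" <= ch <= "Z":
            shift = ord(key[i % len(key)]) - ord("A")  # A -> 0, H -> 7, ...
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            i += 1
        else:
            out.append(ch)  # spaces do not consume a key letter
    return "".join(out)


print(vigenere("TO SLEEP PERCHANCE TO DREAM", "HAMLET"))
```

Calling the same function with `decrypt=True` and the same key undoes the encryption.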
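Though a bit more complicated than the regular Caesar cipher, people have known how to break Vigenère ciphers since the 1850s. If you have a long enough string of text, then by chance some repeated groups of letters will line up with the same key letters, and so will be encoded the same way multiple times. In the example below, the starred occurrences of BL all sit above HA: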
```
BLAHBLEEPBLOOPBLAHBLEEPBLOOPBLAHBLEEPBLOOPBLAHBLEEPBLOOP
HAMLETHAMLETHAMLETHAMLETHAMLETHAMLETHAMLETHAMLETHAMLETHA
**                **                      **
```
This means that in the ciphertext, the same patterns of letters appear over and over.
The distance between these repetitions is a multiple of the key length. This gives us a clue as to how long the key is. If we have a long enough piece of ciphertext, then we can narrow the key down to one of only a couple of possible lengths.
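To see the repetitions in practice, the sketch below encrypts the BLAH/BLEEP/BLOOP text with HAMLET and lists the distances between identical runs of letters in the ciphertext. The helper names are our own, and the text is assumed to contain only uppercase letters. Keep in mind that a few repeats can occur by pure coincidence, so in general not every distance found this way has to be a multiple of the key length.

```python
from collections import defaultdict
from itertools import combinations


def vigenere_encrypt(text: str, key: str) -> str:
    """Encrypt letters-only uppercase text with a repeating key."""
    return "".join(
        chr((ord(ch) + ord(key[i % len(key)]) - 2 * ord("A")) % 26 + ord("A"))
        for i, ch in enumerate(text)
    )


def repeat_distances(ciphertext: str, n: int = 3) -> list:
    """Return the sorted distances between every pair of identical n-letter runs."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    distances = set()
    for where in positions.values():
        for a, b in combinations(where, 2):
            distances.add(b - a)
    return sorted(distances)


ciphertext = vigenere_encrypt("BLAHBLEEPBLOOP" * 4, "HAMLET")
print(ciphertext)
print(repeat_distances(ciphertext, n=2))  # distances between repeated pairs
print(repeat_distances(ciphertext, n=3))  # distances between repeated triples
```

The starred positions 0, 18 and 42 in the example show up here as the distances 18, 24 and 42 between repeated pairs, all multiples of the key length 6.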
Once we know how long the key is, we can cut our ciphertext into pieces where all of the characters in each piece are encoded in the same way. At this point we're just dealing with Caesar ciphers, and can use frequency analysis to break the cipher.
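Cutting the ciphertext into pieces is a matter of taking every sixth letter (for a key of length 6), starting from each of the six possible offsets. A small sketch, with a function name of our own choosing and assuming spaces are ignored as in the HAMLET example:

```python
def split_by_key_position(ciphertext: str, key_length: int) -> list:
    """Group the letters that were encrypted with the same key letter."""
    letters = [ch for ch in ciphertext if ch.isalpha()]
    return ["".join(letters[i::key_length]) for i in range(key_length)]


pieces = split_by_key_position("AO EWIXW PQCGAHNOP XH KRQLQ", 6)
for position, piece in enumerate(pieces):
    print(position, piece)
```

Each piece is an ordinary Caesar ciphertext. The first one, for instance, collects the letters encrypted under H, so shifting it back by 7 recovers the corresponding plaintext letters. With a message this short there is far too little text per piece for frequency analysis to work; the method needs a long ciphertext.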

1.3 One-time pads

There is one way to modify a Vigenère cipher to get a "perfect" encryption method.
Use a randomly generated key for the Vigenère cipher, but make the key have the same length as the message, and never use the same key twice (hence the name "one-time pad"). If your key is truly random, then this encryption is impossible to break because every key is just as likely as every other key, and so all possible decipherings of the ciphertext are equally likely. Someone trying to decrypt your message has to consider all possible keys, but (if everything is really random) every possible message of that length occurs for some key.
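Here is a rough sketch of a one-time pad on letters-only uppercase messages. It draws the key with Python's `secrets` module as a stand-in for a truly random source; the function names and the second message are our own. The last few lines illustrate the point made above: for any other message of the same length, there is a key that turns it into exactly the same ciphertext.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def random_key(length: int) -> str:
    """Draw `length` letters uniformly at random from the OS random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def pad(text: str, key: str, sign: int = 1) -> str:
    """Add (sign=1) or subtract (sign=-1) the key letters from the text, mod 26."""
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(k)) % 26]
        for t, k in zip(text, key)
    )


message = "TOSLEEPPERCHANCETODREAM"
key = random_key(len(message))
ciphertext = pad(message, key)
print(ciphertext, pad(ciphertext, key, -1) == message)

# A different message of the same length, and the key that would explain
# the very same ciphertext as an encryption of it.
other = ("ALASPOORYORICK" * 2)[:len(message)]
other_key = pad(ciphertext, other, -1)
print(pad(other, other_key) == ciphertext)
```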

2 RSA

2.1 Public key cryptography

All of the methods described above have a serious flaw: they require that the people sending and receiving the messages both know the agreed-upon key. In practice this is not really a reasonable thing to expect. If I want to send messages across the Internet to someone in China, we both have to agree upon a key, but all of our communications discussing this key could be intercepted.
One way around this issue, which makes cryptography much more practical, is to use a pair of keys. One key is used for encrypting messages and is told to the entire world (the public key), and one key is used for decrypting messages and is kept secret (the private key).
The idea is that anyone and everyone can look up my public key and use it to encrypt a message they want to send me.
However, decrypting any message requires the private key, which only I know. So even if you see the encrypted version of a message someone else sends me, you won't be able to decrypt it if you don't have my private key.
The public key cryptosystem we'll describe is known as RSA (the letters are the initials of the three people who created the cryptosystem in the late 1970s: Rivest, Shamir, and Adleman).
Our private key will be a pair of (very large) prime numbers, p and q.
The public key will be two numbers: m = p*q, and k, which is some number relatively prime to φ(m). (Note we tell people the number m, which is the product of the two primes, and NOT the actual primes. The main thing is that if p and q are very large numbers, it's very difficult for someone to factor m and determine what p and q were.)
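As a small illustration, the sketch below builds a public key from two primes and checks the condition on k. It uses the fact that φ(m) = (p-1)(q-1) when m = p*q for distinct primes p and q. The function name is our own, and the code does not check that p and q are actually prime; the numbers are the toy values used in the example of section 2.4, far too small for real use.

```python
from math import gcd


def make_public_key(p: int, q: int, k: int) -> tuple:
    """Return the public key (k, m), checking that k is relatively prime to phi(m)."""
    m = p * q
    phi = (p - 1) * (q - 1)  # phi(m) for m = p*q with distinct primes p, q
    if gcd(k, phi) != 1:
        raise ValueError("k must be relatively prime to phi(m)")
    return k, m


print(make_public_key(991, 997, 13))
```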

2.2 Encryption with RSA

Suppose you know someone's public key (k, m) and want to send them a message.
Convert your message into a string of digits, using the same number of digits for every letter so that the string can be split apart again later (the example below uses two-digit ASCII codes, so A is 65, B is 66, C is 67, and so on).
Now cut this string up into segments, where each segment has at most as many digits as m and, read as a number, is less than m.
So for each segment we have a number.
Compute that number raised to the k-th power modulo m.
These values make up our encrypted message.
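The steps above can be sketched as follows. The encoding is an assumption on our part: we drop spaces and write each uppercase letter as its two-digit ASCII code, as in the example of section 2.4. The function names are our own, and the key is the toy key from that example. Python's built-in three-argument `pow` does the modular exponentiation.

```python
def text_to_chunks(text: str, width: int) -> list:
    """Write each letter as its 2-digit ASCII code and cut the digits into chunks."""
    digits = "".join(str(ord(ch)) for ch in text if ch != " ")
    return [int(digits[i:i + width]) for i in range(0, len(digits), width)]


def rsa_encrypt(chunks: list, k: int, m: int) -> list:
    """Raise each chunk to the k-th power modulo m."""
    if any(x >= m for x in chunks):
        raise ValueError("every chunk must be smaller than m")
    return [pow(x, k, m) for x in chunks]


chunks = text_to_chunks("TO BE OR NOT TO BE", 6)
print(chunks)
print(rsa_encrypt(chunks, 13, 988027))
```

Note that equal chunks always encrypt to equal numbers, so repeated blocks of plaintext remain visible in the ciphertext.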

2.3 Decryption with RSA

Someone sends us a message, that is, a sequence of numbers, encrypted using our public key (k, m).
We can compute the k-th roots of these numbers fairly easily: find integers u, v solving ku - φ(m) v = 1 with u positive, then raise each number to the u-th power modulo m.
Notice that solving this equation requires that we know φ(m). Since p and q are distinct primes, we know φ(m) = φ(pq) = φ(p) φ(q) = (p-1)(q-1). However, someone who only knows m (and not the primes p and q) would first need to factor m, and this is computationally infeasible provided m is a large enough number.
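Why does raising to the u-th power undo the encryption? If x is the original number and x is relatively prime to m, then since ku = 1 + φ(m) v, Euler's theorem gives (x^k)^u = x * (x^φ(m))^v ≡ x mod m. The sketch below finds u with the extended Euclidean algorithm and then decrypts; the function names are our own, and the numbers are the toy values from the example in section 2.4.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def decryption_exponent(k: int, p: int, q: int) -> int:
    """Find a positive u with k*u - phi(m)*v == 1 for some integer v."""
    phi = (p - 1) * (q - 1)
    g, x, _ = extended_gcd(k, phi)
    if g != 1:
        raise ValueError("k must be relatively prime to phi(m)")
    return x % phi  # move the solution into the range 0 < u < phi(m)


def rsa_decrypt(numbers: list, u: int, m: int) -> list:
    """Raise each number to the u-th power modulo m."""
    return [pow(c, u, m) for c in numbers]


p, q, k = 991, 997, 13
u = decryption_exponent(k, p, q)
print(u)
print(rsa_decrypt([pow(69, k, p * q)], u, p * q))  # encrypt 69, then decrypt it
```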

2.4 Example

Consider an example where our private key is p = 991, q = 997.
Let's use the public key k = 13, m = 988027.
If someone wants to send us the message
```
TO BE OR NOT TO BE
```
using the public key, they first drop the spaces and convert each letter into its ASCII code, giving an array of integers:
```
[84, 79, 66, 69, 79, 82, 78, 79, 84, 84, 79, 66, 69]
```
Now we glue these into 'chunks' with <= 6 digits, since 988027 has 6 digits:
```
[847966, 697982, 787984, 847966, 69]
```
For each of the values in this array we compute the number raised to the 13th power, modulo 988027:
```
[784422, 109959, 469002, 784422, 819176]
```
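Now glue these chunks together into one number (writing each chunk with 6 digits, padding with leading zeros if needed, so that the chunks can be separated again) to get our encrypted message: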
```
784422109959469002784422819176
```
To decrypt this message we just reverse the process. First we separate the ciphertext 784422109959469002784422819176 back into chunks of length <= 6:
```
[784422, 109959, 469002, 784422, 819176]
```
Now we want to determine the 13th root, modulo 988027, of each of these numbers. This is the step that is very difficult to do computationally for anyone who does not know p and q. We know that to get the 13th root of each number, mod 988027, we can raise that number to the power of u and then reduce mod 988027, where u is a positive solution to the Diophantine equation

13u - φ(988027) v = 1

We already know 988027 = 991 * 997, so

φ(988027) = 990 * 996 = 986040.

However, someone who did not know the prime factorization 988027 = 991 * 997 would have to compute that prime factorization, and that could be very difficult if we choose very, very large primes.
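With primes as small as the ones in this example, an eavesdropper has no trouble at all. The sketch below factors m by plain trial division (one simple method among many; the function name is our own). It needs fewer than a thousand divisions here, but the number of divisions grows roughly like the square root of m, which is what makes the approach hopeless when p and q have hundreds of digits.

```python
def factor_semiprime(m: int) -> tuple:
    """Find the smallest factor d > 1 of m by trial division and return (d, m // d)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d, m // d
        d += 1
    raise ValueError("m has no nontrivial factor")


p, q = factor_semiprime(988027)
print(p, q, (p - 1) * (q - 1))  # the primes, and phi(m) computed from them
```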

Anyway, we compute that (u, v) = (303397, 4) (indeed, 13 * 303397 - 986040 * 4 = 3944161 - 3944160 = 1), and so we compute
```
784422^303397 mod 988027 = 847966
109959^303397 mod 988027 = 697982
469002^303397 mod 988027 = 787984
784422^303397 mod 988027 = 847966
819176^303397 mod 988027 = 69
```
This gives us back the blocks
```
[847966, 697982, 787984, 847966, 69]
```
Splitting this into 2-digit numbers gives
```
[84, 79, 66, 69, 79, 82, 78, 79, 84, 84, 79, 66, 69]
```
and finally converting these back to characters (using ASCII), and putting the spaces back in, gives
```
TO BE OR NOT TO BE
```
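The last two steps, splitting the blocks into 2-digit numbers and converting them to characters, can be sketched as below. The function name is our own, and the code assumes every block has an even number of digits and that each pair of digits is the ASCII code of an uppercase letter. The spaces were never encoded, so they are not recovered here and have to be put back by the reader.

```python
def chunks_to_text(chunks: list) -> str:
    """Split each block into 2-digit ASCII codes and turn them into characters."""
    out = []
    for block in chunks:
        digits = str(block)
        out.extend(chr(int(digits[i:i + 2])) for i in range(0, len(digits), 2))
    return "".join(out)


print(chunks_to_text([847966, 697982, 787984, 847966, 69]))
```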
