Cryptography Basics🔒
So, you’ve decided that you want to learn cybersecurity. What better place to start than with cryptography - an entire field dedicated to secure communication!
Now, hold on a minute. What does “secure” really mean? And how do we make things “secure”? Let’s get started!
Two thousand years ago, in the great Roman Empire, there lived a man named Julius Caesar. He wasn’t just an ordinary man – he served as the Roman Dictator and is regarded as one of the greatest military commanders in history.
Julius Caesar is also credited with one of the oldest and simplest forms of encryption - the Caesar cipher. Let’s see it in action.
Suppose I want to send you an important message, “Attack at Dawn.” However, I’m worried that someone might intercept the messenger and read the secret message.
What I’ll do is replace every letter in the message with a different letter that’s some number of positions down the alphabet. For example, if the shift is 3 letters to the right, then the word “AB” becomes “DE”. If the letter goes past “Z” when shifted, then it wraps back to the beginning of the alphabet. For example, “Y” shifted by 3 will become “B”.
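Here is a minimal Python sketch of the shifting rule just described. The function name `caesar_encrypt` and the choice to uppercase the message and leave spaces untouched are assumptions made for illustration, not part of the original cipher description.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift every letter `shift` positions to the right, wrapping past Z."""
    result = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            # The modulo wraps positions past "Z" back to the start of the alphabet.
            new_index = (ALPHABET.index(ch) + shift) % 26
            result.append(ALPHABET[new_index])
        else:
            result.append(ch)  # assumption: spaces and punctuation pass through unchanged
    return "".join(result)

print(caesar_encrypt("AB", 3))  # DE
print(caesar_encrypt("Y", 3))   # B
```

The `% 26` is what makes “Y” wrap around to “B” instead of running off the end of the alphabet.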
What would the word “DOG” become?
Hint: when the letter “D” is shifted down 3 spots, it becomes “E” -> “F” -> “G”.
Now, as long as you also know the secret shift that I used to encrypt the message, then you can reverse the steps to recover the original message.
If you receive the encrypted word “MJQQT” and know that the shift is 5, what would the original word be?
Hint: when “M” is un-shifted by 5, we arrive at “H”.
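Reversing the steps can be sketched the same way: move each letter to the left instead of to the right. As before, the name `caesar_decrypt` and the handling of non-letters are illustrative assumptions.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift by moving each letter `shift` positions to the left."""
    result = []
    for ch in ciphertext.upper():
        if ch in ALPHABET:
            # Python's % always returns a non-negative result, so "B" - 3 wraps to "Y".
            result.append(ALPHABET[(ALPHABET.index(ch) - shift) % 26])
        else:
            result.append(ch)  # assumption: non-letters were never shifted
    return "".join(result)

print(caesar_decrypt("DE", 3))  # AB
print(caesar_decrypt("B", 3))   # Y
```

You can check your answer to the question above by calling the function with the encrypted word and a shift of 5.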
In cryptography terms, the original message is called the plaintext and the encrypted message is called the ciphertext.
In our case, the shift acts as the key – it’s a secret value known only to the people exchanging messages and not to anyone else. This way, only the people the message was intended for can decrypt it and read the contents.
However, because the Caesar cipher is so simple, anyone can decrypt an encrypted message without knowing the secret value. Do you see how?
How many possible secret keys (shifts) are there? Would it be feasible to try all of them?
Notice that there are only 26 different possible keys (i.e. shifts) and that you can try all of them until you get a correct message. This technique is known as brute-forcing: trying all possible keys in the hope of getting a message that looks right.
Take the last encrypted message, “MJQQT,” as an example. Let’s say that we’ve intercepted this message, but we don’t know the secret shift. If we try a few shifts, we get the values:
- LIPPS (shift = 1)
- HELLO (shift = 5)
- CZGGJ (shift = 10)
- SPWWZ (shift = 20)
It’s quite easy to deduce which shift is correct - all of the other ones produce garbled characters.
Of course, this cipher isn’t very advanced. By simply trying all 26 possibilities, we can recover the original message.
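The brute-force attack is short enough to sketch in a few lines of Python. The helper names `caesar_decrypt` and `brute_force` are assumptions for this example; the code simply tries every shift from 0 to 25 and leaves it to a human to spot the candidate that reads as English.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Move each letter `shift` positions to the left, wrapping around."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in ciphertext.upper()
    )

def brute_force(ciphertext: str) -> dict:
    """Return every candidate plaintext, keyed by the shift that produced it."""
    return {shift: caesar_decrypt(ciphertext, shift) for shift in range(26)}

# Print all 26 candidates; only one of them should look like a real word.
for shift, candidate in brute_force("MJQQT").items():
    print(f"shift = {shift:2d}  {candidate}")
```

The line for shift = 5 reads HELLO, matching the list above. For longer messages you could automate the “looks right” check, for example with a word list, but for a cipher this small reading the output is enough.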
Substitution Cipher
Now, we’ll look at a more complicated encryption scheme: the substitution cipher.
Similar to the Caesar cipher, substitution ciphers encrypt messages by substituting each letter in the message with a different one, according to a fixed mapping between letters.
For example, imagine we have the mapping:
A -> D
B -> C
C -> G
Then, the message ABC would be encrypted to the message DCG.
What would the message CCAB become?
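In code, a substitution cipher is just a lookup table. The sketch below uses the three-letter example mapping from above; the names `MAPPING`, `substitute`, and `invert` are made up for illustration, and a real key would cover all 26 letters.

```python
# The three-letter example mapping from the text (a real key covers all 26 letters).
MAPPING = {"A": "D", "B": "C", "C": "G"}

def substitute(message: str, mapping: dict) -> str:
    """Replace each letter using `mapping`; characters without an entry pass through."""
    return "".join(mapping.get(ch, ch) for ch in message)

def invert(mapping: dict) -> dict:
    """Build the decryption table by swapping plaintext and ciphertext letters."""
    return {cipher: plain for plain, cipher in mapping.items()}

ciphertext = substitute("ABC", MAPPING)
print(ciphertext)                                # DCG
print(substitute(ciphertext, invert(MAPPING)))   # ABC
```

Decryption uses the same function with the table reversed, which is why both sides need to know the full mapping.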
In reality, the mapping would send each of the 26 letters of the alphabet to one of the 26 letters, with no letter used twice. Unlike the Caesar cipher, this is far more difficult to brute-force by trying out all possible mappings!
In fact, there are over 4 * 10^26 possible mappings (26!, the number of ways to order 26 letters). That’s more than 400 trillion trillion!
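To see where that number comes from, here is a small sketch that counts the possible keys and builds one random key. The guessing speed of one billion keys per second is a made-up assumption for scale, and `random_key` is a hypothetical helper (Python's `random` module is fine for a demo but is not meant for real cryptographic keys).

```python
import math
import random
import string

# The number of ways to arrange 26 letters: 26 * 25 * ... * 1.
key_count = math.factorial(26)
print(key_count)           # 403291461126605635584000000
print(f"{key_count:.2e}")  # 4.03e+26

# Assumption for scale: an attacker who can test one billion keys per second.
seconds = key_count / 1_000_000_000
years = seconds / (365 * 24 * 3600)
print(f"about {years:.1e} years to try every key")

def random_key(seed=None) -> dict:
    """Build one substitution key by shuffling the alphabet (demo only)."""
    letters = list(string.ascii_uppercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))

print(random_key(seed=1))
```

Even at that hypothetical speed, trying every key would take on the order of ten billion years, so brute force is off the table.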
Attacking the Substitution Cipher
At first glance, the substitution cipher seems far more secure than the flimsy Caesar cipher: there’s no way that someone can try out all possible mappings, and even then, how would they know which mapping is the correct one?
Well, it turns out that there’s one crucial technique for cracking this cipher: frequency analysis.
In English text, not all letters appear equally often. For example, which character appears more commonly?
Vowels are generally more likely to appear than consonants. So, one simple but effective strategy is to guess that the most commonly occurring characters are vowels.
We can extend this strategy even further: the most common letters in English are roughly ‘etaoinshrdlu’, in this order. There’s even a Wikipedia article about this phrase.
So, we can map the most common character to e, the second most common character to t, and so on.
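This first-guess strategy can be sketched with Python's `collections.Counter`. The function names `letter_ranking` and `first_guess` are assumptions for illustration, and the result is only a starting point: it pairs letters up purely by rank.

```python
from collections import Counter

ENGLISH_ORDER = "etaoinshrdlu"  # most common English letters, as quoted in the text

def letter_ranking(ciphertext: str) -> list:
    """Return the ciphertext's letters ordered from most to least frequent."""
    counts = Counter(ch for ch in ciphertext.lower() if ch.isalpha())
    return [letter for letter, _ in counts.most_common()]

def first_guess(ciphertext: str) -> dict:
    """Pair the most frequent ciphertext letters with e, t, a, o, ... in order."""
    # zip stops at the shorter sequence, so only the top 12 letters get a guess.
    return dict(zip(letter_ranking(ciphertext), ENGLISH_ORDER))

# Toy input: x appears 4 times, y 3 times, z twice, q once.
print(first_guess("xxxxyyyzzq"))  # {'x': 'e', 'y': 't', 'z': 'a', 'q': 'o'}
```

On a short message the ranking will often be off by a few places, so treat the output as a set of guesses to refine rather than a finished key.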
Clearly, this won’t always work! What happens if the text contains more t’s than e’s, like the book Gadsby, which doesn’t use the letter e a single time?
In most cases, though, frequency analysis can get us pretty far, because the actual text will usually differ only slightly from the letter frequencies of the overall English language. Context helps too: for example, if you see a three-letter word th?, by far the most likely choice for the last letter is e.
Instead of only looking at individual letters, we can also look at combinations of letters – bigrams and trigrams!
Which sequence of two letters do you think is more common?
Which sequence of three letters do you think is more common?
Using additional information like the most common English bigrams and trigrams, we can narrow down the possible letter combinations!
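Counting bigrams and trigrams works just like counting single letters. In the sketch below, `ngram_counts` is a hypothetical helper, and the decision to count only within words (not across spaces) is an assumption that suits ciphertexts where word boundaries are preserved.

```python
from collections import Counter

def ngram_counts(text: str, n: int) -> Counter:
    """Count every run of n consecutive letters, staying inside word boundaries."""
    counts = Counter()
    for word in text.lower().split():
        letters = "".join(ch for ch in word if ch.isalpha())  # drop punctuation
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts

sample = "the other weather"
print(ngram_counts(sample, 2).most_common(3))  # [('th', 3), ('he', 3), ('er', 2)]
print(ngram_counts(sample, 3).most_common(1))  # [('the', 3)]
```

Running the same function on a ciphertext shows which letter pairs and triples repeat most often, and those are good candidates for common English sequences.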
For example, the following phrase:
Pwblxehqx br oehqcgx oeh dg csvwpgq rzbt e xshvcg oehqcg, ehq pwg csrg br pwg oehqcg kscc hbp dg xwbzpghgq. Weffshgxx hgagz qgozgexgx di dgshv xwezgq. Dlqqwe.
can be decrypted almost instantly on the website quipqiup, which uses statistical information to deduce the substitution alphabet.
You’ll use these techniques in the challenge exercise for this chapter – the only difference is that you’ll be working with three alternating alphabets, instead of only one!
Modern Ciphers
As technology and new cryptanalysis techniques grew increasingly advanced, these ciphers became easier to break.
Symmetric Ciphers
Summary