Encryption, decryption, and cracking
One of the earliest encryption techniques is the Caesar Cipher, named after Julius Caesar, who used it more than two thousand years ago to communicate messages to his allies.
The Caesar Cipher is a great introduction to encryption, decryption, and code cracking, thanks to its simplicity.
Encrypting a message
Imagine Caesar wants to send this message:
SECRET MEETING AT THE PALACE
Here's what that might look like encrypted:
YKIXKZ SKKZOTM GZ ZNK VGRGIK
That looks an awful lot like gobbledygook at first, but this encrypted message is actually closely related to the original text.
The Caesar Cipher is a simple substitution cipher which replaces each original letter with a different letter in the alphabet by shifting the alphabet by a certain amount.
To make the encrypted message above, I shifted the alphabet by 6 and used this substitution table:
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C | D | E | F |
S shifts 6 letters over to Y, E shifts 6 letters over to K, etc. Here's the first word and its shifts:
S | E | C | R | E | T |
---|---|---|---|---|---|
Y | K | I | X | K | Z |
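The substitution table above can also be expressed as a short program. Here is a minimal Python sketch, assuming the message uses only the 26 letters A to Z and that spaces and punctuation pass through unchanged. The function name `caesar_encrypt` is made up for this illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift each letter A-Z forward by `shift`, wrapping around after Z."""
    result = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            # Find the letter's position, move it, and wrap with modulo 26.
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            # Assumption: anything that is not a letter is left as-is.
            result.append(ch)
    return "".join(result)


print(caesar_encrypt("SECRET MEETING AT THE PALACE", 6))
# YKIXKZ SKKZOTM GZ ZNK VGRGIK
```

The `% 26` is what makes U wrap around to A in the table: once a shifted position runs past Z, counting starts again from the beginning of the alphabet.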
Decrypting a message
According to historical records, Caesar always used a shift of 3. As long as his message recipient knew the shift amount, it was trivial for them to decode the message.
Imagine Caesar sends this message to a comrade:
EHZDUH EUXWXV
The comrade uses this substitution table, where the alphabet is shifted by 3:
A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | A | B | C |
They can then decode the message with certainty. The first letter "E" was shifted by 3 from "B", the second letter "H" was shifted by 3 from "E", etc. The result is this ominous message:
BEWARE BRUTUS
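Decryption is the same idea run backwards: each letter moves back by the shift amount. A minimal Python sketch, under the same assumption that only the letters A to Z are shifted (the name `caesar_decrypt` is hypothetical):

```python
def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Shift each letter A-Z backward by `shift`, wrapping around before A."""
    result = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            # Convert the letter to 0-25, move it back, and wrap with modulo 26.
            position = (ord(ch) - ord("A") - shift) % 26
            result.append(chr(ord("A") + position))
        else:
            # Assumption: spaces and punctuation are left unchanged.
            result.append(ch)
    return "".join(result)


print(caesar_decrypt("EHZDUH EUXWXV", 3))
# BEWARE BRUTUS
```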
Cracking the cipher
Imagine that a very literate and savvy enemy intercepts one of Caesar's messages.
RZ VMZ WMDIBDIB VGG AJMXZN OJ EJDI RDOC XGZJKVOMV OJ YZAZVO OCZ ZIZHT LPZZI VO OCZ IDGZ YZGOV
That enemy does not know that Caesar always uses a shift of 3, so he must attempt to "crack" the cipher without knowing the shift.
There are three main techniques he could use: frequency analysis, known plaintext, and brute force.
Frequency analysis
Human languages tend to use some letters more than others. For example, "E" is the most common letter in the English language. We can analyze the frequency of the characters in the message, identify the most likely "E", and narrow down the possible shift amounts based on that.
Try it out yourself! Tally how often each letter appears in the message above and look for a possible "E".
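Counting letters by hand is tedious, so here is a rough Python sketch of the idea. To avoid spoiling the puzzle, it uses the shift-6 message from earlier in this article rather than the intercepted one. The function names are made up for this example, and the guess assumes that the most frequent letter stands for "E", which is often wrong for short messages.

```python
from collections import Counter


def letter_frequencies(text: str) -> Counter:
    """Count how many times each letter A-Z appears in the text."""
    return Counter(ch for ch in text.upper() if "A" <= ch <= "Z")


def guess_shift_from_e(ciphertext: str) -> int:
    """Guess the shift by assuming the most frequent letter is an encrypted E."""
    counts = letter_frequencies(ciphertext)
    if not counts:
        raise ValueError("ciphertext contains no letters A-Z")
    most_common_letter, _count = counts.most_common(1)[0]
    return (ord(most_common_letter) - ord("E")) % 26


ciphertext = "YKIXKZ SKKZOTM GZ ZNK VGRGIK"
print(letter_frequencies(ciphertext).most_common(3))
# [('K', 6), ('Z', 4), ('G', 3)]
print(guess_shift_from_e(ciphertext))
# 6
```

Here "K" shows up most often, and K is 6 letters after E, so a shift of 6 is the first guess to test. If that guess produced nonsense, the next candidates would be the other frequent letters.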
Known plaintext
Another term for the original unencrypted message is plaintext. If the enemy already knows some part of the plaintext, it will be easier for them to crack the rest of the encrypted version.
For example, messages tend to start with similar beginnings. In WWII, encrypted German messages often included predictable text such as a weather report, which ultimately made them easier for British codebreakers like mathematician Alan Turing to crack.
Do you think Julius started this message in a common way?
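For a Caesar Cipher, even one correctly guessed word is enough to recover the shift. The Python sketch below assumes we already know which stretch of ciphertext lines up with the known plaintext. It uses the BEWARE example from earlier rather than the intercepted message, and the function name is hypothetical.

```python
from typing import Optional


def shift_from_known_plaintext(known_plain: str, matching_cipher: str) -> Optional[int]:
    """Return the shift that turns known_plain into matching_cipher, or None.

    None means the letter pairs disagree, so the guess about the plaintext
    (or about where it lines up in the ciphertext) is wrong.
    """
    shifts = set()
    for p, c in zip(known_plain.upper(), matching_cipher.upper()):
        if "A" <= p <= "Z" and "A" <= c <= "Z":
            # Distance from the plaintext letter forward to the ciphertext letter.
            shifts.add((ord(c) - ord(p)) % 26)
    return shifts.pop() if len(shifts) == 1 else None


print(shift_from_known_plaintext("BEWARE", "EHZDUH"))
# 3
print(shift_from_known_plaintext("HELLO", "EHZDU"))
# None
```

Every letter pair in a Caesar Cipher must give the same shift, so a wrong guess about the plaintext usually shows up right away as disagreeing pairs.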
Brute force
There are only 25 possible shifts (not 26 — why not?). The enemy could take some time to try out each of them and find one that yielded a sensible message. They wouldn't even need to try the shifts on the entire message, just the first word or two.
Caesar's enemy wouldn't have a computer to help them, but it likely would take them less than an hour if they understood the idea of the Caesar Cipher.
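With a computer, trying every shift takes a fraction of a second. Here is a minimal Python sketch of a brute-force search, assuming only the letters A to Z were shifted. It uses the BEWARE BRUTUS ciphertext from earlier so that the intercepted message stays a puzzle, and the function names are invented for this illustration.

```python
from typing import Dict


def shift_back(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift on the letters A-Z, leaving other characters alone."""
    result = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            result.append(chr(ord("A") + (ord(ch) - ord("A") - shift) % 26))
        else:
            result.append(ch)
    return "".join(result)


def brute_force(ciphertext: str) -> Dict[int, str]:
    """Return the candidate plaintext for each of the 25 possible shifts."""
    return {shift: shift_back(ciphertext, shift) for shift in range(1, 26)}


for shift, candidate in brute_force("EHZDUH EUXWXV").items():
    print(f"{shift:2d}: {candidate}")
# The line for shift 3 reads "BEWARE BRUTUS"; the other 24 lines are gibberish.
```

The program only lists the candidates. A person still has to read through them and pick out the one that makes sense.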
Have you managed to crack the code and decrypt the message?
Encryption, decryption, and cracking
Thanks to this exploration of the Caesar Cipher, we now understand the three key aspects of data encryption:
- Encryption: scrambling the data according to a secret key (in this case, the alphabet shift).
- Decryption: recovering the original data from scrambled data by using the secret key.
- Code cracking: uncovering the original data without knowing the secret, by using a variety of clever techniques.
Whenever we consider a possible encryption technique, we need to think about all those aspects: How easy is it to encrypt? How easy is it to decrypt? And most importantly, how easy is it for a nefarious individual to crack the code?
We can no longer use the Caesar Cipher to secure our data, as it is far too easy to crack, but understanding the Cipher prepares us for understanding modern encryption techniques.
If you'd like, you can dive deeper into the Caesar Cipher in our Khan Academy tutorial on Ancient Cryptography.
Want to join the conversation?
- Jg zpv dbo gjhvsf uijt dpef pvu uifo zpv hfu up csbh(3 votes)
- "if you figure this code out then you get to brag"
Thanks:D(6 votes)
- Can you post the source codes for Frequency analysis and Brute force?(5 votes)
- Hey, Davos. You could use Google in the following manner: "frequency analysis" or "ciphertext brute force", followed by your preferred language, like Python.(1 vote)
- Z ezs qzs vzr gdqd (shifted by 25)(2 votes)
- Fjclq xdc oxa cqn qdwpah ljc(1 vote)
- I need a web proxy link.(2 votes)
- Is plaintext the words or language that people understand prior to encryption? Is it true to say that if a language remains unwritten, then it would be more difficult to decrypt? Are there any encryptions that are near impossible to crack? Does this mean that more complex encryptions are constantly being created by machines?(1 vote)
- 1. Yes, that is correct.
2. If a language is never written down, what are you writing the encrypted message in?
3. Yes, the One-Time Pad is an encryption scheme that is impossible to crack when it is used correctly. However, it has other limitations, which is why it is rarely used in practice. Generally, right now, we rely on AES and RSA, which have not yet been broken.
4. Not really. Creating effective encryptions requires a very good understanding of mathematics. So, it is up to humans to create encryptions.(1 vote)
- What are viruses(0 votes)
- I'm assuming you mean a computer virus. A virus is a malicious program on your computer, and viruses can do many things. Some can steal or destroy software, destroy files, or even make tons of pop-ups. They can do more.(4 votes)
- What is a frequency(0 votes)
- Frequency is the number of times that something occurs.
In the sentence "the quick brown fox jumped over the lazy dog", the frequency of the letter "q" is 1.
I hope this helps! =](3 votes)
- my question is why would you give us a "try it yourself" thing all you're doing is teaching kids how to decrypt messages illegally so I recommend that you be careful what you let people try... sorry that i had to say this but i want kids to grow up and be responsible and safe =(
=[(0 votes)
- Kids learn better with hands-on experience, so giving them a "try it yourself" helps them better understand breaking encryptions. Also, the Caesar cipher is no longer used for anything important, so teaching kids how to break it poses no threat. Even if a kid decided to illegally break encryptions, they wouldn't be able to actually break any that are currently being used.(7 votes)