Lesson 10: Data encryption techniques

# The Caesar cipher

The Caesar Cipher, used by Julius Caesar around 58 BC, is a substitution cipher that shifts letters in a message to make it unreadable if intercepted. To decrypt, the receiver reverses the shift. Arab mathematician Al-Kindi broke the Caesar Cipher using frequency analysis, which exploits patterns in letter frequencies. Created by Brit Cruise.

## Want to join the conversation?

• Why would Caesar use ciphers?
(28 votes)
• Caesar used ciphers so that important information, such as the location of an attack or the date it would be carried out, would be unknown to enemies but known to the rest of his troops. If his messages were ever intercepted, the enemy wouldn't immediately understand what the cipher meant.
(23 votes)
• Decoding the Caesar Cipher based on the "fingerprint" requires a large sample space. I mean, if the message contains only a few words, or even a single word, the frequency distribution won't help in that case.
(26 votes)
• To the original question, yes, shorter messages make it harder to detect the frequency distribution, but you'd be surprised how quickly it shows up.
To Skylear's comment: A Caesar Cipher does have a sample space. The random variable is the number used for the shift. In your example, you encoded JASON IS BLUE using a shift of 2, but 2 could have been 1 or 23 or 14. In fact, it could have been any number from 1 to 26 (a shift of 26 leaves the message unchanged, just like a shift of 0). So the sample space has 26 possibilities (there are 26 different ways to apply a Caesar cipher to the message).
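
Because the shift is the only secret, a sample space this small also means every possibility can simply be tried. Here is a minimal sketch in Python, assuming lowercase text; the helper names `shift_text` and `all_shifts` are made up for illustration:

```python
def shift_text(text, shift):
    """Shift lowercase letters by `shift` positions; leave other characters alone."""
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            # Move to 0..25, apply the shift, wrap with mod 26, move back to ASCII.
            out.append(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')))
        else:
            out.append(ch)
    return "".join(out)


def all_shifts(ciphertext):
    """Return the 26 candidate decryptions; index s holds the result of undoing shift s."""
    return [shift_text(ciphertext, -s) for s in range(26)]


# "jason is blue" encrypted with a shift of 2
candidates = all_shifts("lcuqp ku dnwg")
for s, guess in enumerate(candidates):
    print(s, guess)
```

Only one of the 26 printed lines reads as English, which is why a person can break the cipher by hand.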
(13 votes)
• At MEET is encrypted as PHHN. M is shifted "ahead" 3 letters to P, E is shifted "ahead" 3 letters to H (twice over), but then T is encrypted by shifting to the letter that PRECEDES it (by three)... why?
(5 votes)
• Because it's wrong; the 't' in "meet" should be encrypted to the same letter as the 't' in "at" or "elephant", which is W. The sequence should be 'p', 'h', 'h', 'w', 'p', 'h', 'd', 'w', 'h', 'o', 'h', 's', 'k', 'd', 'q', 'w', 'o', 'd', 'n', 'h'
(3 votes)
• Do we still use written ciphers today?
(2 votes)
• Nowadays written ciphers are almost only used by hobbyists. Computers have made much better modern ciphers both easier to use (to the point that most people don't even realize when they are being used) and more secure.
(6 votes)
• Does anyone still use the Caesar cipher?
(2 votes)
• Outside of simple code-breaking puzzles and encoded messages sent by children, not really. A human could easily brute-force break the Caesar cipher, so it would present no challenge at all to a computer. Modern encryption techniques (such as AES, RSA, ECC, and Twofish) are far more complex and harder to break.
(5 votes)
• Does anybody know how to make a Caesar cipher using Python?
(2 votes)
• It's been a long time since I've used Python but this should do the trick:

```python
def encrypt(messagetext, shift):
    # Shift every letter forward by `shift`, wrapping around after 'z'.
    ciphertext = ""
    mlength = len(messagetext)
    for i in range(0, mlength):
        oldchar = ord(messagetext[i]) - ord('a')
        newchar = (oldchar + shift) % 26 + ord('a')
        ciphertext += chr(newchar)
    return ciphertext


def decrypt(ciphertext, shift):
    # Undo the shift by moving every letter backward by `shift`.
    messagetext = ""
    clength = len(ciphertext)
    for i in range(0, clength):
        oldchar = ord(ciphertext[i]) - ord('a')
        newchar = (oldchar - shift) % 26 + ord('a')
        messagetext += chr(newchar)
    return messagetext


mymessage = "hello"
myshift = 3
mycipher = encrypt(mymessage, myshift)
mydecryptedcipher = decrypt(mycipher, myshift)
print("Original Message: ", mymessage)
print("Cipher Text: ", mycipher)
print("Decrypted Cipher: ", mydecryptedcipher)
```
Note that the functions assume the text is all lowercase letters (no spaces or punctuation).

Regardless of the language, the technique is essentially the same: iterate through the string letter by letter and:
- convert the character to an ASCII number
- find the numerical difference from 'a'
- add in the shift (or subtract the shift to decrypt)
- mod 26 to keep it in the 'a' to 'z' range
- add back in 'a' to get the right ASCII value for a letter
- convert the new number to an ASCII letter
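
The steps above can also be sketched so that capitals keep their case and spaces or punctuation pass through untouched. This is just one possible variant; the function name `caesar` is an assumption of mine:

```python
def caesar(text, shift):
    """Apply a Caesar shift to ASCII letters, preserving case and skipping everything else."""
    result = []
    for ch in text:
        if 'a' <= ch <= 'z':
            base = ord('a')
        elif 'A' <= ch <= 'Z':
            base = ord('A')
        else:
            # Not a letter: copy spaces, digits and punctuation unchanged.
            result.append(ch)
            continue
        offset = ord(ch) - base             # difference from 'a' (or 'A')
        shifted = (offset + shift) % 26     # add the shift and wrap around
        result.append(chr(shifted + base))  # back to an ASCII letter
    return "".join(result)


secret = caesar("Meet me at Elephant Lake!", 3)
print(secret)
print(caesar(secret, -3))  # a negative shift decrypts
```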

Hope this makes sense
(4 votes)
• How can one design a cipher for languages with logographic writing systems (those with thousands of characters, such as Chinese or hieroglyphics), considering that ciphers that rely on an ordered list of all letters may not be possible (such an ordering doesn't exist) or practical (the lookup table will be thousands of characters long, making the code impractical for the receiver to decrypt)?
(2 votes)
• Approach 1:
1) convert symbols to their phonetic equivalents e.g. for Japanese you could write the kanji symbols as hiragana (you see this a lot in old Japanese video games)
2) usually the number of phonetic chunks is < 100, so any look up table is manageable

Approach 2:
1) create a dictionary of all the symbols you want to use and then assign each one a number (this is basically how a computer displays kanji)
2) Apply a numeric operation to encrypt, and its reverse to decrypt e.g. a shift like a Caesar cipher, or something more complex
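
A rough sketch of Approach 2 in Python. The tiny `SYMBOLS` list is a made-up stand-in for a real dictionary of thousands of characters, and the function name `shift_symbols` is hypothetical:

```python
# Hypothetical symbol dictionary: each symbol's position in the list is its number.
SYMBOLS = ["日", "本", "語", "山", "川", "水", "火", "木"]
INDEX = {sym: i for i, sym in enumerate(SYMBOLS)}


def shift_symbols(text, shift):
    """Caesar-style shift over the symbol dictionary instead of the alphabet."""
    out = []
    for sym in text:
        if sym in INDEX:
            # Wrap around using the dictionary size rather than 26.
            out.append(SYMBOLS[(INDEX[sym] + shift) % len(SYMBOLS)])
        else:
            out.append(sym)  # symbols outside the dictionary are left as-is
    return "".join(out)


secret = shift_symbols("日本語", 3)
print(secret)
print(shift_symbols(secret, -3))  # reverse the operation to decrypt
```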
(2 votes)
• What if I used a shift of 46 or some other high number to encrypt my message?
How would that be deciphered?
(1 vote)
• All shifts will be a number from 0 to 25.
If one tries to use a shift greater than 25 it will wrap around back to 0 again.
If you use a shift X, the equivalent shift, from 0 to 25, can be calculated as:
X mod 26
or equivalently
the remainder when you divide X by 26
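
For the shift of 46 in the question, a quick check in Python (the helper names `effective_shift` and `shift_lower` are just for illustration, and the text is assumed to be lowercase letters only):

```python
def effective_shift(x):
    """Reduce any shift to the equivalent shift in the range 0..25."""
    return x % 26


def shift_lower(text, shift):
    """Caesar-shift a string of lowercase letters."""
    return "".join(chr((ord(ch) - ord('a') + shift) % 26 + ord('a')) for ch in text)


print(effective_shift(46))
print(shift_lower("hello", 46) == shift_lower("hello", 20))
```

So a message encrypted with 46 is deciphered exactly like one encrypted with 20.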

Hope this makes sense
(3 votes)
• What about making a code language with only a few letters (6, for example), where each word had exactly one of each letter (so every word is 6 letters long)? Wouldn't this erase the fingerprint (and make the brute force super easy)? Would it be a good idea for more secure ciphers?
(2 votes)
• That would erase the letter frequency fingerprint, which would make things tougher for a hacker. However, it doesn't completely defeat frequency analysis as we can look at the frequency of the words. Common words and word combinations will also produce a fingerprint.

A similar approach to trying to eliminate the letter frequency is to not use letters at all, and just use a code book where each word is represented by a number. A famous example of this is the Zimmermann Telegram: https://en.wikipedia.org/wiki/Zimmermann_Telegram
(1 vote)
• So normally when you make a code, you tell your sender the shift amount in a meeting before you actually start sending the messages. What if the interceptor somehow was able to intercept that? Won't you have to disguise that? And what if the interceptor knew about that? Wouldn't all this cause you to go into a cycle of crypticity, disguising everything you do? So is it really possible to disguise it? How do you disguise it? Does it have anything to do with "Journey into cryptography?"
(2 votes)
• You are exactly right. If your communication line is being tapped, you can't say "let's use an encryption key of H" because then the listener will know the key. It makes encryption completely useless.

However, this only applies to symmetric key ciphers, which means the same key is used to encrypt and decrypt messages. There are asymmetric key ciphers, where you have two keys. A message encrypted with one key can only be decrypted with the other. So you keep one key a secret to yourself (the "private key") and the other you share with the world (the "public key"). If someone wants to send you a message, they just encrypt it with your public key and send it. Only you, with your private key, can decrypt it. You don't have to meet up beforehand to share keys.

RSA is an example of an asymmetric key cipher.
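
To make the two-key idea concrete, here is a toy sketch of textbook RSA in Python. The primes are tiny and chosen only for illustration, the function names are mine, and real RSA uses keys thousands of bits long plus padding, so treat this as a demonstration and not something to actually use:

```python
# Toy textbook RSA. NOT secure: the primes are tiny and there is no padding.
p, q = 61, 53
n = p * q                # modulus, shared by both keys
phi = (p - 1) * (q - 1)  # used only while generating the keys
e = 17                   # public exponent, chosen to be coprime to phi
d = pow(e, -1, phi)      # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)      # shared with the world
private_key = (d, n)     # kept secret


def encrypt(m, key):
    """Encrypt the number m (0 <= m < n) with the public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)


def decrypt(c, key):
    """Decrypt the number c with the private key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)


message = 65
cipher = encrypt(message, public_key)
print(cipher, decrypt(cipher, private_key))
```

Anyone can run `encrypt` with the public key, but only the holder of `d` can undo it.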
(1 vote)

## Video transcript

SPEAKER 1: The first well known cipher, a substitution cipher, was used by Julius Caesar around 58 BC. It is now referred to as the Caesar Cipher. Caesar shifted each letter in his military commands in order to make them appear meaningless should the enemy intercept it. Imagine Alice and Bob decided to communicate using the Caesar Cipher. First, they would need to agree in advance on a shift to use-- say, three. So to encrypt her message, Alice would need to apply a shift of three to each letter in her original message. So A becomes D, B becomes E, C becomes F, and so on. This unreadable, or encrypted message, is then sent to Bob openly. Then Bob simply subtracts the shift of three from each letter in order to read the original message. Incredibly, this basic cipher was used by military leaders for hundreds of years after Caesar.

JULIUS CAESAR: I have fought and won. But I haven't conquered over man's spirit, which is indomitable.

SPEAKER 1: However, a lock is only as strong as its weakest point. A lock breaker may look for mechanical flaws. Or failing that, extract information in order to narrow down the correct combination. The process of lock breaking and code breaking are very similar. The weakness of the Caesar Cipher was published 800 years later by an Arab mathematician named Al-Kindi. He broke the Caesar Cipher by using a clue based on an important property of the language a message is written in. If you scan text from any book and count the frequency of each letter, you will find a fairly consistent pattern. For example, these are the letter frequencies of English. This can be thought of as a fingerprint of English. We leave this fingerprint when we communicate without realizing it. This clue is one of the most valuable tools for a codebreaker. To break this cipher, they count up the frequencies of each letter in the encrypted text and check how far the fingerprint has shifted. For example, if H is the most popular letter in the encrypted message instead of E, then the shift was likely three. So they reverse the shift in order to reveal the original message. This is called frequency analysis, and it was a blow to the security of the Caesar cipher.
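
The counting step described in the transcript can be sketched in a few lines of Python. This is a simplified version that only looks at the single most common letter and assumes it stands for E; a fuller attack would compare the whole distribution, and on short messages this guess can easily be wrong. The function names are my own:

```python
from collections import Counter


def guess_shift(ciphertext):
    """Guess the shift by assuming the most common ciphertext letter stands for 'e'."""
    letters = [ch for ch in ciphertext.lower() if 'a' <= ch <= 'z']
    most_common_letter, _count = Counter(letters).most_common(1)[0]
    return (ord(most_common_letter) - ord('e')) % 26


def decrypt(ciphertext, shift):
    """Reverse a Caesar shift on lowercase letters, leaving other characters alone."""
    out = []
    for ch in ciphertext:
        if 'a' <= ch <= 'z':
            out.append(chr((ord(ch) - ord('a') - shift) % 26 + ord('a')))
        else:
            out.append(ch)
    return "".join(out)


ciphertext = "phhw ph dw hohskdqw odnh"
shift = guess_shift(ciphertext)
print(shift, decrypt(ciphertext, shift))
```

Here H is the most frequent letter in the ciphertext, so the guessed shift is three, matching the example in the video.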