Main content

## Computer science

### Course: Computer science > Unit 2

Lesson 1: Ancient cryptography- What is cryptography?
- The Caesar cipher
- Caesar Cipher Exploration
- Frequency Fingerprint Exploration
- Polyalphabetic cipher
- Polyalphabetic Exploration
- The one-time pad
- Perfect Secrecy Exploration
- Frequency stability property short film
- How uniform are you?
- The Enigma encryption machine
- Perfect secrecy
- Pseudorandom number generators
- Random Walk Exploration

© 2023 Khan AcademyTerms of usePrivacy PolicyCookie Notice

# Pseudorandom number generators

Random vs. Pseudorandom Number Generators. Created by Brit Cruise.

## Want to join the conversation?

- In java programming if you are familiar, the random number generator built in with java generates a pseudo-random number. Why does Java not simply analyze one of the electrical ports on the computer and generate random numbers based on the static from that, like shown in the video? I feel like this would be much safer than simply using a pseudo-random number, because it would be easier to crack as a pseudo-random number as shown in the video, right?(30 votes)
- The sensors for measuring noise are not built into the computer. The computer senses much larger differences in noise, not the minuscule waves that you see in ports. And if we did, it would take up alot more CPU space than deemed acceptable/(5 votes)

- What's the difference between
**randomness**and**complexity**?

And, along the same lines, what makes the physical world**random**and not**complex**?(18 votes)- I have the same question. Isn't the world also deterministic if we take account of ALL factors that influence a certain situation? Like, we can considere each molecule. We can predict the result of rolling a dice after throwing it and before it stops if we consider all factors (air circulation and resistance, whirls and so on, I'm not a physicist :P), can't we?(17 votes)

- When you measure the "Randomness" of a pile of leaves and get a number, are you not getting randomness, but the sum of everything that happened to each of the leaves until they got to that point?(12 votes)
- Yes, things happened to the leaves to get them there, but we have no idea what happened to get them there so from our point of view it is random.(11 votes)

- If Eve knows the seed size and how many numbers constitute "the middle" of its square, couldn't she be able to find the initial seed (and thus the whole pseudorandom sequence) just by intercepting somehow the beginning of the pseudorandom sequence?(5 votes)
- From the author:Yes, if the seed size is small then many attacks such as these are easy. However if the seed size is 256 bits long, then there would be a massive number (on the scale of the size of the universe) to check.(16 votes)

- So what if you used a single large prime number? Would that increase the length of time it takes to see a repeating pattern?(4 votes)
- No, a prime would be the same as any other large number, because there is no need for multiplying to the seed. You only need to get the middle of the seed. No, I don't think it would, unless the middle of your number was a prime, then it might help a little bit.(4 votes)

- At4:52, how did the narrator calculate the reduced key space to be 10k different outcomes only given it was generated from a 4 digit seed?(4 votes)
- Each diget has 10 posibilities. 10x10x10x10=10000(1 vote)

- I was wondering if 10 sided dice numbered 0-9 rolled multiple times to generate the seed would work well? From what I understand 10 sided dice are a pentagonal trapezohedron with 2 sharp corners at the end. The odd numbers are usually labeled on one corner and the even labeled on the other.(3 votes)
- That would generate a number from 0 to 9,999,999,999. Depending on how many random numbers you want to generate and how frequently, this would come close to giving you a sequence of random numbers. Because you are computing the next random number from the last number, you would eventually repeat the sequence. Just as in the video, when you speed up the computations and plot them then a pattern emerges.(6 votes)

- what if you used an eternal number like Pi?(3 votes)
- That might work, except the problem is, you can't really square pi and have the computer remember infinite digits.(5 votes)

- Interesting discovery Alexander Strzalkowski, (look in comments) Can anyone explain it? Is it just a flaw?(3 votes)
- From the author:If you look closely, it actually does move but it has a very short period and loops around the origin back and fourth in a straight line. The red line quickly overlaps it so it's difficult to see this.(4 votes)

- This is cool, I am the first one to ask a question on this video

Why can't someone just measure TV static in order to generate a random number?(3 votes)- You can. However, as stahl.ej already mentioned, it would not be very practical.

The main reason for generating a long series of random or pseudorandom numbers is to encrypt data. You need a long series to avoid repetition which makes the encryption easier to break.

The difficulty is if you need to encrypt some data and send it to someone else in a way that will put it at risk of eavesdropping. If you generate a truly random series of numbers to use as the encryption key, then you need to send the entire series to you recipient. Also, you cannot simply send it as is or the eavesdropper with see both the encrypted data and the encryption key and be able to decrypt the data. This is why methods like public key encryption are used.

In public key encryption with the use of pseudorandom numbers, the relatively short seed can be securely shared (see the video on public key encryption) and then used to generate the exact same very long encryption key on both sides of the secure conversation.

So, you could generate a truly random encryption key from TV static, but it would have to be shared ahead of time.(5 votes)

## Video transcript

(fun music) (leaves rustling) - [Voiceover] When we
observe the physical world we find random fluctuations everywhere. We can generate truly random numbers by measuring random
fluctuations, known as noise. When we measure this
noise, known as sampling, we can obtain numbers. - [Voiceover] One, two, three, four-- - [Voiceover] For example, if we measure the electric current
of TV static over time, we will generate a truly random sequence. We can visualize this random sequence by drawing a path that changes direction according to each number,
known as a random walk. Notice the lack of pattern at all scales. At each point in the sequence the next move is always unpredictable. Random processes are said
to be nondeterministic, since they are impossible
to determine in advance. Machines, on the other
hand, are deterministic. Their operation is
predictable and repeatable. In 1946, John von Neumann was involved in running computations for the military; specifically involved in the
design of the hydrogen bomb. Using a computer called the ENIAC, he planned to repeatedly
calculate approximations of the processes involved
in nuclear fission. However, this required quick access to randomly generated numbers that could be repeated, if needed. However, the ENIAC had very
limited internal memory; storing long random
sequences was not possible. So, Neumann developed an algorithm to mechanically simulate
the scrambling aspect of randomness as follows: First, select a truly random
number, called the "seed". This number could come from
the measurement of noise, or the current time in milliseconds. Next, this seed is provided as input to a simple calculation. Multiply the seed by itself, and then output the middle of this result. Then you use this output as the next seed, and repeat the process
as many times as needed. This is known as the middle-squares method and is just the first in a long line of pseudorandom number generators. The randomness of the
sequence is dependent on the randomness of the initial seed only. Same seed, same sequence. So, what are the differences between a randomly generated versus pseudorandomly generated sequence? Let's represent each
sequence as a random walk. They seem similar until
we speed things up. The pseudorandom sequence
must eventually repeat. This occurs when the
algorithm reaches a seed it has previously used,
and the cycle repeats. The length, before a
pseudorandom sequence repeats, is called "the period". The period is strictly limited by the length of the initial seed. For example, if we use a two-digit seed, then an algorithm can produce, at most, 100 numbers, before reusing a seed and repeating the cycle. A three-digit seed can't
expand past 1,000 numbers before repeating its cycle, and a four-digit seed can't expand past 10,000 numbers before repeating. Though if we use a seed large enough, the sequence can expand into trillions and trillions
of digits before repeating. Though the key difference is important. When you generate numbers pseudorandomly, there are many sequences
which cannot occur. For example, if Alice generates
a truly random sequence of 20 shifts, it's equivalent
to a uniform selection from the stack of all
possible sequences of shifts. This stack contains 26
to the power of 20 pages, which is astronomical in size. If we stood at the bottom
and shined a light upwards, a person at the top
would not see the light for around 200,000,000 years. Compare this to Alice generating a 20 digit pseudorandom sequence, using a four-digit random seed. Now, this is equivalent
to a uniform selection from 10,000 possible initial seeds, meaning she can only generate 10,000 different sequences, which is a vanishingly small fraction of all possible sequences. When we move from random
to pseudorandom shifts, we shrink the key space into a much, much smaller seed-space. So, for a pseudorandom sequence
to be indistinguishable from a randomly generated sequence, it must be impractical for a computer to try all seeds and look for a match. This leads to an important distinction in computer science,
between what is possible, versus what is possible in
a reasonable amount of time. We use the same logic
when we buy a bike lock. We know that anybody can simply try all possible combinations, until they find a match and it opens. But it would take them days to do so. So, for eight hours we
assume it's practically safe. With pseudorandom generators, the security increases as the
length of the seed increases. If the most powerful computer would take hundreds of years to
run through all seeds, then we safely can assume
it's practically secure, instead of perfectly secure. As computers get faster the seed size must increase accordingly. Pseudorandomness frees Alice and Bob from having to share their entire random shift sequence in advance. Instead, they share a
relatively short random seed, and expand it into the same random-looking sequence when needed. But what happens if they can never meet to share this random seed?