If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Main content

Pseudorandom number generators

Random vs. Pseudorandom Number Generators. Created by Brit Cruise.

Want to join the conversation?

  • leaf blue style avatar for user Jackson
    In java programming if you are familiar, the random number generator built in with java generates a pseudo-random number. Why does Java not simply analyze one of the electrical ports on the computer and generate random numbers based on the static from that, like shown in the video? I feel like this would be much safer than simply using a pseudo-random number, because it would be easier to crack as a pseudo-random number as shown in the video, right?
    (30 votes)
    Default Khan Academy avatar avatar for user
    • mr pink red style avatar for user Varun
      The sensors for measuring noise are not built into the computer. The computer senses much larger differences in noise, not the minuscule waves that you see in ports. And if we did, it would take up alot more CPU space than deemed acceptable/
      (6 votes)
  • leaf green style avatar for user SteveSargentJr
    What's the difference between randomness and complexity?

    And, along the same lines, what makes the physical world random and not complex?
    (18 votes)
    Default Khan Academy avatar avatar for user
    • leaf green style avatar for user mitkel999
      I have the same question. Isn't the world also deterministic if we take account of ALL factors that influence a certain situation? Like, we can considere each molecule. We can predict the result of rolling a dice after throwing it and before it stops if we consider all factors (air circulation and resistance, whirls and so on, I'm not a physicist :P), can't we?
      (18 votes)
  • hopper cool style avatar for user Deven
    When you measure the "Randomness" of a pile of leaves and get a number, are you not getting randomness, but the sum of everything that happened to each of the leaves until they got to that point?
    (12 votes)
    Default Khan Academy avatar avatar for user
  • blobby green style avatar for user aftoeinaitoemailmou
    If Eve knows the seed size and how many numbers constitute "the middle" of its square, couldn't she be able to find the initial seed (and thus the whole pseudorandom sequence) just by intercepting somehow the beginning of the pseudorandom sequence?
    (5 votes)
    Default Khan Academy avatar avatar for user
  • male robot hal style avatar for user ben m
    So what if you used a single large prime number? Would that increase the length of time it takes to see a repeating pattern?
    (4 votes)
    Default Khan Academy avatar avatar for user
    • mr pink red style avatar for user Varun
      No, a prime would be the same as any other large number, because there is no need for multiplying to the seed. You only need to get the middle of the seed. No, I don't think it would, unless the middle of your number was a prime, then it might help a little bit.
      (4 votes)
  • mr pink red style avatar for user Nishant Gupta
    At , how did the narrator calculate the reduced key space to be 10k different outcomes only given it was generated from a 4 digit seed?
    (4 votes)
    Default Khan Academy avatar avatar for user
  • leafers tree style avatar for user ウミスズメ
    I was wondering if 10 sided dice numbered 0-9 rolled multiple times to generate the seed would work well? From what I understand 10 sided dice are a pentagonal trapezohedron with 2 sharp corners at the end. The odd numbers are usually labeled on one corner and the even labeled on the other.
    (3 votes)
    Default Khan Academy avatar avatar for user
    • orange juice squid orange style avatar for user Troy Cook
      That would generate a number from 0 to 9,999,999,999. Depending on how many random numbers you want to generate and how frequently, this would come close to giving you a sequence of random numbers. Because you are computing the next random number from the last number, you would eventually repeat the sequence. Just as in the video, when you speed up the computations and plot them then a pattern emerges.
      (6 votes)
  • piceratops ultimate style avatar for user JerChri13
    what if you used an eternal number like Pi?
    (3 votes)
    Default Khan Academy avatar avatar for user
  • piceratops ultimate style avatar for user BreanSlayer
    Interesting discovery Alexander Strzalkowski, (look in comments) Can anyone explain it? Is it just a flaw?
    (3 votes)
    Default Khan Academy avatar avatar for user
  • piceratops ultimate style avatar for user AegonTargaryen
    This is cool, I am the first one to ask a question on this video

    Why can't someone just measure TV static in order to generate a random number?
    (3 votes)
    Default Khan Academy avatar avatar for user
    • male robot hal style avatar for user Benjamin Cuningham
      You can. However, as stahl.ej already mentioned, it would not be very practical.

      The main reason for generating a long series of random or pseudorandom numbers is to encrypt data. You need a long series to avoid repetition which makes the encryption easier to break.

      The difficulty is if you need to encrypt some data and send it to someone else in a way that will put it at risk of eavesdropping. If you generate a truly random series of numbers to use as the encryption key, then you need to send the entire series to you recipient. Also, you cannot simply send it as is or the eavesdropper with see both the encrypted data and the encryption key and be able to decrypt the data. This is why methods like public key encryption are used.

      In public key encryption with the use of pseudorandom numbers, the relatively short seed can be securely shared (see the video on public key encryption) and then used to generate the exact same very long encryption key on both sides of the secure conversation.

      So, you could generate a truly random encryption key from TV static, but it would have to be shared ahead of time.
      (5 votes)

Video transcript

(fun music) (leaves rustling) - [Voiceover] When we observe the physical world we find random fluctuations everywhere. We can generate truly random numbers by measuring random fluctuations, known as noise. When we measure this noise, known as sampling, we can obtain numbers. - [Voiceover] One, two, three, four-- - [Voiceover] For example, if we measure the electric current of TV static over time, we will generate a truly random sequence. We can visualize this random sequence by drawing a path that changes direction according to each number, known as a random walk. Notice the lack of pattern at all scales. At each point in the sequence the next move is always unpredictable. Random processes are said to be nondeterministic, since they are impossible to determine in advance. Machines, on the other hand, are deterministic. Their operation is predictable and repeatable. In 1946, John von Neumann was involved in running computations for the military; specifically involved in the design of the hydrogen bomb. Using a computer called the ENIAC, he planned to repeatedly calculate approximations of the processes involved in nuclear fission. However, this required quick access to randomly generated numbers that could be repeated, if needed. However, the ENIAC had very limited internal memory; storing long random sequences was not possible. So, Neumann developed an algorithm to mechanically simulate the scrambling aspect of randomness as follows: First, select a truly random number, called the "seed". This number could come from the measurement of noise, or the current time in milliseconds. Next, this seed is provided as input to a simple calculation. Multiply the seed by itself, and then output the middle of this result. Then you use this output as the next seed, and repeat the process as many times as needed. This is known as the middle-squares method and is just the first in a long line of pseudorandom number generators. The randomness of the sequence is dependent on the randomness of the initial seed only. Same seed, same sequence. So, what are the differences between a randomly generated versus pseudorandomly generated sequence? Let's represent each sequence as a random walk. They seem similar until we speed things up. The pseudorandom sequence must eventually repeat. This occurs when the algorithm reaches a seed it has previously used, and the cycle repeats. The length, before a pseudorandom sequence repeats, is called "the period". The period is strictly limited by the length of the initial seed. For example, if we use a two-digit seed, then an algorithm can produce, at most, 100 numbers, before reusing a seed and repeating the cycle. A three-digit seed can't expand past 1,000 numbers before repeating its cycle, and a four-digit seed can't expand past 10,000 numbers before repeating. Though if we use a seed large enough, the sequence can expand into trillions and trillions of digits before repeating. Though the key difference is important. When you generate numbers pseudorandomly, there are many sequences which cannot occur. For example, if Alice generates a truly random sequence of 20 shifts, it's equivalent to a uniform selection from the stack of all possible sequences of shifts. This stack contains 26 to the power of 20 pages, which is astronomical in size. If we stood at the bottom and shined a light upwards, a person at the top would not see the light for around 200,000,000 years. Compare this to Alice generating a 20 digit pseudorandom sequence, using a four-digit random seed. Now, this is equivalent to a uniform selection from 10,000 possible initial seeds, meaning she can only generate 10,000 different sequences, which is a vanishingly small fraction of all possible sequences. When we move from random to pseudorandom shifts, we shrink the key space into a much, much smaller seed-space. So, for a pseudorandom sequence to be indistinguishable from a randomly generated sequence, it must be impractical for a computer to try all seeds and look for a match. This leads to an important distinction in computer science, between what is possible, versus what is possible in a reasonable amount of time. We use the same logic when we buy a bike lock. We know that anybody can simply try all possible combinations, until they find a match and it opens. But it would take them days to do so. So, for eight hours we assume it's practically safe. With pseudorandom generators, the security increases as the length of the seed increases. If the most powerful computer would take hundreds of years to run through all seeds, then we safely can assume it's practically secure, instead of perfectly secure. As computers get faster the seed size must increase accordingly. Pseudorandomness frees Alice and Bob from having to share their entire random shift sequence in advance. Instead, they share a relatively short random seed, and expand it into the same random-looking sequence when needed. But what happens if they can never meet to share this random seed?