
Summary (what's next?)

Why is factorization hard, yet generating primes easy? Where do we go from here? Created by Brit Cruise.

Want to join the conversation?

  • Jóhann Bernhard Jóhannsson:
    Does it take the same amount of time for a computer to calculate 7 + 4 as 700 + 400?
    If not, how does it differ?
    (17 votes)
    • Emil Fihlman:
      It does take longer. One fundamental reason is that 7 and 4 can each be expressed in 3 bits, whereas 700 needs 10 bits and 400 needs 9.
      A longer stream of bits must pass through the transistors to perform the calculation.
      In practice, though, the difference is insignificant.
      (3 votes)
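A quick way to see the size difference Emil mentions (a minimal Python sketch; `bit_length` is the standard integer method):

```python
# Compare how many bits each operand needs.
for n in (7, 4, 700, 400):
    print(n, "->", n.bit_length(), "bits")

# 7 -> 3 bits, 4 -> 3 bits, 700 -> 10 bits, 400 -> 9 bits
```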
  • Brian Liu:
    Can't people find a prime by first finding another prime, n, and then computing (2^n) - 1?
    (0 votes)
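For context, numbers of the form 2^n - 1 with n prime are Mersenne numbers, and they are not always prime. A quick check (my own sketch; plain trial division is adequate at this scale):

```python
def is_prime(m):
    """Plain trial division; fine for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

for n in (2, 3, 5, 7, 11, 13):
    m = 2 ** n - 1
    print(n, m, is_prime(m))

# n = 11 gives 2047 = 23 * 89, so the construction can fail.
```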
  • Ryan Christopher:
    Instead of defining the step size using prime numbers, could we use other well-known sequences that may also work, such as the Fibonacci numbers or any sequence that meets Brown's Criterion?
    (5 votes)
  • Kent Green:
    A big thing in the news right now is that NASA, Google, and others are buying some of the world's first quantum computers, which process data in qubits instead of traditional bits. According to Wikipedia, a qubit can be in either of the classic binary states or in a superposition of both, which I've (perhaps incorrectly) imagined as a method of calculation involving levels of certainty rather than absolute values. Any ideas how this new technology might affect problems like finding primes and prime factorization?
    (2 votes)
  • EMC:
    What if you used the sieve of Eratosthenes WHILE generating primes? You would mark all numbers that are divisible up to a certain point; then, when you reach that point, act as if the number were prime and extend the point. Or store the primes found so far and do a burst of checks. I will write a program to see if it works.
    (2 votes)
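For what it's worth, the "store the primes and do a burst of checks" variant might look like this minimal Python sketch (my own, not EMC's program):

```python
def primes_up_to(limit):
    """Generate primes incrementally, testing each candidate only
    against the primes already found, up to its square root."""
    primes = []
    for candidate in range(2, limit + 1):
        is_p = True
        for p in primes:
            if p * p > candidate:
                break            # no prime factor can exist beyond sqrt
            if candidate % p == 0:
                is_p = False     # candidate is composite
                break
        if is_p:
            primes.append(candidate)
    return primes

print(primes_up_to(50))  # [2, 3, 5, 7, 11, 13, ..., 47]
```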
  • Gwendo1r:
    What is the largest prime number found so far?
    (1 vote)
  • Ben:
    So... I was thinking about this problem a bit, and I came up with something slightly more efficient, I think... but it still wasn't sufficient for the really large numbers. The first few steps were a mod 2 and a mod 5: the mod 2 rules out even numbers, and the mod 5 removes all numbers ending in 5. Together these stop all numbers ending in 0, 2, 4, 5, 6, and 8 from being checked more than necessary, eliminating 60% of candidates. That's only a slight improvement over checking odds alone, and it actually decreases performance slightly on the numbers that really matter.
    Instead of increasing the test value by 1, I increased it by 6 and tested value + 1 and value - 1, for the following reason: all prime numbers greater than 3 have the form 6n + 1 or 6n - 1. This is because every 6n, 6n + 2, and 6n + 4 is divisible by 2, and every 6n + 3 is divisible by 3. That leaves only 6n + 1 and 6n - 1, cutting the number of test values down to 2/6, or 1/3, of all values between 3 and the square root of N. This method still checks some values that are not prime, but it should reduce the work significantly compared with methods that check only primes but don't use a lookup table.
    I was also thinking... maybe there is a way to skip numbers while increasing the test value, trading a little correctness for speed (dropping the probability of being correct from 100% to something slightly less). And maybe time could be saved by using a precomputed array for the larger numbers, where prime density is lower, covering the range from the square root of the maximum input down to an acceptable memory limit, and then using 6n + 1 and 6n - 1 below that range. This would probably cut out a lot of unnecessary checking on larger numbers. Do any of these sound like viable ways to reduce the time?
    (2 votes)
    • brit cruise:
      Yes, you are on the right track! I was going to post a video on 'wheel factorization' (http://primes.utm.edu/glossary/page.php?sort=WheelFactorization). Basically, we first check divisibility by the first few primes (say 2, 3, 5), then imagine a wheel of size 2*3*5 = 30 rolling along the number line. The spokes of this wheel are all numbers relatively prime to 2, 3, and 5. Wherever a spoke lands, we treat that number as a possible prime and use it as the next trial divisor. We can make larger wheels; for example, starting with 2, 3, 5, 7 eliminates over 77% of the composites. Eventually larger wheels don't pay off due to memory (good ol' time-space tradeoff). A sketch of the 6n ± 1 version follows below.

      So these methods, while more efficient, won't solve our problem.
      (1 vote)
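A minimal sketch of the 6n ± 1 stepping Ben describes, which in Brit's terms is the wheel of size 2 * 3 = 6 (the function name is mine):

```python
import math

def is_prime_6k(n):
    """Trial division that only tests 2, 3, and numbers of the
    form 6k - 1 and 6k + 1, up to sqrt(n)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    limit = math.isqrt(n)
    k = 5                        # first 6k - 1 value
    while k <= limit:
        if n % k == 0 or n % (k + 2) == 0:
            return False
        k += 6                   # jump straight to the next 6k - 1
    return True

print([x for x in range(2, 60) if is_prime_6k(x)])
# [2, 3, 5, 7, 11, ..., 59]
```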
  • Richard:
    Any chance of getting videos on Pollard's p - 1 and rho algorithms? How about the Chinese Remainder Theorem? Also something on WEP and why it's so easy to break. Keep up this awesome series!
    (2 votes)
  • video_error:
    Why is calculating how many primes are in a certain space so hard? Why can't you just count, on the prime spiral, how many are in a certain space and double it to get the number you need?
    (2 votes)
  • pranav.kirsur:
    Is P = NP related to time complexity?
    (2 votes)
    • 13fnixon:
      Yes. The P in P = NP stands for polynomial time (NP, strictly, is "nondeterministic polynomial" time: problems whose solutions can be verified in polynomial time). Essentially, these are categories of algorithm running speed, and two of the broadest categories are polynomial time and non-polynomial time. Multiplying two numbers is a problem that can be solved in polynomial time, but there is no known factoring method that runs in polynomial time. If P = NP were proved true, it would mean that any problem whose answer can be checked in polynomial time (like multiplying factors back together) can also be solved in polynomial time (like factoring).
      (1 vote)
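To make the asymmetry concrete, here is a small step-counting sketch (my own illustration, not a formal complexity argument; the two numbers chosen are known primes near one million):

```python
def factor_steps(n):
    """Count the trial divisions needed to fully factor n."""
    steps, d = 0, 2
    while d * d <= n:
        steps += 1
        while n % d == 0:
            n //= d
        d += 1
    return steps

# Multiplying the two primes takes a handful of machine operations,
# but factoring their 12-digit product by trial division takes almost
# a million divisions, and the work grows ~10x for every 2 extra digits.
p, q = 999983, 1000003   # both prime
print(factor_steps(p * q))
```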

Video transcript

Voiceover: Congratulations. We've reached a branching point in our lesson. A few different ideas have been introduced, so it's important to orient ourselves before moving forward in various directions.

What motivated this lesson? Well, we learned about RSA encryption, and RSA depends on two things. One: prime factorization is difficult. If I multiply two big primes, P1 and P2, and give you N, I can feel secure knowing it will take you a long time, perhaps more than your lifetime, to find those primes. Two: RSA requires that generating those large primes is easy, meaning I can quickly generate a large prime.

Let's return to the first problem: the difficulty of prime factorization. What is it about prime factorization, or prime numbers themselves, that makes problems hard? To find out, we began with a core problem: given X, is X prime or composite? This is a primality test. We built some solutions to this problem, and in doing so we realized it was computationally expensive: there is no instant solution. As the input number grew, the amount of time, or the number of steps, an algorithm needed to find the solution also grew. And how much it grows we now understand better, because we can predict the search space using the prime number theorem.

The real issue can be thought of as a graph, where the vertical axis is the number of steps our algorithm takes (you can think of it as time), and the horizontal axis is the input size. As input size increases, so does time. The problem is that the shape of this curve is unacceptable: once N hit even 20 digits, we could no longer solve the problem in the worst case, and we can all agree that throwing input hundreds of digits long at our algorithm would not work. In our case it would crash. So it's practically impossible to check whether a large input is prime using our current strategies.

Now let's return to the first point, factorization. Based on our current understanding in this lesson, factorization has to be harder than a primality test; that is, more steps are involved in finding the prime factorization of some number than in just telling me whether that number is prime. Remember, factorization requires us to find all the prime factors of some input N, whereas a primality test only requires us to find one divisor. A nice reminder of this is that some users have actually turned these primality tests into prime factorization algorithms, simply by repeating the test after it finds its first divisor. The primality test becomes a subroutine inside the main factorization algorithm. So if we return to this graph, a factorization algorithm's curve would sit above the primality test line as input grows.

Realize, too, that in any situation we have a computational limit: the number of primitive steps we can calculate, which depends on our computing power and the amount of time we have. You can think of this as a horizontal line, or threshold, on the graph. Anything above that line is out of reach, not feasible to solve. In this lesson we were limited by the rover's fairly slow on-board computer, which is why we couldn't run primality tests on numbers with even 20 digits.
However, even if we had, say, 1,000 computers running for a year, that would simply push the horizontal line up to some other threshold. It would let us run tests on larger numbers, but as you can see, we would always hit some limit where the input is large enough that we can no longer solve the problem.

This leads to many interesting question paths, but I'll identify the two I'm going to explore next. First, so far I have not been very precise about these growth curves. It would be helpful if, for any algorithm you give me, no matter what it's trying to solve and no matter what machine it's running on, we could draw its corresponding growth curve on this graph simply by looking at the description of the algorithm. If we could do this, we could identify categories of growth shapes that tell us how difficult a problem is to solve. In this way we could understand an algorithm before we even run it, which is very important to think about. You could hand me your algorithm written on a piece of paper, and I could look at it and say, "I know this algorithm cannot handle the input size needed." This leads us to a lesson on time complexity, an incredibly important conceptual tool we will need. If you've ever heard that something "runs in polynomial time" or "runs in exponential time," those are terms that come out of time complexity.

Next, many of you have been wondering, "Well, how are we generating these random primes in practice?", the second point I made about RSA. Let's walk through it quickly. To generate a large random prime number hundreds of digits long, we follow these instructions: generate a random number; test whether it's prime; if it's prime, we're done; if not, try again until we hit a prime. In this three-step procedure, the second step, testing whether the number is prime, is what matters most: if we can't do that step, this won't work. So how is this done today? It's based on a subtle alteration to our problem definition and, more importantly, on the use of randomness. And this leads us to another lesson, on random algorithms.

Finally, if you have any other lingering question paths, please post them below and we can plan lessons. For example, there is some deeper mathematics we have yet to explore in order to speed up our existing trial division primality test. What else are you thinking of? Please share below.
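As an illustration of the three-step procedure described in the transcript, here is a minimal Python sketch. It assumes a Miller-Rabin style randomized primality test, in the spirit of the random algorithms lesson mentioned above; the function names and the choice of test are mine, not the video's.

```python
import random

def is_probably_prime(n, rounds=40):
    """Miller-Rabin randomized primality test (probabilistic)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^r * d with d odd.
    r, d = 0, n - 1
    while d % 2 == 0:
        r += 1
        d //= 2
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random witness candidate
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # definitely composite
    return True                          # prime with very high probability

def random_prime(digits):
    """Step 1: generate a random number. Step 2: test whether it's
    prime. Step 3: if it isn't, try again until we hit a prime."""
    while True:
        candidate = random.randrange(10 ** (digits - 1), 10 ** digits)
        if is_probably_prime(candidate):
            return candidate

print(random_prime(100))  # a 100-digit (very probable) prime
```

Because roughly 1 in 230 random 100-digit numbers is prime (by the prime number theorem), the loop only needs a few hundred attempts on average, and most composites are rejected almost instantly.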