## Computer science theory

### Course: Computer science theory > Unit 2

Lesson 6: Primality test
- Introduction
- Primality test challenge
- Trial division
- What is computer memory?
- Algorithmic efficiency
- Level 3: Challenge
- Sieve of Eratosthenes
- Level 4: Sieve of Eratosthenes
- Primality test with sieve
- Level 5: Trial division using sieve
- The prime number theorem
- Prime density spiral
- Prime Gaps
- Time space tradeoff
- Summary (what's next?)

© 2023 Khan Academy

# Primality test with sieve

An attempt at an optimal trial division primality test using the Sieve of Eratosthenes. Created by Brit Cruise.

## Want to join the conversation?

- How can there be a memory limitation to future hardware? Won't computers just keep on getting more advanced as time passes?(24 votes)
- If, eventually, I were fortunate enough to store bits on individual electron spins, I would still need some atoms to hold the bits. It gets difficult to store more than a few, or a few dozen, bits per atom, so I could estimate my number of bits as a small linear multiple of the number of atoms I used. I would be unable to store more information than I could promptly reach in my ideal shoebox full of carefully coded atoms. Huge amounts of information would require large blocks of matter: a list of all primes up to 100,000 digits could exceed the mass of a small star. There is not much hope that humans will get more clever than we have dreamed here. That is the future memory limitation.(22 votes)
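The scale of that limitation can be sketched with the prime number theorem, which says the count of primes below N is roughly N / ln N. Here is a small Python estimate (my own illustration, not part of the answer above; the bits-per-prime figure assumes a plain binary encoding):

```python
import math

def log10_prime_count(digits):
    """log10 of the approximate number of primes below 10**digits,
    using the prime number theorem: pi(N) ~ N / ln(N)."""
    ln_n = digits * math.log(10)        # ln(10**digits)
    return digits - math.log10(ln_n)    # log10(N / ln N)

def log10_storage_bits(digits):
    """log10 of the bits needed to store every such prime, at roughly
    digits * log2(10) bits per prime."""
    return log10_prime_count(digits) + math.log10(digits * math.log2(10))
```

For 100-digit numbers this estimates about 10^97.6 primes, needing on the order of 10^100 bits, while the observable universe holds only about 10^80 atoms.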

- What is n in the video, and where can I get more help?(3 votes)
- n is just the upper limit. If we are looking for primes up to 100, then n = 100.(2 votes)

- Are you saying you can't use machine learning?(2 votes)
- Machine learning would not be an effective technique for finding primes. If you trained it on a large set of numbers, you would, in all likelihood, end up with it learning all the primes below 100 and then guessing that all numbers > 100 that aren't divisible by 2, 3, 5, 7, 11, or 13 are prime, as that would be true a very high percentage of the time. So you'd get something that usually guessed correctly, but there would also be infinitely many numbers it would guess incorrectly.

Of course, we could achieve that same result in a few lines of code, and skip the lengthy training period for the machine learning.(2 votes)

- How many primes can one Petabyte hold?(2 votes)
- Can negative numbers be prime or composite, or can only integers be prime or composite?(1 vote)
- By definition, only integers > 1 may be prime or composite.(3 votes)

- What about the Fermat primality test? I hope you can answer my question.(1 vote)
- The Fermat primality test can also be used to check if a number is prime; however, it is a probabilistic test (unlike the number sieves). This means there is a chance that the test is incorrect and might identify a composite number as prime.(2 votes)
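To make the answer concrete, here is a minimal Python sketch of the Fermat test (my own illustration; the function name and round count are arbitrary choices, not from the source):

```python
import random

def fermat_probably_prime(n, rounds=20):
    """Fermat test: a composite n usually fails a**(n-1) % n == 1 for a
    random base a. Passing every round only means 'probably prime':
    Carmichael numbers such as 561 fool the test for every coprime base."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:   # fast modular exponentiation
            return False            # definitely composite
    return True                     # probably prime
```

This is why the video's sieve-based approaches are called deterministic: they never misclassify, whereas the Fermat test trades certainty for speed.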

- Isn't there an infinite number of primes? When and how will we get the technology to store all these numbers?(1 vote)
- Yes, there are infinitely many primes, so we can't store all of them at once.(2 votes)

- If 1 is not a prime, is it a "neutral" number?(1 vote)
- 1 is neither a prime nor a composite number; it is a unit!

Every whole number is either: 1. the identity element (zero), 2. a unit (the units are ±1), or 3. a unit times a product of prime powers, unit * ∏ p_i^α_i (where the p_i are primes). For example, -9 = -1 * 3^2.(2 votes)
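The decomposition in that answer can be sketched in a few lines of Python (my own illustration; the helper name is made up):

```python
def unit_prime_factorization(n):
    """Write a nonzero integer as unit * product of prime powers,
    e.g. -9 -> (-1, {3: 2}), meaning -1 * 3**2."""
    unit = -1 if n < 0 else 1
    n = abs(n)
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:               # divide out each prime completely
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:                           # leftover factor is itself prime
        factors[n] = factors.get(n, 0) + 1
    return unit, factors
```

Note that 1 comes back as (1, {}): a unit times an empty product, matching its special status.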

- What does the square root of N have to do with computers?(1 vote)
- It has nothing to do with computers, and everything to do with numbers. Every composite number x must have a prime factor less than or equal to sqrt(x). I have a video explaining why here: http://www.khanacademy.org/cs/level-1-primality-test/1018672065

(press play in the code window)(2 votes)
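That square-root bound can be demonstrated with a short Python sketch (mine, not the linked video's code). If n = a * b with both factors greater than sqrt(n), then a * b > n, a contradiction, so the smallest factor is always at most sqrt(n):

```python
def smallest_prime_factor(n):
    """Smallest prime factor of an integer n >= 2, via trial division.
    The first divisor found is automatically prime, since any composite
    divisor would have a smaller prime divisor found earlier."""
    d = 2
    while d * d <= n:        # only need to search up to sqrt(n)
        if n % d == 0:
            return d
        d += 1
    return n                 # no divisor up to sqrt(n): n itself is prime
```

This is exactly why trial division, in the video above, only walks up to the square root of N.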

## Video transcript

Voiceover: Now, to recap: we are checking whether some number N is prime by doing a trial division. Here is the square root of N, and here is three. Starting at three, we hop along by two up until the square root of N, at each point along the way checking whether that point divides N. So far, people have been trying to reduce the number of steps we take, perhaps by starting later and taking larger steps. I just want to pause here and think: what is the ideal case for a trial division algorithm? What is the best we could possibly do if we got very creative in our stepping? Remember, any number N has some prime factorization. Let's say the square root of N is here. We actually only need to step on prime numbers. That would be the best we could do. We know that if we step only on primes, we will eventually find a prime factor if N is composite. The question now is: how efficient is this method?

It seems like we have a perfect solution if we write a new algorithm which first calls the sieve. Let's say the new algorithm is calculating whether N is prime. It calls the sieve and generates a nice long list of primes for us. Then we have our trial division, which uses this list of primes: it hops along and hits only primes up until the square root of N, wherever that is. What's wrong with this? We can visualize the time complexity, or the number of steps taken. Remember, I did so by coding up this algorithm and putting a step counter inside each loop; let's just say step++ means step plus one here. Inside this other loop there is also a step counter, step++. These are all constant operations, checking and marking. We just have a step counter inside each loop.

Now here is a comparison. On the far left is our old trial division method. In the middle is our algorithm calling the sieve to generate all primes up to N. On the right is the proposal where we call the sieve to generate primes only up to the square root of N, and then run trial division just on those primes. Let's see what happens with a small input. As we can see, initially the sieve takes many steps; even the modified version on the right is actually slower than trial division. As the input grows, the number of steps in the sieves grows even faster. Let's forget the middle one and compare trial division against the sieve up to the square root of N plus trial division. Here we can see the old trial division method is much more efficient: the number of steps in our sieve-to-the-square-root-of-N-plus-trial-division grows much faster. It is actually not an improvement. Below is the program I used to do this comparison; there is a recording explaining how I set it up.

Now you may be wondering, "Well, what if we calculated the primes in advance?" The first step would be to build an array of primes and store it on a hard drive. Then our algorithm would just do trial division, and it would know how to hop on primes only, because it would be reading from this proposed prime list. Perhaps our prime list stores all primes up to 20 digits, or even 100 digits. Why can't we do this? The problem is limitations in memory when we enumerate lists of numbers, which we will explore next.

Just for example, let's say we were doing this by hand. We calculate that five is prime, we write five on a piece of paper, and we store it in a filing cabinet. Then we get seven; we store that in the filing cabinet. Nine, or rather 11, into the filing cabinet. Eventually we have a filing cabinet full of prime numbers; think of it as a prime array. How big would this filing cabinet be if, say, we wanted all primes up to 20 digits, or all primes up to 100 digits long? Could we even store this array on a hard drive? To understand why this actually is not possible, we have to dive a little deeper into how large this prime array actually grows, and what the size limitations of modern-day, and even future, computer hardware are.