Main content
AP®︎/College Computer Science Principles
Course: AP®︎/College Computer Science Principles > Unit 4
Lesson 2: Evaluating algorithmsVerifying an algorithm
An important aspect of any algorithm is that it is correct: it always produces the expected output for the range of inputs and it eventually terminates.
As it turns out, it's difficult to prove that an algorithm is correct. Programmers often use empirical analysis to find faults in an algorithm, but only formal reasoning can prove total correctness.
Empirical analysis
An "empirical" analysis is one based on actual experimentation and observation of the results. In the world of algorithms, that means the algorithm must actually be translated into a programming language and executed on a computer.
Let's conduct an empirical analysis of an algorithm that finds the maximum value in a list of numbers.
Here's pseudocode that expresses that algorithm:
maxNum ← -1
FOR num IN numbers {
if (num > maxNum) {
maxNum ← num
}
}
Next, we'll translate that into the JavaScript language, since we can execute that in our interactive editor here on Khan Academy.
var maxNum = -1;
for (var i = 0; i < numbers.length; i++) {
if (numbers[i] > maxNum) {
maxNum = numbers[i];
}
}
Then we need to feed input into the algorithm and observe how it performs. For our first experiment, let's give it an array of 4 numbers,
[13, 4, 24, 7]
and see if it outputs the max, 24.Hooray, it worked! Can we declare this to be a perfectly correct algorithm and move on with life? I'm afraid it's not quite that easy...
It's time for experiment #2. This time, let's make all the numbers in the array negative,
[-13, -4, -24, -7]
. This time, the maximum value should be -4, the smallest negative number in the list.Uh-oh, the code outputted -1 instead of -4. That's because the initial value for
maxNum
is -1, and the loop never finds a value in the list greater than that. Our algorithm definitely does not work correctly for negative numbers.At this point, we need to modify our algorithm and conduct empirical analysis on the (hopefully) improved algorithm. Let's try out a version of the algorithm that initializes
maxNum
to the first number in the list:That works! Or at least, it works on a list of negative numbers. Does it still work on the list of positive numbers? What about a mixed list of positive and negative numbers? What about fractions? Or irrational numbers? There are so many possibilities to test!
We can do the testing more easily by wrapping our algorithm in a procedure, and using a testing library to assert that the output from the procedures matches what we expect. On Khan Academy,
Program.assertEqual(actual, expected)
is a simple testing procedure that displays an error whenever the actual output is not equal to the expected. Here's an empirical analysis on four different lists:
No errors there! The new algorithm is looking more correct than the old one. But is it truly correct? We don't actually know that for sure. In fact, we could do many more experiments and still not know.
Empirical analysis can only be used to prove that an implemented algorithm is not correct, by discovering inputs where the output is unexpected. However, it cannot prove that an algorithm is correct.
Formal reasoning
The only way to prove the correctness of an algorithm over all possible inputs is by reasoning formally or mathematically about it.
One form of reasoning is a "proof by induction", a technique that's also used by mathematicians to prove properties of numerical sequences.
📝 Exam tip: The AP CSP exam does not require an understanding of how to prove algorithms by induction. We walk through it here to give you a feel for what formal reasoning might look like, but the AP does not expect students to understand this advanced level of mathematic reasoning.
A metaphor can help with understanding induction. Imagine we have a line of a million dominos that are perfectly spaced out. How do we know that every domino will fall when we tip the first one over? We don't actually have to check each individual domino. We only have to prove that 1) the first domino will fall, and 2) tipping over any given domino will tip the next domino over. With just those two things proved, just like that, a million dominoes will fall over!
Now let's apply induction to an algorithm. Here's the pseudocode for an algorithm that computes the factorial of a positive integer:
PROCEDURE calcFactorial(n) {
factorial ← 1
i ← 1
REPEAT UNTIL (i > n) {
factorial ← factorial * i
i ← i + 1
}
RETURN factorial
}
The factorial of a number is the product of that number with all the numbers less than it, down to 1. For example, the factorial of , often written as , is .
Before we go down the route of proving this algorithm successfully computes , let's actually try it out for the of . If the algorithm works, it should return .
- The variables
factorial
andi
both start off at1
. - Since
i
(1
) is not greater thann
(4
), we enter the loop. - Iteration #1:
factorial
is set to1
(from1 * 1
) andi
increases to2
. - Iteration #2:
factorial
is set to2
(from1 * 2
) andi
increases to3
. - Iteration #3:
factorial
is set to6
(from2 * 3
) andi
increases to4
. - Iteration #4:
factorial
is set to24
(from6 * 4
) andi
increases to5
. - At this point,
i
(5
) is greater thann
(4
), so we exit the loop. - The procedure returns the value of
24
.
Great, we verified that the algorithm computes the correct result for a single integer.
Now let's prove that for all positive integers, the algorithm computes the factorial of the integer.
First we need to prove that the algorithm eventually terminates, as an algorithm can't be considered correct if it goes on forever. In this algorithm,
i
starts at 1
and increases by 1
until it becomes n + 1
. Thus, the algorithm always stops after n
loop repetitions.Next, to prove that this algorithm outputs the factorial, we will more specifically prove a "loop invariant", a property of the loop that should always be true. In this algorithm, after going through the loop
n
times, factorial
should equal n!
and i
should equal n + 1
. That was true in our walk-through of factorial(4)
, and now we'll attempt to prove that it is generally true for any positive integer.That requires proving 1) the base case, and 2) the induction hypothesis.
Base case: This is where we verify that the algorithm holds for the very first number in the range of possible inputs.
For this algorithm, we are proving it for all positive integers, so the base case is when
n
is 1
. According to our loop invariant, after going through the loop 1
times, factorial
should equal 1!
(1
) and i
should equal 1 + 1
(2
).We can walk-through our algorithm for
calcFactorial(1)
, similar to how we did for the number 4
:- The variables
factorial
andi
both start off at1
. - Since
i
(1
) is not greater thann
(1
), the algorithm enters the loop. - Iteration #1:
factorial
is set to1
(from1 * 1
) andi
increases to2
. - At this point,
i
(2
) is greater thann
(1
), so the algorithm exits the loop.
Our loop invariant holds:
factorial
stores 1
and i
stores 2
.With the base case proved, let's move on!
Induction step: This is where we show that if it works for any arbitrary number, it also works for the number right after it.
We start with the inductive hypothesis: an assumption that the loop invariant is true for some positive integer
k
. After going through the loop k
times, factorial
should equal k!
and i
should equal k + 1
.Starting from that assumption, we will prove that the loop invariant is also true for
k + 1
, the number just after k
. After going through the loop k + 1
times, factorial
should equal (k + 1)!
and i
should equal (k + 1) + 1
.To do that, let's walk-through
calcFactorial(k + 1)
. We can fast-forward through the first k
repetitions, thanks to our inductive hypothesis.- After
k
repetitions,factorial
storesk!
andi
storesk + 1
. - Iteration #
k+1
:factorial
is set tok! * (k + 1)
andi
increases tok + 2
. - At this point,
i
(whose value isk + 2
) is greater thann
(whose values isk + 1
), so the algorithm exits the loop.
Did the loop invariant hold true? Yes, it did! The variable
factorial
stores k! * (k + 1)
, which is equivalent to (k + 1)!
and the variable i
stores k + 2
, equivalent to (k + 1) + 1
. We can confidently state that the loop invariant is true for all positive integers
k
. Since we showed earlier that the loop stops after
n
repetitions, then calcFactorial(n)
always returns n!
. Our algorithm is correct, since it both terminates and produces the correct answer when it terminates.Proof by induction is a technique that works well for algorithms that loop over integers, and can prove that an algorithm always produces correct output. Other styles of proofs can verify correctness for other types of algorithms, like proof by contradiction or proof by exhaustion.
There are definitely drawbacks to this level of formal reasoning: first, most computer programmers lack the mathematical background to verify with proofs, and secondly, the proof is made outside of the code, so the implementation of the algorithm could diverge from the proved version of the algorithm.
The most popular formal technique for writing correct code is to use a programming languages built specifically with provability as a goal. Cloud computing companies like Amazon and Microsoft use verifiable languages for their critical infrastructure, since they can't afford to be brought down by a bug in their algorithms.
Realistically, the majority of software is verified with empirical analysis. That's partially due to the fact that most programmers lack the theoretical background to prove the correctness of algorithms. But it's also due to the ease of empirical analysis and the fact that a well-thought out suite of tests can prove that an algorithm is almost certainly correct-- and that's often good enough.
🙋🏽🙋🏻♀️🙋🏿♂️Do you have any questions about this topic? We'd love to answer—just ask in the questions area below!
Want to join the conversation?
- What programming languages are specificaly built for provability?(7 votes)
- Functional programming languages such as Haskell. One reason they better lend themselves to proofs is due to the immutability (variables are constant) within the language.(2 votes)
- why did they have to make it so confusing(5 votes)
- The topic of proving the correctness of algorithms can be complex and confusing because it involves theoretical concepts and mathematical reasoning. The complexity arises from the fact that algorithms can have different behaviors and outcomes depending on the input data, making it challenging to ensure they always work as expected.
The reason behind the need for proving the correctness of algorithms is to ensure that they consistently produce the expected output for various input scenarios. This is crucial because incorrect algorithms can lead to errors, bugs, and incorrect results in software applications.
Empirical analysis, which involves testing the algorithm with different inputs and observing the results, is a common approach used by programmers to identify faults and errors in algorithms. However, empirical analysis alone cannot guarantee that an algorithm is correct for all possible inputs.
Formal reasoning, such as proof by induction, is a more rigorous approach to prove the correctness of algorithms. It involves logical arguments and mathematical proofs to demonstrate that an algorithm will always produce the correct output for any possible input. While this approach provides stronger guarantees, it requires a deep understanding of mathematics and formal reasoning, which may not be feasible or necessary for all situations.
In reality, most software development relies on empirical analysis because it is more accessible and practical for most programmers. By thoroughly testing algorithms with a variety of inputs, programmers can gain confidence in their correctness. However, it is important to note that empirical analysis cannot provide absolute certainty, as it is impossible to test all possible inputs.(1 vote)
- Can someone please simplify the explanation on how formal analysis works because it's a bit confusing.(3 votes)
- We need to use math and formal logic to prove an algorithm works correctly. A common proof technique is called "induction" (or "proof by loop invariant" when talking about algorithms). Induction works by showing that if a statement is true given an input, it must also be true for the next largest input. (There are actually two different types of induction; this type is called "weak induction".)
When we need to prove an algorithm is correct, we can show that if it works for some input, then it must also work for a larger input. Then, we show that there is a specific example of input that the algorithm works on.
For example, suppose we want to show that a function, MERGE-SORT, will correctly sort a list of numbers. We would prove that if MERGE-SORT sorts a list of n numbers, then it can sort a list of n+1 numbers. After that, we show that MERGE-SORT can sort a list of 1 number. Since it can sort a list of 1 number, it must be able to sort a list with 2 numbers. If we know that it can sort a list with 2 numbers, then it can also sort a list with 3 numbers. (We can repeat this process any number of times.) So, MERGE-SORT can sort a list with any amount of numbers.(3 votes)
- Btw, is verifying an algorithm important?(2 votes)
- Yes, it is important to make sure the algorithm works as intended. If you come up with a new algorithm, but there is some case you didn't consider, it could give an incorrect result which could have serious real-world consequences. One famous example would be the Therac-25 incident. Therac-25 was used to treat cancer with radiation, but due to flaws in the software (and poor overall design), it ended up giving massive amounts of radiation that killed multiple people it was meant to help.(2 votes)
- Can anyone explain me this this is confusing(2 votes)
- An important aspect of an algorithm is that it produces the expected output for a range of inputs and eventually terminates. However, proving that an algorithm is correct can be difficult. Programmers often use empirical analysis, which involves experimenting with the algorithm by translating it into code and observing the results. This analysis can help find faults in the algorithm, but it cannot prove total correctness.
For example, let's consider an algorithm that finds the maximum value in a list of numbers. We can translate it into code and test it with different inputs to see if it produces the expected output. However, empirical analysis can only prove that an implemented algorithm is not correct by discovering inputs where the output is unexpected. It cannot prove that the algorithm is correct for all possible inputs.
To prove the correctness of an algorithm over all possible inputs, we need to use formal reasoning or mathematical techniques. One such technique is proof by induction, where we prove that the algorithm works for a base case and then show that if it works for one case, it also works for the next case. This technique is used by mathematicians to prove properties of numerical sequences.
However, it's important to note that proof by induction is not required for the AP CSP exam. The exam does not expect students to have a deep understanding of formal reasoning. Empirical analysis is generally sufficient to demonstrate the correctness of an algorithm for the exam.
In conclusion, while formal reasoning can prove the correctness of an algorithm for all possible inputs, it can be challenging and requires mathematical expertise. Empirical analysis is more commonly used in practice and can provide strong evidence of correctness, especially when combined with a well-designed set of tests.(1 vote)
- Whuff! It’s getting serious business now. When v started with a simple example of finding the maxNum, the pseudo code appears to be describing a 'for each' loop and the JS script shows a 'for' loop. Am I right?
'FOR num IN numbers' compare with 'for (var i = 0; i < numbers.length; i++)'.
And just curious, wasn’t 'max of' a better option to find the highest number in a list?(3 votes)- Hmm the idea isn't bad, but the pseudo code just shows a for loop. Talking about a for each loop only really makes sense in regards to actual code, as it's a variant of the for loop.
At the beginning it's mostly about getting an idea how to code. Writing code that can be easily performed by inbuilt methods is part of that.
But as soon as you got the idea, then yes max of methods are the way to go.(0 votes)
- What are the programming languages?(1 vote)
- I have a question concerning the "the proof by induction":
* Q#1: In the metaphor given, why is it translated into pseudocode using factorial? since it could be true, but it doesn't mean that the all dominos will fall over one another.
and thank you alot!!(1 vote)- I'm not sure I understand correctly, but the basic idea of induction is to prove a starting point and then show that your claim is also true for the next and so on.
Say 2^n > n^2 for all n >= 5 with n being natural numbers (5, 6, 7, ...)
So you show the base case n = 5
2^5 = 32 > 25 = 5^2
So that's true, now you abstract. Instead of working with 5 you work with n and show that the inequality holds for (n + 1).
2^(n + 1) = 2 * 2^n > 2 * n^2 = n^2 + n^2 > n^2 + 2n + 1 = (n + 1)^2
In the second step we used something we showed at the start (2^n > n^2) and in the fifth step we use n^2 > 2n + 1 for n >= 3 (you can show that with induction as well).
See we showed that the claim worked for a n >= 5 and we showed that it works for (n + 1).
So we know it holds for n = 5 and 6, because (n + 1) is true and it's true for 7 because it's true for n = 6 and (n + 1) is true and so on.
Induction is a very powerful tool, but it takes a bit to understand.(1 vote)
- Can someone explain how we came to this conclusion?
At this point, i (whose value is k + 2) is greater than n (whose values is k + 1), so the algorithm exits the loop.
How is:n = k+1
?
Not sure what I missed.(1 vote) - Could I please have an example of when an algorithm is found to be not true using proof by induction? Additionally, when the article mentions that some companies use "verifiable languages," (this is mentioned near the end of the article) does that mean that a software verifies the code itself?(1 vote)
- If either the base case or recursive step fails, the algorithm is generally incorrect. Here is an example of a case in which the base case proof step fails.
https://en.wikipedia.org/wiki/All_horses_are_the_same_color
And yes, just like humans perform manual reasoning to verify an algorithm, software can perform automated reasoning (to some extend) on algorithms via techniques like proof by induction. The field in computer science is called "formal reasoning"(0 votes)