If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains *.kastatic.org and *.kasandbox.org are unblocked.

Main content

Verifying an algorithm

An important aspect of any algorithm is that it is correct: it always produces the expected output for the range of inputs and it eventually terminates.
As it turns out, it's difficult to prove that an algorithm is correct. Programmers often use empirical analysis to find faults in an algorithm, but only formal reasoning can prove total correctness.

Empirical analysis

An "empirical" analysis is one based on actual experimentation and observation of the results. In the world of algorithms, that means the algorithm must actually be translated into a programming language and executed on a computer.
Let's conduct an empirical analysis of an algorithm that finds the maximum value in a list of numbers.
Here's pseudocode that expresses that algorithm:
maxNum ← -1
FOR num IN numbers {
  if (num > maxNum) {
    maxNum ← num
  }
}
Next, we'll translate that into the JavaScript language, since we can execute that in our interactive editor here on Khan Academy.
var maxNum = -1;
for (var i = 0; i < numbers.length; i++) { 
  if (numbers[i] > maxNum) {
    maxNum = numbers[i];
  }
}
Then we need to feed input into the algorithm and observe how it performs. For our first experiment, let's give it an array of 4 numbers, [13, 4, 24, 7] and see if it outputs the max, 24.
Hooray, it worked! Can we declare this to be a perfectly correct algorithm and move on with life? I'm afraid it's not quite that easy...
It's time for experiment #2. This time, let's make all the numbers in the array negative, [-13, -4, -24, -7]. This time, the maximum value should be -4, the smallest negative number in the list.
Uh-oh, the code outputted -1 instead of -4. That's because the initial value for maxNum is -1, and the loop never finds a value in the list greater than that. Our algorithm definitely does not work correctly for negative numbers.
At this point, we need to modify our algorithm and conduct empirical analysis on the (hopefully) improved algorithm. Let's try out a version of the algorithm that initializes maxNum to the first number in the list:
That works! Or at least, it works on a list of negative numbers. Does it still work on the list of positive numbers? What about a mixed list of positive and negative numbers? What about fractions? Or irrational numbers? There are so many possibilities to test!
We can do the testing more easily by wrapping our algorithm in a procedure, and using a testing library to assert that the output from the procedures matches what we expect. On Khan Academy, Program.assertEqual(actual, expected) is a simple testing procedure that displays an error whenever the actual output is not equal to the expected.
Here's an empirical analysis on four different lists:
No errors there! The new algorithm is looking more correct than the old one. But is it truly correct? We don't actually know that for sure. In fact, we could do many more experiments and still not know.
Empirical analysis can only be used to prove that an implemented algorithm is not correct, by discovering inputs where the output is unexpected. However, it cannot prove that an algorithm is correct.

Formal reasoning

The only way to prove the correctness of an algorithm over all possible inputs is by reasoning formally or mathematically about it.
One form of reasoning is a "proof by induction", a technique that's also used by mathematicians to prove properties of numerical sequences.
📝 Exam tip: The AP CSP exam does not require an understanding of how to prove algorithms by induction. We walk through it here to give you a feel for what formal reasoning might look like, but the AP does not expect students to understand this advanced level of mathematic reasoning.
A metaphor can help with understanding induction. Imagine we have a line of a million dominos that are perfectly spaced out. How do we know that every domino will fall when we tip the first one over? We don't actually have to check each individual domino. We only have to prove that 1) the first domino will fall, and 2) tipping over any given domino will tip the next domino over. With just those two things proved, just like that, a million dominoes will fall over!
Now let's apply induction to an algorithm. Here's the pseudocode for an algorithm that computes the factorial of a positive integer:
PROCEDURE calcFactorial(n) {
    factorial ← 1
    i ← 1
    REPEAT UNTIL (i > n) {
        factorial ← factorial * i
        i ← i + 1
    }
    RETURN factorial
}
The factorial of a number is the product of that number with all the numbers less than it, down to 1. For example, the factorial of 4, often written as 4!, is 4×3×2×1=24.
Before we go down the route of proving this algorithm successfully computes n!, let's actually try it out for the n of 4. If the algorithm works, it should return 24.
  • The variables factorial and i both start off at 1.
  • Since i (1) is not greater than n (4), we enter the loop.
  • Iteration #1: factorial is set to 1 (from 1 * 1) and i increases to 2.
  • Iteration #2: factorial is set to 2 (from 1 * 2) and i increases to 3.
  • Iteration #3: factorial is set to 6 (from 2 * 3) and i increases to 4.
  • Iteration #4: factorial is set to 24 (from 6 * 4) and i increases to 5.
  • At this point, i (5) is greater than n (4), so we exit the loop.
  • The procedure returns the value of 24.
Great, we verified that the algorithm computes the correct result for a single integer.
Now let's prove that for all positive integers, the algorithm computes the factorial of the integer.
First we need to prove that the algorithm eventually terminates, as an algorithm can't be considered correct if it goes on forever. In this algorithm, i starts at 1 and increases by 1 until it becomes n + 1. Thus, the algorithm always stops after n loop repetitions.
Next, to prove that this algorithm outputs the factorial, we will more specifically prove a "loop invariant", a property of the loop that should always be true. In this algorithm, after going through the loop n times, factorial should equal n! and i should equal n + 1. That was true in our walk-through of factorial(4), and now we'll attempt to prove that it is generally true for any positive integer.
That requires proving 1) the base case, and 2) the induction hypothesis.
Base case: This is where we verify that the algorithm holds for the very first number in the range of possible inputs.
For this algorithm, we are proving it for all positive integers, so the base case is when n is 1. According to our loop invariant, after going through the loop 1 times, factorial should equal 1! (1) and i should equal 1 + 1 (2).
We can walk-through our algorithm for calcFactorial(1), similar to how we did for the number 4:
  • The variables factorial and i both start off at 1.
  • Since i (1) is not greater than n (1), the algorithm enters the loop.
  • Iteration #1: factorial is set to 1 (from 1 * 1) and i increases to 2.
  • At this point, i (2) is greater than n (1), so the algorithm exits the loop.
Our loop invariant holds: factorial stores 1 and i stores 2.
With the base case proved, let's move on!
Induction step: This is where we show that if it works for any arbitrary number, it also works for the number right after it.
We start with the inductive hypothesis: an assumption that the loop invariant is true for some positive integer k. After going through the loop k times, factorial should equal k! and i should equal k + 1.
Starting from that assumption, we will prove that the loop invariant is also true for k + 1, the number just after k. After going through the loop k + 1 times, factorial should equal (k + 1)! and i should equal (k + 1) + 1.
To do that, let's walk-through calcFactorial(k + 1). We can fast-forward through the first k repetitions, thanks to our inductive hypothesis.
  • After k repetitions, factorial stores k! and i stores k + 1.
  • Iteration #k+1: factorial is set to k! * (k + 1) and i increases to k + 2.
  • At this point, i (whose value is k + 2) is greater than n (whose values is k + 1), so the algorithm exits the loop.
Did the loop invariant hold true? Yes, it did! The variable factorial stores k! * (k + 1), which is equivalent to (k + 1)! and the variable i stores k + 2, equivalent to (k + 1) + 1.
We can confidently state that the loop invariant is true for all positive integers k.
Since we showed earlier that the loop stops after n repetitions, then calcFactorial(n) always returns n!. Our algorithm is correct, since it both terminates and produces the correct answer when it terminates.
Proof by induction is a technique that works well for algorithms that loop over integers, and can prove that an algorithm always produces correct output. Other styles of proofs can verify correctness for other types of algorithms, like proof by contradiction or proof by exhaustion.
There are definitely drawbacks to this level of formal reasoning: first, most computer programmers lack the mathematical background to verify with proofs, and secondly, the proof is made outside of the code, so the implementation of the algorithm could diverge from the proved version of the algorithm.
The most popular formal technique for writing correct code is to use a programming languages built specifically with provability as a goal. Cloud computing companies like Amazon and Microsoft use verifiable languages for their critical infrastructure, since they can't afford to be brought down by a bug in their algorithms.
Realistically, the majority of software is verified with empirical analysis. That's partially due to the fact that most programmers lack the theoretical background to prove the correctness of algorithms. But it's also due to the ease of empirical analysis and the fact that a well-thought out suite of tests can prove that an algorithm is almost certainly correct-- and that's often good enough.

🙋🏽🙋🏻‍♀️🙋🏿‍♂️Do you have any questions about this topic? We'd love to answer—just ask in the questions area below!

Want to join the conversation?

  • blobby green style avatar for user Itz$hre¥a$-_-!?
    why did they have to make it so confusing
    (11 votes)
    Default Khan Academy avatar avatar for user
    • primosaur ultimate style avatar for user Gonzalo Sandmeier
      The topic of proving the correctness of algorithms can be complex and confusing because it involves theoretical concepts and mathematical reasoning. The complexity arises from the fact that algorithms can have different behaviors and outcomes depending on the input data, making it challenging to ensure they always work as expected.

      The reason behind the need for proving the correctness of algorithms is to ensure that they consistently produce the expected output for various input scenarios. This is crucial because incorrect algorithms can lead to errors, bugs, and incorrect results in software applications.

      Empirical analysis, which involves testing the algorithm with different inputs and observing the results, is a common approach used by programmers to identify faults and errors in algorithms. However, empirical analysis alone cannot guarantee that an algorithm is correct for all possible inputs.

      Formal reasoning, such as proof by induction, is a more rigorous approach to prove the correctness of algorithms. It involves logical arguments and mathematical proofs to demonstrate that an algorithm will always produce the correct output for any possible input. While this approach provides stronger guarantees, it requires a deep understanding of mathematics and formal reasoning, which may not be feasible or necessary for all situations.

      In reality, most software development relies on empirical analysis because it is more accessible and practical for most programmers. By thoroughly testing algorithms with a variety of inputs, programmers can gain confidence in their correctness. However, it is important to note that empirical analysis cannot provide absolute certainty, as it is impossible to test all possible inputs.
      (7 votes)
  • mr pink red style avatar for user Lucas Hagemans
    What programming languages are specificaly built for provability?
    (8 votes)
    Default Khan Academy avatar avatar for user
  • winston default style avatar for user FurqanA27
    Can someone please simplify the explanation on how formal analysis works because it's a bit confusing.
    (4 votes)
    Default Khan Academy avatar avatar for user
    • starky ultimate style avatar for user KLaudano
      We need to use math and formal logic to prove an algorithm works correctly. A common proof technique is called "induction" (or "proof by loop invariant" when talking about algorithms). Induction works by showing that if a statement is true given an input, it must also be true for the next largest input. (There are actually two different types of induction; this type is called "weak induction".)

      When we need to prove an algorithm is correct, we can show that if it works for some input, then it must also work for a larger input. Then, we show that there is a specific example of input that the algorithm works on.

      For example, suppose we want to show that a function, MERGE-SORT, will correctly sort a list of numbers. We would prove that if MERGE-SORT sorts a list of n numbers, then it can sort a list of n+1 numbers. After that, we show that MERGE-SORT can sort a list of 1 number. Since it can sort a list of 1 number, it must be able to sort a list with 2 numbers. If we know that it can sort a list with 2 numbers, then it can also sort a list with 3 numbers. (We can repeat this process any number of times.) So, MERGE-SORT can sort a list with any amount of numbers.
      (3 votes)
  • male robot hal style avatar for user anonymous
    Can anyone explain me this this is confusing
    (3 votes)
    Default Khan Academy avatar avatar for user
    • primosaur ultimate style avatar for user Gonzalo Sandmeier
      An important aspect of an algorithm is that it produces the expected output for a range of inputs and eventually terminates. However, proving that an algorithm is correct can be difficult. Programmers often use empirical analysis, which involves experimenting with the algorithm by translating it into code and observing the results. This analysis can help find faults in the algorithm, but it cannot prove total correctness.

      For example, let's consider an algorithm that finds the maximum value in a list of numbers. We can translate it into code and test it with different inputs to see if it produces the expected output. However, empirical analysis can only prove that an implemented algorithm is not correct by discovering inputs where the output is unexpected. It cannot prove that the algorithm is correct for all possible inputs.

      To prove the correctness of an algorithm over all possible inputs, we need to use formal reasoning or mathematical techniques. One such technique is proof by induction, where we prove that the algorithm works for a base case and then show that if it works for one case, it also works for the next case. This technique is used by mathematicians to prove properties of numerical sequences.

      However, it's important to note that proof by induction is not required for the AP CSP exam. The exam does not expect students to have a deep understanding of formal reasoning. Empirical analysis is generally sufficient to demonstrate the correctness of an algorithm for the exam.

      In conclusion, while formal reasoning can prove the correctness of an algorithm for all possible inputs, it can be challenging and requires mathematical expertise. Empirical analysis is more commonly used in practice and can provide strong evidence of correctness, especially when combined with a well-designed set of tests.
      (1 vote)
  • leaf blue style avatar for user Abiha
    Btw, is verifying an algorithm important?
    (2 votes)
    Default Khan Academy avatar avatar for user
    • starky ultimate style avatar for user KLaudano
      Yes, it is important to make sure the algorithm works as intended. If you come up with a new algorithm, but there is some case you didn't consider, it could give an incorrect result which could have serious real-world consequences. One famous example would be the Therac-25 incident. Therac-25 was used to treat cancer with radiation, but due to flaws in the software (and poor overall design), it ended up giving massive amounts of radiation that killed multiple people it was meant to help.
      (2 votes)
  • blobby green style avatar for user NAVEED RIAZ
    Whuff! It’s getting serious business now. When v started with a simple example of finding the maxNum, the pseudo code appears to be describing a 'for each' loop and the JS script shows a 'for' loop. Am I right?
    'FOR num IN numbers' compare with 'for (var i = 0; i < numbers.length; i++)'.
    And just curious, wasn’t 'max of' a better option to find the highest number in a list?
    (3 votes)
    Default Khan Academy avatar avatar for user
    • aqualine ultimate style avatar for user Martin
      Hmm the idea isn't bad, but the pseudo code just shows a for loop. Talking about a for each loop only really makes sense in regards to actual code, as it's a variant of the for loop.

      At the beginning it's mostly about getting an idea how to code. Writing code that can be easily performed by inbuilt methods is part of that.
      But as soon as you got the idea, then yes max of methods are the way to go.
      (0 votes)
  • blobby green style avatar for user 2023hildebrandtzacharye
    What are the programming languages?
    (1 vote)
    Default Khan Academy avatar avatar for user
  • blobby green style avatar for user jbmolina413
    I am trying to learn AI and Machine Learning. These set of units are some of the worst experiences of learning I have ever endured. Make it easier Khan Academy for people to understand.
    (1 vote)
    Default Khan Academy avatar avatar for user
  • aqualine sapling style avatar for user فاطمة الزهراء
    I have a question concerning the "the proof by induction":
    * Q#1: In the metaphor given, why is it translated into pseudocode using factorial? since it could be true, but it doesn't mean that the all dominos will fall over one another.
    and thank you alot!!
    (1 vote)
    Default Khan Academy avatar avatar for user
    • aqualine ultimate style avatar for user Martin
      I'm not sure I understand correctly, but the basic idea of induction is to prove a starting point and then show that your claim is also true for the next and so on.

      Say 2^n > n^2 for all n >= 5 with n being natural numbers (5, 6, 7, ...)

      So you show the base case n = 5

      2^5 = 32 > 25 = 5^2

      So that's true, now you abstract. Instead of working with 5 you work with n and show that the inequality holds for (n + 1).

      2^(n + 1) = 2 * 2^n > 2 * n^2 = n^2 + n^2 > n^2 + 2n + 1 = (n + 1)^2

      In the second step we used something we showed at the start (2^n > n^2) and in the fifth step we use n^2 > 2n + 1 for n >= 3 (you can show that with induction as well).

      See we showed that the claim worked for a n >= 5 and we showed that it works for (n + 1).
      So we know it holds for n = 5 and 6, because (n + 1) is true and it's true for 7 because it's true for n = 6 and (n + 1) is true and so on.


      Induction is a very powerful tool, but it takes a bit to understand.
      (1 vote)
  • blobby green style avatar for user barroodapps
    Can anyone use the formal reasoning method to prove the findMaxVal function in the above example is correct in all the aspect so that I can compare it with empirical analysis? Thank you!
    (1 vote)
    Default Khan Academy avatar avatar for user
    • blobby blue style avatar for user joshua
      Base Case: Array size = 1.
      Well since
      var maxNum = numbers[0];

      base case is then proved.

      Assume findMaxVal() is true for k-th sized array.
      Now we need to prove that findMaxVal() is also true for (k+1)-th sized array.

      For the first k-th elements, since we assumed findMaxVal() is true for those elements, maxNum holds the maximum value.

      Compare the maximum value found in the first part with the last element of the array.

      If the maximum value in the first part is greater than the last element, then it is the overall maximum value for the array of size k+1.

      If the maximum value in the first part is less than or equal to the last element, then the last element is the overall maximum value for the array of size k+1.

      Therefore, we proved findMaxVal() is true for any sized array.
      (1 vote)