Main content

## AP®︎/College Computer Science Principles

### Course: AP®︎/College Computer Science Principles > Unit 4

Lesson 2: Evaluating algorithms# Verifying an algorithm

AP.CSP:

CRD‑2 (EU)

, CRD‑2.I (LO)

, CRD‑2.I.1 (EK)

, CRD‑2.I.5 (EK)

, CRD‑2.J (LO)

, CRD‑2.J.1 (EK)

, CRD‑2.J.2 (EK)

, CRD‑2.J.3 (EK)

An important aspect of any algorithm is that it is

**correct**: it always produces the expected output for the range of inputs and it eventually terminates.As it turns out, it's difficult to prove that an algorithm is correct. Programmers often use

**empirical analysis**to find faults in an algorithm, but only**formal reasoning**can prove total correctness.### Empirical analysis

An "empirical" analysis is one based on actual experimentation and observation of the results. In the world of algorithms, that means the algorithm must actually be translated into a programming language and executed on a computer.

Let's conduct an empirical analysis of an algorithm that finds the maximum value in a list of numbers.

Here's pseudocode that expresses that algorithm:

```
maxNum ← -1
FOR num IN numbers {
if (num > maxNum) {
maxNum ← num
}
}
```

Next, we'll translate that into the JavaScript language, since we can execute that in our interactive editor here on Khan Academy.

```
var maxNum = -1;
for (var i = 0; i < numbers.length; i++) {
if (numbers[i] > maxNum) {
maxNum = numbers[i];
}
}
```

Then we need to feed input into the algorithm and observe how it performs. For our first experiment, let's give it an array of 4 numbers,

`[13, 4, 24, 7]`

and see if it outputs the max, 24.Hooray, it worked! Can we declare this to be a perfectly correct algorithm and move on with life? I'm afraid it's not quite that easy...

It's time for experiment #2. This time, let's make all the numbers in the array negative,

`[-13, -4, -24, -7]`

. This time, the maximum value should be -4, the smallest negative number in the list.Uh-oh, the code outputted -1 instead of -4. That's because the initial value for

`maxNum`

is -1, and the loop never finds a value in the list greater than that. Our algorithm definitely does *not*work correctly for negative numbers.At this point, we need to modify our algorithm and conduct empirical analysis on the (hopefully) improved algorithm. Let's try out a version of the algorithm that initializes

`maxNum`

to the first number in the list:That works! Or at least, it works on a list of negative numbers. Does it still work on the list of positive numbers? What about a mixed list of positive and negative numbers? What about fractions? Or irrational numbers? There are so many possibilities to test!

We can do the testing more easily by wrapping our algorithm in a procedure, and using a testing library to assert that the output from the procedures matches what we expect. On Khan Academy,

`Program.assertEqual(actual, expected)`

is a simple testing procedure that displays an error whenever the actual output is not equal to the expected. Here's an empirical analysis on four different lists:

No errors there! The new algorithm is looking more correct than the old one. But is it truly correct? We don't actually know that for sure. In fact, we could do many more experiments and

*still*not know.Empirical analysis can only be used to prove that an implemented algorithm is

*not*correct, by discovering inputs where the output is unexpected. However, it cannot prove that an algorithm*is*correct.### Formal reasoning

The only way to prove the correctness of an algorithm over all possible inputs is by reasoning formally or mathematically about it.

One form of reasoning is a "proof by induction", a technique that's also used by mathematicians to prove properties of numerical sequences.

📝

**Exam tip**: The AP CSP exam does*not*require an understanding of how to prove algorithms by induction. We walk through it here to give you a feel for what formal reasoning might look like, but the AP does not expect students to understand this advanced level of mathematic reasoning.A metaphor can help with understanding induction. Imagine we have a line of a million dominos that are perfectly spaced out. How do we know that every domino will fall when we tip the first one over? We don't actually have to check each individual domino. We only have to prove that 1) the first domino will fall, and 2) tipping over any given domino will tip the next domino over. With just those two things proved, just like that, a million dominoes will fall over!

Now let's apply induction to an algorithm. Here's the pseudocode for an algorithm that computes the factorial of a positive integer:

```
PROCEDURE calcFactorial(n) {
factorial ← 1
i ← 1
REPEAT UNTIL (i > n) {
factorial ← factorial * i
i ← i + 1
}
RETURN factorial
}
```

The factorial of a number is the product of that number with all the numbers less than it, down to 1. For example, the factorial of 4, often written as 4, !, is 4, times, 3, times, 2, times, 1, equals, 24.

Before we go down the route of proving this algorithm successfully computes n, !, let's actually try it out for the n of 4. If the algorithm works, it should return 24.

- The variables
`factorial`

and`i`

both start off at`1`

. - Since
`i`

(`1`

) is not greater than`n`

(`4`

), we enter the loop. - Iteration #1:
`factorial`

is set to`1`

(from`1 * 1`

) and`i`

increases to`2`

. - Iteration #2:
`factorial`

is set to`2`

(from`1 * 2`

) and`i`

increases to`3`

. - Iteration #3:
`factorial`

is set to`6`

(from`2 * 3`

) and`i`

increases to`4`

. - Iteration #4:
`factorial`

is set to`24`

(from`6 * 4`

) and`i`

increases to`5`

. - At this point,
`i`

(`5`

)*is*greater than`n`

(`4`

), so we exit the loop. - The procedure returns the value of
`24`

.

Great, we verified that the algorithm computes the correct result for a single integer.

Now let's prove that for

*all positive integers*, the algorithm computes the factorial of the integer.First we need to prove that the algorithm eventually terminates, as an algorithm can't be considered correct if it goes on forever. In this algorithm,

`i`

starts at `1`

and increases by `1`

until it becomes `n + 1`

. Thus, the algorithm always stops after `n`

loop repetitions.Next, to prove that this algorithm outputs the factorial, we will more specifically prove a "loop invariant", a property of the loop that should always be true. In this algorithm, after going through the loop

`n`

times, `factorial`

should equal `n!`

and `i`

should equal `n + 1`

. That was true in our walk-through of `factorial(4)`

, and now we'll attempt to prove that it is generally true for any positive integer.That requires proving 1) the base case, and 2) the induction hypothesis.

**Base case:**This is where we verify that the algorithm holds for the very first number in the range of possible inputs.

For this algorithm, we are proving it for all positive integers, so the base case is when

`n`

is `1`

. According to our loop invariant, after going through the loop `1`

times, `factorial`

should equal `1!`

(`1`

) and `i`

should equal `1 + 1`

(`2`

).We can walk-through our algorithm for

`calcFactorial(1)`

, similar to how we did for the number `4`

:- The variables
`factorial`

and`i`

both start off at`1`

. - Since
`i`

(`1`

) is not greater than`n`

(`1`

), the algorithm enters the loop. - Iteration #1:
`factorial`

is set to`1`

(from`1 * 1`

) and`i`

increases to`2`

. - At this point,
`i`

(`2`

)*is*greater than`n`

(`1`

), so the algorithm exits the loop.

Our loop invariant holds:

`factorial`

stores `1`

and `i`

stores `2`

.With the base case proved, let's move on!

**Induction step:**This is where we show that if it works for any arbitrary number, it also works for the number right after it.

We start with the inductive hypothesis: an assumption that the loop invariant is true for some positive integer

`k`

. After going through the loop `k`

times, `factorial`

should equal `k!`

and `i`

should equal `k + 1`

.Starting from that assumption, we will prove that the loop invariant is also true for

`k + 1`

, the number just after `k`

. After going through the loop `k + 1`

times, `factorial`

should equal `(k + 1)!`

and `i`

should equal `(k + 1) + 1`

.To do that, let's walk-through

`calcFactorial(k + 1)`

. We can fast-forward through the first `k`

repetitions, thanks to our inductive hypothesis.- After
`k`

repetitions,`factorial`

stores`k!`

and`i`

stores`k + 1`

. - Iteration #
`k+1`

:`factorial`

is set to`k! * (k + 1)`

and`i`

increases to`k + 2`

. - At this point,
`i`

(whose value is`k + 2`

)*is*greater than`n`

(whose values is`k + 1`

), so the algorithm exits the loop.

Did the loop invariant hold true? Yes, it did! The variable

`factorial`

stores `k! * (k + 1)`

, which is equivalent to `(k + 1)!`

and the variable `i`

stores `k + 2`

, equivalent to `(k + 1) + 1`

. We can confidently state that the loop invariant is true for all positive integers

`k`

. Since we showed earlier that the loop stops after

`n`

repetitions, then `calcFactorial(n)`

always returns `n!`

. Our algorithm is correct, since it both terminates and produces the correct answer when it terminates.Proof by induction is a technique that works well for algorithms that loop over integers, and can prove that an algorithm always produces correct output. Other styles of proofs can verify correctness for other types of algorithms, like proof by contradiction or proof by exhaustion.

There are definitely drawbacks to this level of formal reasoning: first, most computer programmers lack the mathematical background to verify with proofs, and secondly, the proof is made outside of the code, so the implementation of the algorithm could diverge from the proved version of the algorithm.

The most popular formal technique for writing correct code is to use a programming languages built specifically with provability as a goal. Cloud computing companies like Amazon and Microsoft use verifiable languages for their critical infrastructure, since they can't afford to be brought down by a bug in their algorithms.

Realistically, the majority of software is verified with empirical analysis. That's partially due to the fact that most programmers lack the theoretical background to prove the correctness of algorithms. But it's also due to the ease of empirical analysis and the fact that a well-thought out suite of tests can prove that an algorithm is almost certainly correct-- and that's often good enough.

🙋🏽🙋🏻♀️🙋🏿♂️Do you have any questions about this topic? We'd love to answer—just ask in the questions area below!

## Want to join the conversation?

- What programming languages are specificaly built for provability?(6 votes)
- Functional programming languages such as Haskell. One reason they better lend themselves to proofs is due to the immutability (variables are constant) within the language.(1 vote)

- Can someone please simplify the explanation on how formal analysis works because it's a bit confusing.(3 votes)
- We need to use math and formal logic to prove an algorithm works correctly. A common proof technique is called "induction" (or "proof by loop invariant" when talking about algorithms). Induction works by showing that if a statement is true given an input, it must also be true for the next largest input. (There are actually two different types of induction; this type is called "weak induction".)

When we need to prove an algorithm is correct, we can show that if it works for some input, then it must also work for a larger input. Then, we show that there is a specific example of input that the algorithm works on.

For example, suppose we want to show that a function, MERGE-SORT, will correctly sort a list of numbers. We would prove that if MERGE-SORT sorts a list of n numbers, then it can sort a list of n+1 numbers. After that, we show that MERGE-SORT can sort a list of 1 number. Since it can sort a list of 1 number, it must be able to sort a list with 2 numbers. If we know that it can sort a list with 2 numbers, then it can also sort a list with 3 numbers. (We can repeat this process any number of times.) So, MERGE-SORT can sort a list with any amount of numbers.(3 votes)

- Whuff! It’s getting serious business now. When v started with a simple example of finding the maxNum, the pseudo code appears to be describing a 'for each' loop and the JS script shows a 'for' loop. Am I right?

'FOR num IN numbers' compare with 'for (var i = 0; i < numbers.length; i++)'.

And just curious, wasn’t 'max of' a better option to find the highest number in a list?(3 votes)- Hmm the idea isn't bad, but the pseudo code just shows a for loop. Talking about a for each loop only really makes sense in regards to actual code, as it's a variant of the for loop.

At the beginning it's mostly about getting an idea how to code. Writing code that can be easily performed by inbuilt methods is part of that.

But as soon as you got the idea, then yes max of methods are the way to go.(0 votes)

- What are the programming languages?(1 vote)
- I have a question concerning the "the proof by induction":

* Q#1: In the metaphor given, why is it translated into pseudocode using factorial? since it could be true, but it doesn't mean that the all dominos will fall over one another.

and thank you alot!!(1 vote)- I'm not sure I understand correctly, but the basic idea of induction is to prove a starting point and then show that your claim is also true for the next and so on.

Say 2^n > n^2 for all n >= 5 with n being natural numbers (5, 6, 7, ...)

So you show the base case n = 5

2^5 = 32 > 25 = 5^2

So that's true, now you abstract. Instead of working with 5 you work with n and show that the inequality holds for (n + 1).

2^(n + 1) = 2 * 2^n > 2 * n^2 = n^2 + n^2 > n^2 + 2n + 1 = (n + 1)^2

In the second step we used something we showed at the start (2^n > n^2) and in the fifth step we use n^2 > 2n + 1 for n >= 3 (you can show that with induction as well).

See we showed that the claim worked for a n >= 5 and we showed that it works for (n + 1).

So we know it holds for n = 5 and 6, because (n + 1) is true and it's true for 7 because it's true for n = 6 and (n + 1) is true and so on.

Induction is a very powerful tool, but it takes a bit to understand.(1 vote)

- Can someone explain how we came to this conclusion?
`At this point, i (whose value is k + 2) is greater than n (whose values is k + 1), so the algorithm exits the loop.`

How is:`n = k+1`

?

Not sure what I missed.(1 vote) - My second question is i++ means i = i + 1. If it’s an array that v r referring to then does it mean the next number in the array?(1 vote)
- Say you have an array

{a, b, c, d, e} and i = 2

then

array[i] gets you c and i = 2

array[i + 1] gets you d and i = 2

array[i++] gets you d and i = 3

The third option is considered bad style and you shouldn't do that.

Hope that helps.(0 votes)

- What's the function of proving factorial(4), if we will test the base case later? The base case can prove that it will work empircally. Just like the Domino metaphor, there are two steps, not three in the factorial example.(0 votes)
- With induction, it helps to convince yourself of what you're trying to prove. Also examples can help to solidify understanding.(0 votes)

- Is this also in other computers like DELL or intel or acer?(0 votes)
- I'm not sure what you mean, but most consumer computers work according to the same underlying principles. So most of what you learn here will in some way apply to your computer.(0 votes)

- Could I please have an example of when an algorithm is found to be not true using proof by induction? Additionally, when the article mentions that some companies use "verifiable languages," (this is mentioned near the end of the article) does that mean that a software verifies the code itself?(0 votes)
- If either the base case or recursive step fails, the algorithm is generally incorrect. Here is an example of a case in which the base case proof step fails.

https://en.wikipedia.org/wiki/All_horses_are_the_same_color

And yes, just like humans perform manual reasoning to verify an algorithm, software can perform automated reasoning (to some extend) on algorithms via techniques like proof by induction. The field in computer science is called "formal reasoning"(0 votes)