# Proof of the Cauchy-Schwarz inequality

## Video transcript

Let's say that I have two nonzero vectors. Let's say the first vector is x, the second vector is y. They're both in the set Rn and they're nonzero. It turns out that the absolute value of their-- let me do it in a different color. This color's nice. The absolute value of the dot product of the two vectors-- and remember, this is just a scalar quantity-- is less than or equal to the product of their lengths. And we've defined the dot product and we've defined lengths already.

And just to push it even further, the only time that the two sides are equal, so the absolute value of the dot product is equal to the product of the lengths, is in the situation-- let me write that down-- where one of these vectors is a scalar multiple of the other. Or they're collinear. You know, one's just kind of the longer or shorter version of the other one. So only in the situation where, let's just say, x is equal to some scalar multiple of y. This inequality, together with that condition for equality, is called the Cauchy-Schwarz Inequality.

So let's prove it because you can't take something like this just at face value. You shouldn't just accept that. So let me construct a somewhat artificial function of some scalar variable t. Let me define p of t to be equal to the length of the vector t times y minus x, where t is some scalar, and then that length squared. t times y minus x is going to be a vector, and p of t is the square of its length.

Now before I move forward I want to make one little point here. If I take the length of any vector, let's say some vector v, I want you to accept that this is going to be a positive number, or it's at least greater than or equal to 0. Because this is just going to be the square root of the sum of each of its terms squared: v1 squared plus v2 squared all the way to vn squared.
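In symbols, the claim, the auxiliary function, and the length formula mentioned so far can be sketched as follows. The notation (bold vectors, double bars for length) is an assumption chosen for illustration, not necessarily what is written in the video.

```latex
\begin{gathered}
|\mathbf{x}\cdot\mathbf{y}| \le \lVert\mathbf{x}\rVert\,\lVert\mathbf{y}\rVert,
\qquad \mathbf{x},\mathbf{y}\in\mathbb{R}^n \text{ nonzero}, \\
p(t) = \lVert t\mathbf{y}-\mathbf{x}\rVert^2, \qquad t\in\mathbb{R}, \\
\lVert\mathbf{v}\rVert = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} \;\ge\; 0 .
\end{gathered}
```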
All of these are real numbers. When you square a real number, you get something greater than or equal to 0. When you sum them up, you're going to have something greater than or equal to 0. And when you take the square root of it, the principal square root, the positive square root, you're going to have something greater than or equal to 0. So the length of any real vector is going to be greater than or equal to 0. Now p of t is the square of the length of a real vector, so p of t is going to be greater than or equal to 0.

Now, in a previous video, I think it was two videos ago, I also showed that the length of a vector squared can be rewritten as the dot product of that vector with itself. So let's rewrite this vector that way. The length of this vector squared is equal to the dot product of that vector with itself. So it's ty minus x dot ty minus x.

In the last video, I showed you that you can treat the dot product very similarly to regular multiplication when it comes to the associative, distributive and commutative properties. So when you multiply these, you could kind of view this as multiplying two binomials. You can do it the same way as you would multiply two regular algebraic binomials. You're essentially just using the distributive property. But remember, this isn't just regular multiplication. This is the dot product we're doing. This is vector multiplication, or one version of vector multiplication.

So if we distribute it out, this will become ty dot ty. So let me write that out. That'll be ty dot ty. Then we get the minus x times this ty. Instead of saying times, I should be very careful to say dot. So minus x dot ty. And then you have this ty times this minus x, so minus ty dot x. And then finally, you have the x's dotted with each other. And you can view them as minus 1x dot minus 1x. You could say plus minus 1x.
I could just view this as plus minus 1x dot minus 1x. So let's see. So this is what my whole expression expanded to. I can't really call this a simplification. But we can use the fact that the dot product is commutative and associative to rewrite this expression right here.

The first term is equal to y dot y times t squared. t is just a scalar. Then the two middle terms are equivalent. They're just rearrangements of the same thing, because the dot product is commutative and we can pull the scalar t out front. So together they're just equal to minus 2 times x dot y times t. And I should do that in maybe a different color. So these two terms result in that term right there. And then in the last term you have a minus 1 times a minus 1. They cancel out, so that becomes a plus and you're just left with plus x dot x. I'll do that in an orange color.

And remember, all I did is I rewrote p of t, and we said, look, this has got to be greater than or equal to 0. So I could rewrite that here. This thing is still just the same thing. I've just rewritten it. So this is all going to be greater than or equal to 0.

Now let's make a little bit of a substitution just to clean up our expression. And we'll later back substitute into this. Let's define y dot y as a. Let's define this piece right here as b, so 2 times x dot y. I'll leave the minus sign and the t out of it. And let me define x dot x as c. So then, what does our expression become? It becomes a times t squared minus-- I want to be careful with the colors-- b times t plus c. And of course, we know that it's going to be greater than or equal to 0. It's the same thing as this up here. I could write p of t here. Now this is greater than or equal to 0 for any t that I put in here.
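Written out in symbols, the expansion and the substitution described so far look like this. The labels a, b and c are the ones defined in the transcript; the typeset notation itself is only a sketch of the steps.

```latex
\begin{aligned}
p(t) &= \lVert t\mathbf{y}-\mathbf{x}\rVert^2
      = (t\mathbf{y}-\mathbf{x})\cdot(t\mathbf{y}-\mathbf{x}) \\
     &= (\mathbf{y}\cdot\mathbf{y})\,t^2
        - 2(\mathbf{x}\cdot\mathbf{y})\,t
        + \mathbf{x}\cdot\mathbf{x} \\
     &= a\,t^2 - b\,t + c \;\ge\; 0,
\qquad a=\mathbf{y}\cdot\mathbf{y},\quad
       b=2(\mathbf{x}\cdot\mathbf{y}),\quad
       c=\mathbf{x}\cdot\mathbf{x}.
\end{aligned}
```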
For any real t that I put in there. Let me evaluate our function at b over 2a. And I can definitely do this because what was a? I just have to make sure I'm not dividing by 0 any place. So a was y dotted with itself. And we said y was a nonzero vector. So a is the square of its length. It's a nonzero vector, so at least one of the squared terms that you sum up when you take its length is going to be positive. So a is nonzero, and 2 times a is also going to be nonzero. So we can do this. We don't have to worry about dividing by 0.

But what will p of b over 2a be equal to? I'll just stick to the green. It takes too long to keep switching between colors. This is equal to a times this expression squared. So it's a times b squared over 4a squared. I just squared 2a to get the 4a squared. Minus b times this. So b times b over 2a. This is just regular multiplication. Plus c. And we know all of that is greater than or equal to 0.

Now if we simplify this a little bit, what do we get? Well this a cancels out with one of the a's in the 4a squared, and you end up with b squared over 4a. So we get b squared over 4a minus b squared over 2a. That's that term over there. Plus c is greater than or equal to 0. Let me rewrite this. If I multiply the numerator and denominator of the middle term by 2, what do I get? I get 2b squared over 4a. And the whole reason I did that is to get a common denominator here. So you get b squared over 4a minus 2b squared over 4a. So what do these two terms simplify to? Well the numerator is b squared minus 2b squared. So that just becomes minus b squared over 4a, plus c, is greater than or equal to 0. Now if we add b squared over 4a to both sides of the inequality, we get c is greater than or equal to b squared over 4a. It was a negative on the left-hand side.
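The evaluation at t equal to b over 2a can be written out step by step as below. This is a sketch of the arithmetic described in the transcript, using the same a, b and c, with a assumed positive because y is nonzero.

```latex
\begin{aligned}
p\!\left(\frac{b}{2a}\right)
  &= a\,\frac{b^2}{4a^2} - b\,\frac{b}{2a} + c \\
  &= \frac{b^2}{4a} - \frac{2b^2}{4a} + c \\
  &= -\frac{b^2}{4a} + c \;\ge\; 0
  \quad\Longrightarrow\quad c \;\ge\; \frac{b^2}{4a}.
\end{aligned}
```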
If I add it to both sides it's going to be a positive on the right-hand side. We're approaching something that looks like our inequality, so let's back substitute our original substitutions to see what we have now. So where were the original substitutions that I made? They were right here. And actually, just to simplify more, let me multiply both sides by 4a. I said a, not only is it nonzero, it's going to be positive. It's the square of the length of y. And I already showed you that the length of any real vector is greater than or equal to 0, and y is nonzero, so here it's strictly positive. And the reason why I'm taking great pains to show that a is positive is because when I multiply both sides by it I don't want to change the inequality sign. So let me multiply both sides of this by 4a before I substitute. So we get 4ac is greater than or equal to b squared. There you go.

Now let's back substitute this. So 4 times a. a is y dot y, and y dot y is the same thing as the length of y squared. I showed you that in a previous video. Times c. c is x dot x. Well x dot x is the same thing as the length of vector x squared. So 4 times a times c is going to be greater than or equal to b squared. Now what was b? b was 2 times x dot y. So b squared would be 2 times x dot y, and that whole thing is squared.

So let's see if we can simplify this. Let me switch to a different color. 4 times the length of y squared times the length of x squared is greater than or equal to-- if we squared this quantity right here, we get 4 times x dot y.
4 times x dot y times x dot y. Actually, even better, let me just write it like this. Let me just write 4 times x dot y squared. Now we can divide both sides by 4. That won't change our inequality. So the 4's just cancel out. And now let's take the square root of both sides of this inequality. Both sides are greater than or equal to 0, so taking the positive square root of both sides keeps the inequality pointing the same way. And the square root of the left side is the square root of each of its factors. That's just an exponent property. So if you take the square root of both sides you get the length of y times the length of x is greater than or equal to the square root of x dot y squared. And the positive square root of that is going to be the absolute value of x dot y. I want to be very careful to say this is the absolute value because it's possible that x dot y is a negative value. When you square it and then take the principal square root, you get a positive value back, and taking the absolute value ensures that. Otherwise we might mess with the inequality.

But this is our result. The absolute value of the dot product of our vectors is less than or equal to the product of the two vectors' lengths. So we got our Cauchy-Schwarz inequality.

Now the last thing I said is look, what happens if x is equal to some scalar multiple of y? Let's say x is equal to c times y, where this c is just some scalar, not the c from our substitution. Well in that case, what's the absolute value of x dot y? If we make the substitution, that equals the absolute value of cy dot y. That's just x dot y with cy in place of x. Now we can pull the scalar out, just from the associative property.
It's equal to the absolute value of c times y dot y. We want to keep our absolute value there so everything stays positive. Well y dot y is the length of y squared, which is already greater than or equal to 0, so this is just equal to the absolute value of c times the length of y squared. And that is just equal to the absolute value of our scalar c times the length of y, times the length of y.

Well this first part, I can rewrite. You can prove this to yourself if you don't believe it, but we could put the c inside of the length, and that could be a good exercise for you to prove. It's pretty straightforward. You just use the definition of length and multiply the vector by c. So this is equal to the length of cy times the length of y. I've lost my vector notation someplace over here. There you go. Now, cy is x. So this is equal to the length of x times the length of y.

So I showed you kind of the second part of the Cauchy-Schwarz Inequality, that the two sides are equal to each other when one of the vectors is a scalar multiple of the other. If you're a little uncomfortable with some of these steps I took, it might be a good exercise to actually prove them. For example, to prove that the absolute value of c times the length of the vector y is the same thing as the length of c times y.

Anyway, hopefully you found this pretty useful. We'll use the Cauchy-Schwarz Inequality a lot when we prove other results in linear algebra. And in a future video, I'll give you a little more intuition about why this makes a lot of sense relative to the dot product.
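As a numerical sanity check rather than a proof, the steps of the argument can be tried on random vectors. The sketch below is an illustration only: the helper names `dot`, `length`, `p` and `check_cauchy_schwarz`, the tolerance, and the sample vectors are all assumptions made for this example and do not come from the video.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length real vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def length(v):
    """Length of v: square root of the dot product of v with itself."""
    return math.sqrt(dot(v, v))


def p(t, x, y):
    """The auxiliary function p(t) = ||t*y - x||^2 used in the proof."""
    diff = [t * yi - xi for xi, yi in zip(x, y)]
    return dot(diff, diff)


def check_cauchy_schwarz(x, y, tol=1e-9):
    """Check each step of the proof numerically; y must be nonzero."""
    a = dot(y, y)          # a = y . y, positive because y is nonzero
    b = 2 * dot(x, y)      # b = 2 (x . y)
    c = dot(x, x)          # c = x . x
    t0 = b / (2 * a)       # the point where the proof evaluates p
    lhs = abs(dot(x, y))
    rhs = length(x) * length(y)
    return (
        p(t0, x, y) >= 0.0                           # p is a squared length
        and 4 * a * c >= b * b - tol * max(1.0, b * b)  # 4ac >= b^2
        and lhs <= rhs + tol * max(1.0, rhs)         # |x . y| <= ||x|| ||y||
    )


# Try many random pairs of vectors in R^n for small n.
random.seed(0)
for _ in range(1000):
    n = random.randint(1, 6)
    x = [random.uniform(-10, 10) for _ in range(n)]
    y = [random.uniform(-10, 10) for _ in range(n)]
    assert check_cauchy_schwarz(x, y)

# Equality case: x is a scalar multiple of y (here the scalar is -2.5).
y_eq = [1.0, -2.0, 3.0]
x_eq = [-2.5 * yi for yi in y_eq]
gap = length(x_eq) * length(y_eq) - abs(dot(x_eq, y_eq))
equality_holds = abs(gap) < 1e-9
print(equality_holds)
```

The comparisons use a small relative tolerance because floating-point rounding can make the two sides differ by a tiny amount, especially when the vectors are collinear and the two sides are equal in exact arithmetic.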