Linear combinations and span
Understanding linear combinations and spans of vectors. Created by Sal Khan.
- Around 13:50, when Sal gives a generalized mathematical definition of "span", he defines "i" as having to be greater than one and less than "n". Is this because "i" is indicating the instances of the variable "c" or is there something in the definition I'm missing?(17 votes)
- i is just a variable used to index the subscripts, so yes, it simply counts the instances of c. If you don't know what a subscript is, think about this. If you wanted two different values called x, you couldn't just make x = 10 and x = 5 because you'd get confused over which was which. So you call one of them x1 and one x2, which could equal 10 and 5 respectively.(13 votes)
- Does Sal mean that to represent the whole of R2, two vectors need to be linearly independent, and linearly dependent vectors can't fill in the whole R2 plane?(11 votes)
- Yes. And, in general, if you have n linearly independent vectors in Rn, then the set of their linear combinations is all of Rn. If you have n vectors, but one of them is a linear combination of the others (and those others are linearly independent), then you have only n - 1 linearly independent vectors, and their span is an (n - 1)-dimensional subspace rather than all of Rn. So in the case of two vectors in R2, if they are linearly dependent, that means they are on the same line, and could not possibly fill out the whole plane.
Feel free to ask more questions if this was unclear. Cheers(25 votes)
- What would the span of the zero vector be? Would it be the zero vector as well?
My text also says that there is only one situation where the span would not be infinite. I thought this may be the span of the zero vector, but on doing some problems, I have several which have a span of the empty set. This happens when the matrix row-reduces to the identity matrix. So in which situation would the span not be infinite?(15 votes)
- I understand the concept theoretically, but where can I find numerical questions/examples...(22 votes)
- I'm really confused about why the top equation was multiplied by -2 at 17:20. Surely it's not an arbitrary number, right?(10 votes)
- Sal was setting up the elimination step. The next thing he does is add the two equations and the C_1 variable is eliminated allowing us to solve for C_2. Multiplying by -2 was the easiest way to get the C_1 term to cancel.
Another question is why he chooses to use elimination. The first equation is already solved for C_1 so it would be very easy to use substitution. He may have chosen elimination because that is how we work with matrices.(10 votes)
- At 12:39, when he is describing the i and j vectors, he writes them as [1, 0] and [0, 1] respectively, yet on drawing them he draws them to a scale of [2, 0] and [0, 2]. Is this an honest mistake or is it just a property of unit vectors having no fixed dimension?(9 votes)
- No, that looks like a mistake. He must have been thinking that each square was one unit, rather than two units as marked on the scale.(13 votes)
- Shouldn't it be 1/3 (x2 - 2 (!!) x1) 18 min in? Pretty sure.(8 votes)
- I think I agree with you if you mean you get -2 in the denominator of the answer.(3 votes)
- At 17:38, Sal "adds" the equations for x1 and x2 together. I don't understand how this is even a valid thing to do. The first equation finds the value for x1, and the second equation finds the value for x2. I get that you can multiply both sides of an equation by the same value to create an equivalent equation and that you might do so for purposes of elimination, but how can you just "add" the two distinct equations for x1 and x2 together? What does that even mean?(7 votes)
- You know that both sides of an equation have the same value. Let's call that value A.
You can add A to both sides of another equation.
But A has been expressed in two different ways; the left side and the right side of the first equation. Let's call those two expressions A1 and A2.
Remember that A1=A2=A. Since you can add A to both sides of another equation, you can also add A1 to one side and A2 to the other side - because A1=A2.
Another way to explain it - consider two equations:
L1 = R1
L2 = R2
Add L1 to both sides of the second equation:
L2 + L1 = R2 + L1
Since L1=R1, we can substitute R1 for L1 on the right hand side:
L2 + L1 = R2 + R1
And that's pretty much it.
If that's too hard to follow, just take it on faith that it works and move on.(8 votes)
- In the video at 0:32, Sal says we are in R^n, but then the correction says we are in R^m. Why does it have to be R^m? Is it because the number of vectors doesn't have to be the same as the size of the space?(6 votes)
- Correct. The number of vectors doesn't have to be the same as the dimension you're working within.(7 votes)
- Since we've learned in earlier lessons that vectors can have any origin, this seems to imply that all combinations of vector A and/or vector B would represent R^2 in a 2D real coordinate space just by moving the origin around. I'm going to assume the origin must remain static for this reason. So this brings me to my question: how does one refer to the line in reference when it's just a line that can't be represented by coordinate points? Sal just draws an arrow to it, and I have no idea how to refer to it mathematically speaking.(4 votes)
- It's true that you can decide to start a vector at any point in space. But the "standard position" of a vector implies that its starting point is the origin. If nothing is telling you otherwise, it's safe to assume that a vector is in its standard position; and for the purposes of spaces and span, all vectors are considered to be in standard position.
Now, to represent a line as a set of vectors, you have to include in the set all the vectors that (in standard position) end at a point on the line.(8 votes)
One term you are going to hear a lot of in these videos, and in linear algebra in general, is the idea of a linear combination. And all a linear combination of vectors are, they're just a linear combination. Let me show you what that means. So let's say I have a couple of vectors, v1, v2, and it goes all the way to vn. And they're all in, you know, it can be in R2 or Rn. Let's say that they're all in Rn. They're in some dimension of real space, I guess you could call it, but the idea is fairly simple. A linear combination of these vectors means you just add up the vectors. It's some combination of a sum of the vectors, so v1 plus v2 plus all the way to vn, but you scale them by arbitrary constants. So you scale them by c1, c2, all the way to cn, where everything from c1 to cn are all a member of the real numbers. That's all a linear combination is. Let me show you a concrete example of linear combinations. Let me make the vector. Let me define the vector a to be equal to-- and these are all bolded. These purple, these are all bolded, just because those are vectors, but sometimes it's kind of onerous to keep bolding things. So let's just say I define the vector a to be equal to 1, 2. And I define the vector b to be equal to 0, 3. What is the linear combination of a and b? Well, it could be any constant times a plus any constant times b. So it could be 0 times a plus-- well, it could be 0 times a plus 0 times b, which, of course, would be what? That would be 0 times 0, that would be 0, 0. That would be the 0 vector, but this is a completely valid linear combination. And we can denote the 0 vector by just a big bold 0 like that. I could do 3 times a. I'm just picking these numbers at random. 3 times a plus-- let me do a negative number just for fun. So I'm going to do plus minus 2 times b. What is that equal to? Let's figure it out. Let me write it out. It's 3 minus 2 times 0, so minus 0, and it's 3 times 2 is 6. 6 minus 2 times 3, so minus 6, so it's the vector 3, 0. 
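The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `linear_combination` is made up for illustration, and vectors are represented as plain tuples.

```python
def linear_combination(weights, vectors):
    """Return weights[0]*vectors[0] + weights[1]*vectors[1] + ...,
    computed one component at a time."""
    dimension = len(vectors[0])
    return tuple(
        sum(c * v[k] for c, v in zip(weights, vectors))
        for k in range(dimension)
    )

a = (1, 2)
b = (0, 3)

# 0*a + 0*b is the zero vector
print(linear_combination([0, 0], [a, b]))   # (0, 0)

# 3*a + (-2)*b, the example worked out above
print(linear_combination([3, -2], [a, b]))  # (3, 0)
```

Any other pair of real weights can be passed in the same way; each choice gives another linear combination of a and b.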
This is a linear combination of a and b. I can keep putting in a bunch of random real numbers here and here, and I'll just get a bunch of different linear combinations of my vectors a and b. If I had a third vector here, if I had vector c, and maybe that was just, you know, 7, 2, then I could add that to the mix and I could throw in plus 8 times vector c. These are all just linear combinations. Now why do we just call them combinations? Why do you have to add that little linear prefix there? Because we're just scaling them up. We're not multiplying the vectors times each other. We haven't even defined what it means to multiply a vector, and there's actually several ways to do it. But, you know, we can't square a vector, and we haven't even defined what this means yet, but this would all of a sudden make it nonlinear in some form. So all we're doing is we're adding the vectors, and we're just scaling them up by some scaling factor, so that's why it's called a linear combination. Now you might say, hey Sal, why are you even introducing this idea of a linear combination? Because I want to introduce the idea, and this is an idea that confounds most students when it's first taught. I think it's just the very nature that it's taught. Over here, I just kept putting different numbers for the weights, I guess we could call them, for c1 and c2 in this combination of a and b, right? Let's ignore c for a little bit. I just put in a bunch of different numbers there. But it begs the question: what is the set of all of the vectors I could have created? And this is just one member of that set. But what is the set of all of the vectors I could've created by taking linear combinations of a and b? So let me draw a and b here. Maybe we can think about it visually, and then maybe we can think about it mathematically. So let's say a and b. So a is 1, 2. So 1, 2 looks like that. That's vector a. Let me do vector b in a different color. We're going to do it in yellow. Vector b is 0, 3. 
So vector b looks like that: 0, 3. So what's the set of all of the vectors that I can represent by adding and subtracting these vectors? And we said, if we multiply them both by zero and add them to each other, we end up there. If we take 3 times a, that's the equivalent of scaling up a by 3. So you go 1a, 2a, 3a. So that's 3a, 3 times a will look like that. So this vector is 3a, and then we added to that 2b, right? Oh no, we subtracted 2b from that, so minus b looks like this. Minus 2b looks like this. This is minus 2b, all the way, in standard form, standard position, minus 2b. So if you add 3a to minus 2b, we get to this vector. 3a to minus 2b, you get this vector right here, and that's exactly what we did when we solved it mathematically. You get the vector 3, 0. You get this vector right here, 3, 0. But this is just one combination, one linear combination of a and b. Instead of multiplying a times 3, I could have multiplied a times 1 and 1/2 and just gotten right here. So 1 and 1/2 a minus 2b would still look the same. It would look like something like this. It would look something like-- let me make sure I'm doing this-- it would look something like this. And so our new vector that we would find would be something like this. So I just showed you, I can find this vector with a linear combination. I can find this vector with a linear combination. And actually, it turns out that you can represent any vector in R2 with some linear combination of these vectors right here, a and b. Now, let's just think of an example, or maybe just try a mental visual example. Wherever we want to go, we could go arbitrarily-- we could scale a up by some arbitrary value. So this is some weight on a, and then we can add up arbitrary multiples of b. B goes straight up and down, so we can add up arbitrary multiples of b to that. So we could get any point on this line right there. Now, if we scaled a up a little bit more, and then added any multiple b, we'd get anything on that line. 
If we multiplied a times a negative number and then added a b in either direction, we'll get anything on that line. We can keep doing that. And there's no reason why we can't pick an arbitrary a that can fill in any of these gaps. If we want a point here, we just take a little smaller a, and then we can add all the b's that fill up all of that line. So we can fill up any point in R2 with the combinations of a and b. So what we can write here is that the span-- let me write this word down. The span of the vectors a and b-- so let me write that down-- it equals R2 or it equals all the vectors in R2, which is, you know, it's all the tuples. R2 is all the tuples made of two ordered tuples of two real numbers. So it equals all of R2. This just means that I can represent any vector in R2 with some linear combination of a and b. And you're like, hey, can't I do that with any two vectors? Well, what if a and b were the vector-- let's say the vector 2, 2 was a, so a is equal to 2, 2, and let's say that b is the vector minus 2, minus 2, so b is that vector. So b is the vector minus 2, minus 2. Now, can I represent any vector with these? Well, I can scale a up and down, so I can scale a up and down to get anywhere on this line, and then I can add b anywhere to it, and b is essentially going in the same direction. It's just in the opposite direction, but I can multiply it by a negative and go anywhere on the line. So any combination of a and b will just end up on this line right here, if I draw it in standard form. It'll be a vector with the same slope as either a or b, or same inclination, whatever you want to call it. I could never-- there's no combination of a and b that I could represent this vector, that I could represent vector c. I just can't do it. I can add in standard form. I could just keep adding scale up a, scale up b, put them heads to tails, I'll just get the stuff on this line. I'll never get to this. So in this case, the span-- and I want to be clear. 
This is for this particular a and b, not for the a and b-- for this blue a and this yellow b, the span here is just this line. It's just this line. It's not all of R2. So this isn't just some kind of statement when I first did it with that example. It's like, OK, can any two vectors represent anything in R2? Well, no. I just showed you two vectors that can't represent that. What is the span of the 0 vector? I'll put a cap over it, the 0 vector, make it really bold. Well, the 0 vector is just 0, 0, so I don't care what multiple I put on it. The span of it is all of the linear combinations of this, so essentially, I could put arbitrary real numbers here, but I'm just going to end up with a 0, 0 vector. So the span of the 0 vector is just the 0 vector. The only vector I can get with a linear combination of this, the 0 vector by itself, is just the 0 vector itself. Likewise, if I take the span of just, you know, let's say I go back to this example right here. My a vector was right like that. Let me draw it in a better color. My a vector looked like that. If I were to ask just what the span of a is, it's all the vectors you can get by creating a linear combination of just a. So it's really just scaling. You can't even talk about combinations, really. So it's just c times a, all of those vectors. And we saw in the video where I parametrized or showed a parametric representation of a line, that this, the span of just this vector a, is the line that's formed when you just scale a up and down. So span of a is just a line. You have to have two vectors, and they can't be collinear, in order to span all of R2. And I haven't proven that to you yet, but we saw with this example, if you pick this a and this b, you can represent all of R2 with just these two vectors. Now, the two vectors that you're most familiar with that span R2 are, if you take a little physics class, you have your i and j unit vectors. 
And in our notation, i, the unit vector i that you learned in physics class, would be the vector 1, 0. So this is i, that's the vector i, and then the vector j is the unit vector 0, 1. This is what you learned in physics class. Let me do it in a different color. This is j. j is that. And you learned that they're orthogonal, and we're going to talk a lot more about what orthogonality means, but in our traditional sense that we learned in high school, it means that they're 90 degrees. But you can clearly represent any angle, or any vector, in R2, by these two vectors. And the fact that they're orthogonal makes them extra nice, and that's why these form-- and I'm going to throw out a word here that I haven't defined yet. These form the basis. These form a basis for R2. In fact, you can represent anything in R2 by these two vectors. I'm not going to even define what basis is. That's going to be a future video. But let me just write the formal math-y definition of span, just so you're satisfied. So if I were to write the span of a set of vectors, v1, v2, all the way to vn, that just means the set of all of the vectors, where I have c1 times v1 plus c2 times v2 all the way to cn-- let me scroll over-- all the way to cn vn. So this is a set of vectors because I can pick my ci's to be any member of the real numbers, and that's true for i-- so I should write for i to be anywhere between 1 and n. All I'm saying is that look, I can multiply each of these vectors by any value, any arbitrary value, real value, and then I can add them up. And now the set of all of the combinations, scaled-up combinations I can get, that's the span of these vectors. You can kind of view it as the space of all of the vectors that can be represented by a combination of these vectors right there. And so the word span, I think it does have an intuitive sense. I mean, if I say that, you know, in my first example, I showed you those two vectors span, or a and b spans R2. I wrote it right here. 
That tells me that any vector in R2 can be represented by a linear combination of a and b. And actually, just in case that visual kind of pseudo-proof doesn't do you justice, let me prove it to you algebraically. I'm telling you that I can take-- let's say I want to represent, you know, I have some-- let me rewrite my a's and b's again. So this was my vector a. It was 1, 2, and b was 0, 3. Let me remember that. So my vector a is 1, 2, and my vector b was 0, 3. Now my claim was that I can represent any point. Let's say I want to represent some arbitrary point x in R2, so its coordinates are x1 and x2. I need to be able to prove to you that I can get to any x1 and any x2 with some combination of these guys. So let's say that my combination, I say c1 times a plus c2 times b has to be equal to my vector x. Let me show you that I can always find a c1 or c2 given that you give me some x's. So let's just write this right here with the actual vectors being represented in their kind of column form. So we have c1 times this vector plus c2 times the b vector 0, 3 should be able to be equal to my x vector, should be able to be equal to my x1 and x2, where these are just arbitrary. So let's see if I can set that to be true. So if this is true, then the following must be true. c1 times 1 plus 0 times c2 must be equal to x1. We just get that from our definition of multiplying vectors times scalars and adding vectors. And then we also know that 2 times c2-- sorry. c1 times 2 plus c2 times 3, 3c2, should be equal to x2. Now, if I can show you that I can always find c1's and c2's given any x1's and x2's, then I've proven that I can get to any point in R2 using just these two vectors. So let me see if I can do that. So this is just a system of two unknowns. This is just 0. We can ignore it. So let's multiply this equation up here by minus 2 and put it here. So we get minus 2, c1-- I'm just multiplying this times minus 2. We get a 0 here, plus 0 is equal to minus 2x1. 
And then you add these two. You get 3c2, right? These cancel out. You get 3-- let me write it in a different color. You get 3c2 is equal to x2 minus 2x1. Or divide both sides by 3, you get c2 is equal to 1/3 x2 minus x1. Now we'd have to go substitute back in for c1. But we have this first equation right here, that c1, this first equation that says c1 plus 0 is equal to x1, so c1 is equal to x1. So that one just gets us there. So c1 is equal to x1. So you give me any point in R2-- these are just two real numbers-- and I can just perform this operation, and I'll tell you what weights to apply to a and b to get to that point. If you say, OK, what combination of a and b can get me to the point-- let's say I want to get to the point-- let me go back up here. Oh, it's way up there. Let's say I'm looking to get to the point 2, 2. So x1 is 2. Let me write it down here. Say I'm trying to get to the point the vector 2, 2. What combinations of a and b can be there? Well, I know that c1 is equal to x1, so that's equal to 2, and c2 is equal to 1/3 times 2 minus 2. So 2 minus 2 is 0, so c2 is equal to 0. So if I want to just get to the point 2, 2, I just multiply-- oh, I just realized. This was looking suspicious. I made a slight error here, and this was good that I actually tried it out with real numbers. Over here, when I had 3c2 is equal to x2 minus 2x1, I got rid of this 2 over here. There's a 2 over here. I divide both sides by 3. I get 1/3 times x2 minus 2x1. And that's why I was like, wait, this is looking strange. So I had to take a moment of pause. So let's go to my corrected definition of c2. C2 is equal to 1/3 times x2. So 2 minus 2 times x1, so minus 2 times 2. So it's equal to 1/3 times 2 minus 4, which is equal to minus 2, so it's equal to minus 2/3. So if I multiply 2 times my vector a minus 2/3 times my vector b, I will get to the vector 2, 2. And you can verify it for yourself. 2 times my vector a 1, 2, minus 2/3 times my vector b 0, 3, should equal 2, 2.
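The corrected result, c1 = x1 and c2 = 1/3 times (x2 minus 2x1), can be tried out on any target point with a short Python sketch. The function names `weights_for` and `combine` are made up for illustration, and the sketch assumes integer coordinates so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

a = (1, 2)
b = (0, 3)

def weights_for(x1, x2):
    """Return (c1, c2) such that c1*a + c2*b equals (x1, x2),
    for the specific vectors a = (1, 2) and b = (0, 3)."""
    c1 = Fraction(x1)               # from c1*1 + c2*0 = x1
    c2 = Fraction(x2 - 2 * x1, 3)   # from 2*c1 + 3*c2 = x2
    return c1, c2

def combine(c1, c2):
    """Compute c1*a + c2*b component by component."""
    return (c1 * a[0] + c2 * b[0], c1 * a[1] + c2 * b[1])

c1, c2 = weights_for(2, 2)
print(c1, c2)                        # 2 -2/3
print(combine(c1, c2) == (2, 2))     # True
```

Because the formulas give a valid c1 and c2 for every x1 and x2, every point of R2 is reachable, which is what it means for a and b to span R2.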