# Line of best fit: smoking in 1945

The scatter plot shows how many adults in America smoked from year to year. We can guess how many smoked in 1945 by drawing a line that slopes down through the points. Then we see how high the line would be 20 years earlier. Created by Sal Khan.

## Want to join the conversation?

• I don't understand this at all... can someone please explain this to me?
(19 votes)
• We have a graph with various data points, and it looks like there is a linear relationship between the data points (because if you squint you can kinda see where a line could go, right in the middle of all the points).

Once you sketch this line, you know (even though you can't see it) that the line goes on forever in both directions. We know that 1965 on the graph is where x=0, and about 41 or 42% of Americans smoked... but we want to know how many Americans smoked in 1945.

Even though the graph doesn't show 1945, we can draw the line backwards (to the left of the y-axis) and estimate the y-value from the graph. In the video it looks like the y-value is about 51 or 52%.

Hope this helps a little!
(31 votes)
• Is it possible to calculate a perfect line through the points?
(22 votes)
• Only through some points. You can have a perfectly straight line when given only two points, but if there are more than two, most often a perfect line doesn't exist.
(18 votes)
• Is there a way to make the equations easier to understand and do? I am good at drawing the line of best fit, but not the rate of change...
(10 votes)
• Well, the rate of change is the slope of the line, which you need when drawing a line of best fit. You're just drawing a line that best fits the data.
(7 votes)
• Is this a factual chart?
(8 votes)
• I was wondering this too, so I looked it up and it's true that 45% of Americans smoked in 1965. What's interesting is that by 2015, the percentage had dropped and only 15% of Americans smoked.
(6 votes)
• If we continue the trend backwards like that, is it possible to show that in some year ~100% of the population smoked?
(7 votes)
• Assuming the trend stays exactly the same, then yes. You can continue the line for as far back and forward as it can go (from 0% to 100%).
(7 votes)
• What if that estimate were a fraction? And what would that fraction be?
(8 votes)
• Confusing because it started in 1945
(8 votes)
• Is there any standard for how to get the "best" line? How do you know that one line is right and another is not?
(6 votes)
• The best line has the most dots going in the same direction. If the line is wrong there would be outliers, and you might not be able to use a line at all. I hope this helps!
(3 votes)
• How do you approximately calculate the points in the first place?
(6 votes)
• How are you supposed to determine where the line goes exactly? I've been doing some of the practice problems and have gotten every single one wrong because my line wasn't placed exactly where it showed in the hint section, and as a result I came up with a different answer.
(4 votes)
• To figure out EXACTLY where the line goes, you'd have to check out some of Khan Academy's least-squares regression line (aka linear regression or LSRL) videos! The least-squares regression line is much more precise than an eyeballed line of best fit, but it is also MUCH MORE ADVANCED! It's in the AP Statistics curriculum!
(3 votes)
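
A rough sketch of the least-squares idea mentioned in the last reply, for anyone who wants to see it worked out. The three points below are not the chart's real data; they are approximate readings taken from the video (about 42% at 0 years, about 31% at 20 years, about 21% at 40 years), and the function name `least_squares_line` is made up for this illustration.

```python
# Approximate readings eyeballed from the video, NOT the chart's actual data:
# (years since 1965, percent of adults who smoke).
points = [(0, 42.0), (20, 31.0), (40, 21.0)]


def least_squares_line(pts):
    """Return (slope, intercept) of the least-squares line through pts."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    # Sum of squared x-deviations and sum of x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(points)

# 1945 is x = -20 on the "years since 1965" axis.
estimate_1945 = intercept + slope * (-20)

# Year in which the extended straight line would reach 100%.
year_at_100 = 1965 + (100 - intercept) / slope

print(f"slope: {slope:.3f} percentage points per year")
print(f"1945 estimate: {estimate_1945:.1f}%")
print(f"line reaches 100% around {year_at_100:.0f}")
```

With these three readings the 1945 estimate comes out a little above 52%, close to the eyeballed answer. The 100% crossing is only a consequence of the straight-line model; nothing in the data suggests the trend really held that far back.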

## Video transcript

The graph below shows the percentage of American adults who smoke over time. Assuming the trend shown in the data has been consistent since 1945, use the graph to estimate the percentage of American adults who smoked in 1945.

So let's see what's going on here. The horizontal axis here, they say years since 1965. So at this point right over here, this is 0 years since 1965. So this really represents 1965. And we see it looks like around-- let's see, if I were to eyeball it, it looks like it's around 42% of Americans, just looking at this graph. I know that's not an exact number. Roughly 41% or 42% of Americans smoked in 1965 based on this graph. And then five years later, this would be 1970. 10 years later, that would be 1975. And they don't sample the data, or we don't have data from every given year. This is just from some of the years that we happen to have.

But what is clear, it looks like we have a negative linear relationship right over here, that it would not be difficult to fit a line. So let me try to do that. So I'm just going to eyeball it and try to fit a line to this data. So our line might look something like that. So it looks like a pretty strong negative linear relationship. When I say it's a negative linear relationship, we see that as time increases, the percentage of smokers in the US is decreasing. So that's what makes it a negative relationship.

Now, what are they asking? They want to estimate the percentage of American adults who smoked in 1945. Well, 1945 would be to the left of 0. So we could even think of it as if 1945 is 20 years before 1965. So let me see if I can draw that. So 20 years before 1965. Let's see, this would be 5 years before 1965, 10 years, 15 years, 20 years before 1965. So I could even put that as negative 20 right over here. Negative 20 years since 1965 you could view as 20 years before 1965. So that would represent 1945 right over there.

And one thing that we could do is very roughly just try to extend this negative linear relationship backwards. And they allow us to do that by saying assuming the trend shown on the data has been consistent. So the trend has been consistent. This line represents the trend. So let's just keep going backwards, keep going backwards at the same rate, so something like that. I want to make sure that it looks like it's the same rate right over here. And you could just try to eyeball it. You could say, well, let's see, 20 years ago, 1945. If I were to extend that line backwards, it looks like there were about 52% of the population was smoking. It seems like we're about 52% right over here.

Another way to think about it would be to actually try to calculate the rate of decline. And let's say we do it over every 20 years, because that will be useful because we're going 20 years back. So if we go 20 years from this point, so this is 1965, you go 20 years in the future. So that is 10 years, and then that is 20 years. So my change in the horizontal is 20 years. What's the change in the vertical? Well, it looks like we have a decrease of a little bit more than 10%. It looks like it's 11% or 12% decrease. So I'll just say minus 11% right there. And let's see if that's consistent. If we were to go another 20 years. So if we go another 20 years, it looks like once again we've gone down by about 10%. So that looks like roughly 10%. If we're following the line, it should actually be the same number. So let me write it this way. It's approximately down 10%. So that little squiggly line, I'm just saying approximately negative 10% every 20 years. So if you go back 20 years, you should increase your percentage by 20%. So this should go up by-- or you should increase your percent by 10%, I should say. So if we started at 41% or 42%, once again, this was what we saw when we just eyeballed it, you should get to 51% or 52%.

So my estimate of the percentage of American adults who smoked in 1945 would be 51% or 52%.
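
The rate-of-decline reasoning in the transcript can be written out as a short calculation. This is only a sketch using the approximate values read off the graph in the video (about 42% in 1965, about 10 percentage points down every 20 years); the names `percent_1965`, `change_per_20_years`, and `estimate_percent` are invented for the example.

```python
# Approximate values eyeballed in the video, not exact data.
percent_1965 = 42.0          # roughly 41% or 42% at 0 years since 1965
change_per_20_years = -10.0  # about 10 percentage points down every 20 years

# Rate of change (slope) in percentage points per year.
slope = change_per_20_years / 20


def estimate_percent(year):
    """Estimate the percentage of adult smokers in `year` from the trend line."""
    years_since_1965 = year - 1965
    return percent_1965 + slope * years_since_1965


# Going back 20 years (x = -20) adds about 10 points to the 1965 value.
print(estimate_percent(1945))
```

Starting from 41 instead of 42 gives the other end of the 51% or 52% range quoted above.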