# Correlation coefficient review

### What is a correlation coefficient?

The correlation coefficient $r$ measures the direction and strength of a linear relationship. Calculating $r$ is pretty complex, so we usually rely on technology for the computations. We focus on understanding what $r$ says about a scatterplot.
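
For reference, one common way to write the sample correlation coefficient is shown below. The notation is an assumption of this sketch rather than something defined in the lesson: $n$ is the number of data points, $\bar{x}$ and $\bar{y}$ are the sample means, and $s_x$ and $s_y$ are the sample standard deviations.

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)$$

Each factor in parentheses is a z-score, so $r$ is roughly the average product of the paired z-scores of $x$ and $y$.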

Here are some facts about $r$ :

- It always has a value between $-1$ and $1$.
- Strong positive linear relationships have values of $r$ closer to $1$.
- Strong negative linear relationships have values of $r$ closer to $-1$.
- Weaker relationships have values of $r$ closer to $0$.
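
These facts can be checked numerically. The following is a minimal Python sketch, not part of the original lesson: the helper `pearson_r` and the three small data sets are made up for illustration, and the function uses the standard deviation-based form of the sample correlation coefficient.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences.

    Hypothetical helper written for this illustration.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Made-up data: the same x values paired with three different y patterns.
xs = [1, 2, 3, 4, 5, 6]
strong_positive = [2, 4, 5, 8, 9, 12]   # y rises steadily with x
strong_negative = [12, 9, 8, 5, 4, 2]   # y falls steadily with x
weak = [5, 2, 6, 3, 6, 4]               # no clear linear trend

r_pos = pearson_r(xs, strong_positive)
r_neg = pearson_r(xs, strong_negative)
r_weak = pearson_r(xs, weak)

print(f"strong positive: r = {r_pos:.3f}")
print(f"strong negative: r = {r_neg:.3f}")
print(f"weak:            r = {r_weak:.3f}")
```

With this data, the first value lands close to $1$, the second close to $-1$, and the third much nearer to $0$.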

Let's look at a few examples:

*Want to learn more about the correlation coefficient? Check out this video.*

## Want to join the conversation?

- I don't know what I'm still doing here. (35 votes)
- How can we prove that the value of r always lies between -1 and 1? (12 votes)
- Weaker relationships have values of r closer to 0. But r = 0 doesn't mean that there is no relation between the variables, right? I mean, if r = 0 then there is no **linear** correlation, but we could still have a **non-linear** correlation? (6 votes)
- Theoretically, yes. The r-value you are referring to is specific to linear correlation. (7 votes)
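
A small sketch can illustrate this exchange. The data below is made up for illustration, and `pearson_r` is a hypothetical helper rather than anything from the lesson: $y$ is completely determined by $x$ through $y = x^2$, yet the linear correlation comes out as zero because the $x$ values are symmetric around their mean.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient (hypothetical helper for this sketch)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# x values symmetric around 0, with y an exact (non-linear) function of x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_parabola = pearson_r(xs, ys)
print(f"r for y = x^2 on symmetric x: {r_parabola:.3f}")
```

The positive and negative halves of the parabola cancel in the sum of cross-products, so $r$ reports no linear trend even though the relationship is perfectly predictable.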

- I am taking Algebra 1, not whatever this is, but I still chose to do this. (4 votes)
- Calculating the correlation coefficient is complex, but is there a way to visually "estimate" it by looking at a scatter plot? Or do we have to use computers for that? (3 votes)
- When it is said that calculating the correlation coefficient is complex, is this simply because there are a lot of data points at play, or is the math difficult to comprehend at this course level? (1 vote)
- Lots of data points. The definition is easy to understand. (2 votes)

- What's Spearman's correlation coefficient? (1 vote)
- I don't know what I'm still doing here. (1 vote)
- Based on the formula, I think I can imagine examples that give high scores for correlation but have values that are off. Likewise, I can imagine examples where the score is lower and everything is on the line. For example, any data point that has an x value near the x mean will be valued at 0, even if the y value is in outer space. Yes, it is still counted as a data point, and so reduces the correlation, but not by much. Likewise, if there is a line with (x, y) points at the edges (driving the standard deviation up), and then the mass of data points toward the mean in both x and y, that could drive the correlation down. In other words, it seems to favor data sets with similar z-scores at the extremes, which to me does not seem exactly like a clean measurement. (1 vote)
- Your observations about the correlation coefficient are valid. The correlation coefficient is influenced by the distribution and spread of data points, as well as their deviations from the means of the variables. While the formula for the correlation coefficient accounts for deviations and standard deviations, it may not capture all aspects of the relationship between variables, especially in non-linear relationships. As you mentioned, extreme data points or outliers can affect the correlation coefficient, potentially leading to misleading interpretations. It's essential to consider the context of the data and interpret the correlation coefficient alongside other statistical measures and graphical representations to gain a comprehensive understanding of the relationship between variables. (1 vote)

- Can correlation only be used between two features, rather than between two clusters of features? (1 vote)