AP.STATS: VAR‑7 (EU), VAR‑7.D (LO), VAR‑7.D.1 (EK)

Sunil and his friends have been using a group messaging app for over a year to chat with each other. He suspects that, on average, they send each other more than 100 messages per day. Sunil takes a random sample of seven days from their chat history and records how many messages were sent on those days. The sample data are strongly skewed to the right, with a mean of 125 messages and a standard deviation of 44 messages. He wants to use these sample data to conduct a t-test about the mean. Which conditions for performing this type of significance test have been met?

So let's just think about what's going on here. Sunil might have some type of a null hypothesis. Maybe he got this hundred because he read a magazine article that says that, on average, the average teenager sends a hundred text messages per day. And so maybe the null hypothesis is that the mean number of messages per day that he and his friends send, which is signified by mu, is 100: that they're no different than all other teenagers. And they actually say it right over here: his alternative hypothesis would be what he suspects, that they send more than 100 messages per day.

And so what he does is he takes a sample from the population of days, and there are over 365 of them, because they say they've been using the group messaging app for over a year. He takes seven of those days, so n is equal to seven, and from that he calculates sample statistics. He calculates the sample mean, which is trying to estimate the true population mean, and he is also able to calculate a sample standard deviation.

What you do in a significance test is you say, well, what is the probability of getting this sample mean, or something even more extreme, assuming the null hypothesis? If that probability is below a preset threshold, then you would reject the null hypothesis, and it would suggest the alternative. But in order to feel good about that significance test, and to even be able to calculate that p-value with confidence, there are conditions for performing this type of significance test.

The first is that this is truly a random sample, and that's known as the random condition. You have seen this before, when we did significance tests with proportions. Here we're doing it with means, population mean and sample mean; in the past we did it with population proportion and sample proportion. Well, for the random condition, it says it right here: Sunil takes a random sample of seven days from their chat history. They don't say how he did it, but we'll just take their word for it that it was a random sample.

The next condition is sometimes known as the independence condition, and that's that the individual observations in our sample are roughly independent. One way that they would be independent for sure is if Sunil is sampling with replacement. They don't say that. Another way you could feel that it's roughly independent is if your sample size is less than or equal to 10% of the population. Now, in this situation he took a sample size of 7, and for the population of days, they say that they've been using the group messaging app for over a year, so they've been using it for over 365 days. So 7 is for sure less than or equal to 10% of 365, which would be 36.5. So we meet this condition, which allows us to meet the independence condition.

Now, the last condition is often known as the normal condition, and this is to feel good that the sampling distribution of the sample means is approximately normal. This is going to be a little bit different than what we saw with significance tests when we dealt with proportions. There are a few ways to feel good that the sampling distribution of the sample means is normal.

One is if the underlying parent population is normal. Now, they don't tell us anything saying that there's actually a normal distribution for the number of messages that they send on a given day, so we don't know this one for sure. But sometimes you might.

Another way is to feel good that our sample size is greater than or equal to 30. This comes from the central limit theorem: then our sampling distribution is going to be roughly normal. But we see very clearly that our sample size is not greater than or equal to 30, so we don't meet that constraint either.

Now, the third way that we could feel good that the sampling distribution of our sample mean is roughly normal is if our sample is symmetric and there are no outliers, or maybe even you'd say no significant outliers. Now, is this the case? Well, it says right over here that the sample data are strongly skewed to the right, with a mean of 125 messages and a standard deviation of 44 messages. So with data this strongly skewed to the right, it's clearly not symmetric sample data. And so we don't meet any of these sub-conditions for the normal condition, and so we do not meet the normal condition for our significance test.
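
The three condition checks walked through above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original problem: the function name `check_t_test_conditions` and its parameters are made up here, and the population size of 365 is an assumption standing in for "over a year" of days.

```python
def check_t_test_conditions(
    n,
    population_size,
    is_random_sample,
    sampled_with_replacement=False,
    parent_population_normal=False,
    sample_symmetric_no_outliers=False,
):
    """Report which t-test conditions hold, using the rules of thumb above.

    Hypothetical helper for illustration; the names are not from the source.
    """
    # Random condition: the data must come from a random sample.
    random_ok = is_random_sample

    # Independence condition: sampling with replacement,
    # or a sample size of at most 10% of the population.
    independence_ok = sampled_with_replacement or n <= 0.10 * population_size

    # Normal condition: any one of the three sub-conditions is enough.
    normal_ok = (
        parent_population_normal
        or n >= 30
        or sample_symmetric_no_outliers
    )

    return {
        "random": random_ok,
        "independence": independence_ok,
        "normal": normal_ok,
    }


conditions = check_t_test_conditions(
    n=7,
    population_size=365,                 # assumption: "over a year" means at least 365 days
    is_random_sample=True,               # taken at the problem's word
    sampled_with_replacement=False,      # not stated in the problem
    parent_population_normal=False,      # not stated, so not assumed
    sample_symmetric_no_outliers=False,  # the data are strongly skewed right
)
print(conditions)
# {'random': True, 'independence': True, 'normal': False}
```

With Sunil's numbers, the random and independence conditions pass and the normal condition fails, which matches the conclusion reached above.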