Two-way tables review
Two-way tables organize data based on two categorical variables.
Two-way frequency tables
Two-way frequency tables show how many data points fit in each category.
Here's an example:
The columns of the table tell us whether the student is a male or a female. The rows of the table tell us whether the student prefers dogs, cats, or doesn't have a preference.
Each cell tells us the number (or frequency) of students in one combination of categories. For example, the cell in the male column and the prefers dogs row tells us how many males preferred dogs in this dataset.
Notice that there are two variables, gender and preference. This is where the "two" in two-way frequency table comes from.
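If it helps to see the counting done step by step, here is a minimal Python sketch that tallies a two-way frequency table from raw survey answers. The responses list is made up for illustration (it is not the data from the example table), and the names responses, genders, preferences, and table are our own choices.

```python
from collections import Counter

# Hypothetical survey responses: one (gender, preference) pair per student.
responses = [
    ("male", "dogs"), ("female", "cats"), ("male", "dogs"),
    ("female", "dogs"), ("female", "no preference"), ("male", "cats"),
    ("female", "cats"), ("male", "dogs"),
]

# Count how many students fall into each (gender, preference) combination.
counts = Counter(responses)

genders = ["male", "female"]                     # columns of the table
preferences = ["dogs", "cats", "no preference"]  # rows of the table

# Arrange the counts as {preference: {gender: frequency}}.
# A Counter returns 0 for combinations that never occurred.
table = {
    pref: {g: counts[(g, pref)] for g in genders}
    for pref in preferences
}

for pref in preferences:
    print(pref, table[pref])
```

Each printed row corresponds to one row of the two-way table, and each number inside it is one cell.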
Want a review of making two-way frequency tables? Check out this video.
Want to practice making frequency tables? Check out this exercise.
Want to practice reading frequency tables? Check out this exercise.
Two-way relative frequency tables
Two-way relative frequency tables show what percent of data points fit in each category. We can use row relative frequencies or column relative frequencies. Which one we choose depends on the context of the problem.
For example, here's how we would make column relative frequencies:
Step 1: Find the totals for each column.
Step 2: Divide each cell count by its column total and convert to a percentage.
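The two steps above can be sketched in Python as follows. The counts are assumed for illustration: they were chosen so that the dog percentages come out to the 78% and 41% mentioned in this article, but they are not guaranteed to match the original table. The function name column_relative_frequencies is also our own.

```python
# Illustrative counts (an assumption), stored as {row: {column: count}}.
table = {
    "prefers dogs":  {"male": 36, "female": 22},
    "prefers cats":  {"male": 8,  "female": 26},
    "no preference": {"male": 2,  "female": 6},
}


def column_relative_frequencies(table):
    """Return each cell as a whole-number percent of its column total."""
    columns = list(next(iter(table.values())))

    # Step 1: find the total for each column.
    totals = {c: sum(row[c] for row in table.values()) for c in columns}

    # Step 2: divide each cell by its column total and convert to a percent.
    return {
        label: {c: round(100 * row[c] / totals[c]) for c in columns}
        for label, row in table.items()
    }


percents = column_relative_frequencies(table)
for label, row in percents.items():
    print(label, row)
# prefers dogs {'male': 78, 'female': 41}
# prefers cats {'male': 17, 'female': 48}
# no preference {'male': 4, 'female': 11}
```

This sketch assumes every column has at least one student. A column total of zero would cause a division error.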
Notice that sometimes your percentages won't add up to 100% even though we rounded properly. This is called round-off error, and we don't worry about it too much.
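Here is a small Python check of round-off error, using one hypothetical column of counts (36, 8, and 2, which are assumed values). The exact percentages add up to 100, but the rounded ones do not.

```python
# One hypothetical column of counts (assumed values for illustration).
counts = [36, 8, 2]
total = sum(counts)  # 46

# Exact percentages: about 78.26, 17.39, and 4.35.
exact = [100 * c / total for c in counts]

# Each value is rounded correctly to the nearest whole percent.
rounded = [round(p) for p in exact]

print(rounded)       # [78, 17, 4]
print(sum(rounded))  # 99
```

Every value lost a little when it was rounded down (0.26, 0.39, and 0.35), and those small losses add up to the missing 1%.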
Two-way relative frequency tables are useful when there are different sample sizes in a dataset. In this example, more females were surveyed than males, so using percentages makes it easier to compare the preferences of males and females. From the relative frequencies, we can see that a large majority of males preferred dogs (78%) compared to a minority of females (41%).
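Row relative frequencies answer a different question: of the students with a given preference, what percent are male and what percent are female? Here is a minimal Python sketch, again using assumed counts (not necessarily the article's table) and a function name of our own.

```python
# Illustrative counts (an assumption), stored as {row: {column: count}}.
table = {
    "prefers dogs":  {"male": 36, "female": 22},
    "prefers cats":  {"male": 8,  "female": 26},
    "no preference": {"male": 2,  "female": 6},
}


def row_relative_frequencies(table):
    """Return each cell as a whole-number percent of its row total."""
    result = {}
    for label, row in table.items():
        row_total = sum(row.values())
        result[label] = {c: round(100 * n / row_total) for c, n in row.items()}
    return result


row_percents = row_relative_frequencies(table)
for label, row in row_percents.items():
    print(label, row)
# prefers dogs {'male': 62, 'female': 38}
# prefers cats {'male': 24, 'female': 76}
# no preference {'male': 25, 'female': 75}
```

With these assumed counts, 62% of the students who prefer dogs are male, while 78% of males prefer dogs. The two numbers differ because they are divided by different totals, so the wording of the question tells us which table to use.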
Want a review of making two-way relative frequency tables? Check out this video.
Want to practice making relative frequency tables? Check out this exercise.
Want to practice reading relative frequency tables? Check out this exercise.
Want to join the conversation?
- Why would someone have the columns add up to 100% instead of having the rows add up to 100%?(4 votes)
- It depends on what you would like to compare. In the example above, if you want to know "Of dog lovers, what proportion are male?" Adding the rows up to 100 would be appropriate. If you wanted to know "Of males, what proportion are dog lovers?" adding the columns up to 100 would be more appropriate.(29 votes)
- Even though I am an experienced engineer, I had to spend some time (more than 4 attempts to get a 3/4 score) on the last "Trends in categorical data" practice test. I had to learn the tendencies by trial and error: "Is it probable? Is it more probable?"
Also, knowing when to use row or column percentages depended a bit on the wording itself: "dog lovers among men" or "men among dog lovers".
It would be better to give extra information about these during the course to help newcomers learn.(2 votes)
- I am by no means close to any of you, but yes, I agree that the language at times is either confusing or too concise, or perhaps it is just my inexperience with the material.(7 votes)
- “From the relative frequencies, we can see that a large majority of males preferred dogs (78%) compared to a minority of females (41%)”
I still don't understand what can and cannot be compared. Since this is a column relative frequency table, shouldn't you only be allowed to compare data points that are in the same COLUMN? How can you compare between the two genders here, as the statement quoted above does?
Shouldn’t you only be able to say that males are more likely to prefer dogs over cats and that females are more likely to prefer cats over dogs?(4 votes)
- I am trying to analyze a two-way table that involves data in the form of scores out of 10 rather than frequencies. How could I analyze this with conditional or marginal distributions?(2 votes)
- I'm confused. One question asked, "What's the probability of something in 12th or 9th grade?" Why did we have to add 12th and 9th grade together to get the probability? Shouldn't the answer be the probability of either 12th or 9th grade? OR is a disjunction, not a conjunction, so if one statement is true, the other statement doesn't matter and the whole proposition is true, isn't it?(1 vote)
- Why did we divide by the row totals instead of the column totals in the previous practice section? Here it says to divide by the column totals to get the relative frequency table(1 vote)
- The article says: "Sometimes your percentages won't add up to 100% even though we rounded properly." Why? Could you give an example?(1 vote)