null hypothesis two categorical variables


8.1 - The Chi-Square Test of Independence

How do we test the independence of two categorical variables? We do so using the Chi-Square Test of Independence.

As with all prior statistical tests we need to define null and alternative hypotheses. Also, as we have learned, the null hypothesis is what is assumed to be true until we have evidence to go against it. In this lesson, we are interested in researching if two categorical variables are related or associated (i.e., dependent). Therefore, until we have evidence to suggest that they are, we must assume that they are not. This is the motivation behind the hypothesis for the Chi-Square Test of Independence:

  • \(H_0\): In the population, the two categorical variables are independent.
  • \(H_a\): In the population, the two categorical variables are dependent.

Note! There are several ways to phrase these hypotheses. Instead of using the words "independent" and "dependent" one could say "there is no relationship between the two categorical variables" versus "there is a relationship between the two categorical variables." Or "there is no association between the two categorical variables" versus "there is an association between the two variables." The important part is that the null hypothesis refers to the two categorical variables not being related while the alternative is trying to show that they are related.

Once we have gathered our data, we summarize the data in the two-way contingency table. This table represents the observed counts and is called the Observed Counts Table or simply the Observed Table. The contingency table on the introduction page to this lesson represented the observed counts of the party affiliation and opinion for those surveyed.

The question becomes, "How would this table look if the two variables were not related?" That is, under the null hypothesis that the two variables are independent, what would we expect our data to look like?

Consider the following table:

The total count is \(A+B+C+D\). Let's focus on one cell, say Group 1 and Success with observed count A. If we go back to our probability lesson, let \(G_1\) denote the event 'Group 1' and \(S\) denote the event 'Success.' Then,

\(P(G_1)=\dfrac{A+B}{A+B+C+D}\) and \(P(S)=\dfrac{A+C}{A+B+C+D}\).

Recall that if two events are independent, then their intersection is the product of their respective probabilities. In other words, if \(G_1\) and \(S\) are independent, then...

\begin{align} P(G_1\cap S)&=P(G_1)P(S)\\&=\left(\dfrac{A+B}{A+B+C+D}\right)\left(\dfrac{A+C}{A+B+C+D}\right)\\[10pt] &=\dfrac{(A+B)(A+C)}{(A+B+C+D)^2}\end{align}

To move from probabilities to counts, we multiply the probability by the total count. In other words...

\begin{align} \text{Expected count for cell with A} &=P(G_1)P(S)\times(\text{total count}) \\   &= \left(\dfrac{(A+B)(A+C)}{(A+B+C+D)^2}\right)(A+B+C+D)\\[10pt]&=\mathbf{\dfrac{(A+B)(A+C)}{A+B+C+D}} \end{align}

This is the count we would expect to see if the two variables were independent (i.e. assuming the null hypothesis is true).

The expected count for each cell under the null hypothesis is:

\(E=\dfrac{\text{(row total)}(\text{column total})}{\text{total sample size}}\)
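As a quick illustration, this formula can be applied to every cell at once with NumPy. The sketch below uses the observed counts from the Party Affiliation and Opinion example discussed in this lesson:

```python
import numpy as np

# Observed counts from the Party Affiliation and Opinion example
# (rows: Democrat, Republican; columns: Favor, Indifferent, Opposed).
observed = np.array([[138, 83, 64],
                     [64, 67, 84]])

row_totals = observed.sum(axis=1, keepdims=True)  # 285, 215
col_totals = observed.sum(axis=0, keepdims=True)  # 202, 150, 148
n = observed.sum()                                # 500

# E = (row total)(column total) / (total sample size), for every cell at once
expected = row_totals * col_totals / n
print(expected.round(2))
# [[115.14  85.5   84.36]
#  [ 86.86  64.5   63.64]]
```

Broadcasting the row-total column against the column-total row produces the whole expected table in one step, which matches the cell-by-cell formula above.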

Example 8-1: Political Affiliation and Opinion

To demonstrate, we will use the Party Affiliation and Opinion on Tax Reform example.

Observed Table:

Find the expected counts for all of the cells.

We need to find what is called the Expected Counts Table or simply the Expected Table. This table displays what the counts would be for our sample data if there were no association between the variables.

Calculating Expected Counts from Observed Counts

Chi-Square Test Statistic

To better understand what these expected counts represent, first recall that the expected counts table is designed to reflect what the sample data counts would be if the two variables were independent. Taking what we know of independent events, we would be saying that the sample counts should show similarity in opinions of tax reform between democrats and republicans. If you find the proportion of each cell by taking a cell's expected count divided by its row total, you will discover that in the expected table each opinion proportion is the same for democrats and republicans. That is, from the expected counts, 0.404 of the democrats and 0.404 of the republicans favor the bill; 0.3 of the democrats and 0.3 of the republicans are indifferent; and 0.296 of the democrats and 0.296 of the republicans are opposed.

The statistical question becomes, "Are the observed counts so different from the expected counts that we can conclude a relationship exists between the two variables?" To conduct this test we compute a Chi-Square test statistic where we compare each cell's observed count to its respective expected count.

In a summary table, we have \(r\times c=rc\) cells. Let \(O_1, O_2, …, O_{rc}\) denote the observed counts for each cell and \(E_1, E_2, …, E_{rc}\) denote the respective expected counts for each cell.

The Chi-Square test statistic is calculated as follows:

\(\chi^{2*}=\frac{(O_1-E_1)^2}{E_1}+\frac{(O_2-E_2)^2}{E_2}+...+\frac{(O_{rc}-E_{rc})^2}{E_{rc}}=\overset{rc}{ \underset{i=1}{\sum}}\frac{(O_i-E_i)^2}{E_i}\)

Under the null hypothesis and certain conditions (discussed below), the test statistic follows a Chi-Square distribution with degrees of freedom equal to \((r-1)(c-1)\), where \(r\) is the number of rows and \(c\) is the number of columns. We omit the mathematical details of why this test statistic is used and why it follows a Chi-Square distribution.
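The test statistic can be sketched directly in NumPy. This is a minimal illustration, using the observed counts from Example 8-1 and building the expected counts from the marginal totals as derived earlier:

```python
import numpy as np

# Observed counts from Example 8-1
# (rows: Democrat, Republican; columns: Favor, Indifferent, Opposed).
observed = np.array([[138, 83, 64],
                     [64, 67, 84]], dtype=float)

# Expected counts under independence: (row total)(column total) / n
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# chi-square* = sum over all cells of (O - E)^2 / E
chi_sq = ((observed - expected) ** 2 / expected).sum()
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
print(round(chi_sq, 3), df)  # 22.152 2
```

The single summation over all \(rc\) cells mirrors the formula above; no loop over rows and columns is needed because NumPy operates elementwise.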

As we have done with other statistical tests, we make our decision by either comparing the value of the test statistic to a critical value (rejection region approach) or by finding the probability of getting this test statistic value or one more extreme (p-value approach).

The critical value for our Chi-Square test is \(\chi^2_{\alpha}\) with degrees of freedom \((r - 1)(c - 1)\), while the p-value is found by \(P(\chi^2>\chi^{2*})\) with degrees of freedom \((r - 1)(c - 1)\).

Example 8-1 Cont'd: Chi-Square

Let's apply the Chi-Square Test of Independence to our example where we have a random sample of 500 U.S. adults who are questioned regarding their political affiliation and opinion on a tax reform bill. We will test if the political affiliation and their opinion on a tax reform bill are dependent at a 5% level of significance. Calculate the test statistic.

  • Using Minitab

The contingency table (political_affiliation.csv) is given below. Each cell contains the observed count and the expected count in parentheses. For example, there were 138 democrats who favored the tax bill. The expected count under the null hypothesis is 115.14. Therefore, the cell is displayed as 138 (115.14).

Calculating the test statistic by hand:

\begin{multline} \chi^{2*}=\dfrac{(138−115.14)^2}{115.14}+\dfrac{(83−85.50)^2}{85.50}+\dfrac{(64−84.36)^2}{84.36}+\\ \dfrac{(64−86.86)^2}{86.86}+\dfrac{(67−64.50)^2}{64.50}+\dfrac{(84−63.64)^2}{63.64}=22.152\end{multline}

...with degrees of freedom equal to \((2 - 1)(3 - 1) = 2\).

  Minitab: Chi-Square Test of Independence

To perform the Chi-Square test in Minitab...

  • Choose Stat  >  Tables  >  Chi-Square Test for Association
  • From the drop-down box, choose 'Summarized data in a two-way table' if you have summarized data (i.e., observed counts), and enter the columns that contain the observed counts; otherwise, if you have the raw data, choose 'Raw data (categorical variables).' Note that if using the raw data, your data will need to consist of two columns: one with the explanatory variable data (goes in the 'Rows' field) and one with the response variable data (goes in the 'Columns' field).
  • Labeling (Optional) When using the summarized data you can label the rows and columns if you have the variable labels in columns of the worksheet. For example, if we have a column with the two political party affiliations and a column with the three opinion choices we could use these columns to label the output.
  • Click the Statistics tab. Keep the four boxes that are already checked, and also check the box for 'Each cell's contribution to the chi-square.' Click OK.

Note! If you have the observed counts in a table, you can copy/paste them into Minitab. For instance, you can copy the entire observed counts table (excluding the totals!) for our example and paste these into Minitab starting with the first empty cell of a column.

The following is the Minitab output for this example.

Cell Contents: Count, Expected count, Contribution to Chi-Square

Pearson Chi-Square = 4.539 + 0.073 + 4.914 + 6.016 + 0.097 + 6.514 = 22.152, DF = 2, P-Value = 0.000


The Chi-Square test statistic is 22.152, calculated by summing all of the individual cells' Chi-Square contributions:

\(4.539 + 0.073 + 4.914 + 6.016 + 0.097 + 6.514 = 22.152\)

The p-value is found by \(P(\chi^2>22.152)\) with degrees of freedom \((2-1)(3-1) = 2\).

Minitab calculates this p-value to be less than 0.001 and reports it as 0.000. Given this p-value of 0.000 is less than the alpha of 0.05, we reject the null hypothesis that political affiliation and their opinion on a tax reform bill are independent. We conclude that there is evidence that the two variables are dependent (i.e., that there is an association between the two variables).
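As a cross-check, SciPy's `chi2_contingency` reproduces the statistic, degrees of freedom, expected counts, and p-value in one call (a sketch, assuming SciPy is available):

```python
from scipy.stats import chi2_contingency

# Observed counts from Example 8-1
observed = [[138, 83, 64],
            [64, 67, 84]]

stat, p, df, expected = chi2_contingency(observed)
print(round(stat, 3), df)  # 22.152 2
print(p < 0.001)           # True -- Minitab reports this as 0.000
```

The exact p-value is on the order of 10^-5, which is why Minitab displays it as 0.000.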

Conditions for Using the Chi-Square Test

Exercise caution when there are small expected counts. Minitab will give a count of the number of cells that have expected frequencies less than five. Some statisticians hesitate to use the Chi-Square test if more than 20% of the cells have expected frequencies below five, especially if the p-value is small and these cells give a large contribution to the total Chi-Square value.
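This rule of thumb is easy to automate. Below is a small sketch (the helper name `small_expected_fraction` is ours, not a standard function) that reports the fraction of cells with expected counts below five:

```python
import numpy as np

def small_expected_fraction(observed):
    """Fraction of cells whose expected count falls below 5."""
    observed = np.asarray(observed, dtype=float)
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    return (expected < 5).mean()

# For Example 8-1, no cell has an expected count below 5,
# so the rule of thumb raises no concern.
frac = small_expected_fraction([[138, 83, 64], [64, 67, 84]])
print(f"{frac:.0%} of cells below 5")  # 0% of cells below 5
```

If the returned fraction exceeded 0.2, the guidance above would suggest interpreting the Chi-Square result with caution.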

Example 8-2: Tire Quality

The operations manager of a company that manufactures tires wants to determine whether there are any differences in the quality of work among the three daily shifts. She randomly selects 496 tires and carefully inspects them. Each tire is either classified as perfect, satisfactory, or defective, and the shift that produced it is also recorded. The two categorical variables of interest are the shift and condition of the tire produced. The data (shift_quality.txt) can be summarized by the accompanying two-way table. Does the data provide sufficient evidence at the 5% significance level to infer that there are differences in quality among the three shifts?

Chi-Square Test

Chi-Sq = 8.647, DF = 4, P-Value = 0.071

Note that there are 3 cells with expected counts less than 5.0.

In the above example, we do not have a significant result at the 5% significance level, since the p-value (0.071) is greater than 0.05. Even if we did have a significant result, we still could not trust it, because 3 cells (33.3%) have expected counts below 5.0.

Sometimes researchers will categorize quantitative data (e.g., take height measurements and categorize as 'below average,' 'average,' and 'above average.') Doing so results in a loss of information - one cannot do the reverse of taking the categories and reproducing the raw quantitative measurements. Instead of categorizing, the data should be analyzed using quantitative methods.

Try it!

A food services manager for a baseball park wants to know if there is a relationship between gender (male or female) and the preferred condiment on a hot dog. The following table summarizes the results. Test the hypothesis with a significance level of 10%.

The hypotheses are:

  • \(H_0\): Gender and condiments are independent
  • \(H_a\): Gender and condiments are not independent

We need the expected counts table:

None of the expected counts in the table are less than 5. Therefore, we can proceed with the Chi-Square test.

The test statistic is:

\(\chi^{2*}=\frac{(15-19.2)^2}{19.2}+\frac{(23-20.16)^2}{20.16}+...+\frac{(8-9.36)^2}{9.36}=2.95\)

The p-value is found by \(P(\chi^2>\chi^{2*})=P(\chi^2>2.95)\) with (3-1)(2-1)=2 degrees of freedom. Using a table or software, we find the p-value to be 0.2288.

With a p-value greater than the 0.10 significance level, we conclude that there is not enough evidence in the data to suggest that gender and preferred condiment are related.
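The reported p-value can be checked against the chi-square survival function in SciPy (assuming SciPy is available):

```python
from scipy.stats import chi2

# p-value for the Try it! exercise:
# P(chi-square > 2.95) with (3-1)(2-1) = 2 degrees of freedom
p_value = chi2.sf(2.95, df=2)
print(round(p_value, 4))  # 0.2288
```

`chi2.sf(x, df)` returns the right-tail probability \(P(\chi^2 > x)\), which is exactly the p-value defined in this lesson.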

Statology

Statistics Made Easy

Chi-Square Test of Independence: Definition, Formula, and Example

A Chi-Square Test of Independence  is used to determine whether or not there is a significant association between two categorical variables.

This tutorial explains the following:

  • The motivation for performing a Chi-Square Test of Independence.
  • The formula to perform a Chi-Square Test of Independence.
  • An example of how to perform a Chi-Square Test of Independence.

Chi-Square Test of Independence: Motivation

A Chi-Square test of independence can be used to determine if there is an association between two categorical variables in many different settings. Here are a few examples:

  • We want to know if gender is associated with political party preference so we survey 500 voters and record their gender and political party preference.
  • We want to know if a person’s favorite color is associated with their favorite sport so we survey 100 people and ask them about their preferences for both.
  • We want to know if education level and marital status are associated so we collect data about these two variables on a simple random sample of 50 people.

In each of these scenarios we want to know if two categorical variables are associated with each other. In each scenario, we can use a Chi-Square test of independence to determine if there is a statistically significant association between the variables. 

Chi-Square Test of Independence: Formula

A Chi-Square test of independence uses the following null and alternative hypotheses:

  • H0 (null hypothesis): The two variables are independent.
  • H1 (alternative hypothesis): The two variables are not independent (i.e., they are associated).

We use the following formula to calculate the Chi-Square test statistic X²:

X² = Σ (O − E)² / E

  • Σ: a fancy symbol that means "sum"
  • O: observed value
  • E: expected value

If the p-value that corresponds to the test statistic X² with (#rows − 1) × (#columns − 1) degrees of freedom is less than your chosen significance level, then you can reject the null hypothesis.

Chi-Square Test of Independence: Example

Suppose we want to know whether or not gender is associated with political party preference. We take a simple random sample of 500 voters and survey them on their political party preference. The following table shows the results of the survey:

Use the following steps to perform a Chi-Square test of independence to determine if gender is associated with political party preference.

Step 1: Define the hypotheses.

We will perform the Chi-Square test of independence using the following hypotheses:

  • H0: Gender and political party preference are independent.
  • H1: Gender and political party preference are not independent.

Step 2: Calculate the expected values.

Next, we will calculate the expected values for each cell in the contingency table using the following formula:

Expected value = (row sum * column sum) / table sum.

For example, the expected value for Male Republicans is: (230*250) / 500 =  115 .

We can repeat this formula to obtain the expected value for each cell in the table:

Step 3: Calculate (O − E)² / E for each cell in the table.

Next we will calculate (O − E)² / E for each cell in the table.

For example, Male Republicans would have a value of: (120 − 115)² / 115 = 0.2174.

We can repeat this formula for each cell in the table:

Step 4: Calculate the test statistic X² and the corresponding p-value.

X² = Σ (O − E)² / E = 0.2174 + 0.2174 + 0.0676 + 0.0676 + 0.1471 + 0.1471 = 0.8642

According to the Chi-Square Score to P Value Calculator, the p-value associated with X² = 0.8642 and (2 − 1) × (3 − 1) = 2 degrees of freedom is 0.649198.
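If you'd rather not rely on an online calculator, the same p-value can be obtained from SciPy's chi-square survival function (assuming SciPy is available):

```python
from scipy.stats import chi2

# p-value for X^2 = 0.8642 with (2-1)(3-1) = 2 degrees of freedom
p_value = chi2.sf(0.8642, df=2)
print(round(p_value, 3))  # 0.649
```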

Step 5: Draw a conclusion.

Since this p-value is not less than 0.05, we fail to reject the null hypothesis. This means we do not have sufficient evidence to say that there is an association between gender and political party preference.

Note:  You can also perform this entire test by simply using the Chi-Square Test of Independence Calculator .

Additional Resources

The following tutorials explain how to perform a Chi-Square test of independence using different statistical programs:

How to Perform a Chi-Square Test of Independence in Stata How to Perform a Chi-Square Test of Independence in Excel How to Perform a Chi-Square Test of Independence in SPSS How to Perform a Chi-Square Test of Independence in Python How to Perform a Chi-Square Test of Independence in R Chi-Square Test of Independence on a TI-84 Calculator Chi-Square Test of Independence Calculator


Hey there. My name is Zach Bobbitt. I have a Master of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.


Chi-Square Independence Test – What and Why?


The chi-square independence test evaluates if two categorical variables are related in some population. Example: a scientist wants to know if education level and marital status are related for all people in some country. He collects data on a simple random sample of n = 300 people, part of which are shown below.

Chi-Square Test - Raw Data View

Chi-Square Test - Observed Frequencies

A good first step for these data is inspecting the contingency table of marital status by education. Such a table -shown below- displays the frequency distribution of marital status for each education category separately. So let's take a look at it.

Chi-Square Test - Contingency Table

The numbers in this table are known as the observed frequencies . They tell us an awful lot about our data. For instance,

  • there are 4 marital status categories and 5 education levels;
  • we succeeded in collecting data on our entire sample of n = 300 respondents (bottom right cell);
  • we've 84 respondents with a Bachelor’s degree (bottom row, middle);
  • we've 30 divorced respondents (last column, middle);
  • we've 9 divorced respondents with a Bachelor’s degree.


Although our contingency table is a great starting point, it doesn't really show us if education level and marital status are related. This question is answered more easily from a slightly different table as shown below.

Chi-Square Test - Column Percentages

This table shows -for each education level separately- the percentages of respondents that fall into each marital status category. Before reading on, take a careful look at this table and ask yourself: is marital status related to education level and -if so- how? If we inspect the first row, we see that 46% of respondents with middle school never married. If we move rightwards (towards higher education levels), we see this percentage decrease: only 18% of respondents with a PhD degree never married (top right cell).

Conversely, note that 64% of PhD respondents are married (second row). If we move towards the lower education levels (leftwards), we see this percentage decrease to 31% for respondents with just middle school. In short, more highly educated respondents marry more often than less educated respondents.

Chi-Square Test - Stacked Bar Chart

Our last table shows a relation between marital status and education. This becomes much clearer by visualizing this table as a stacked bar chart , shown below.

Chi-Square Independence Test - Stacked Bar Chart Showing Dependence

If we move from top to bottom (highest to lowest education) in this chart, we see the dark blue bar (never married) increase. Marital status is clearly associated with education level. The lower someone’s education, the smaller the chance he’s married. That is: education “says something” about marital status (and reversely) in our sample. So what about the population?

Chi-Square Test - Null Hypothesis

The null hypothesis for a chi-square independence test is that two categorical variables are independent in some population. Now, marital status and education are related -thus not independent- in our sample. However, we can't conclude that this holds for our entire population. The basic problem is that samples usually differ from populations.

If marital status and education are perfectly independent in our population, we may still see some relation in our sample by mere chance. However, a strong relation in a large sample is extremely unlikely and hence refutes our null hypothesis. In this case we'll conclude that the variables were not independent in our population after all.

So exactly how strong is this dependence -or association- in our sample? And what's the probability -or p-value - of finding it if the variables are (perfectly) independent in the entire population?

Chi-Square Test - Statistical Independence

Before we continue, let's first make sure we understand what “independence” really means in the first place. In short, independence means that one variable doesn't “say anything” about another variable. A different way of saying the exact same thing is that independence means that the relative frequencies of one variable are identical over all levels of some other variable. Uh... say again? Well, what if we had found the chart below?

Chi-Square Independence Test - Stacked Bar Chart Showing Statistical Independence

What does education “say about” marital status? Absolutely nothing! Why? Because the frequency distributions of marital status are identical over education levels: no matter the education level, the probability of being married is 50% and the probability of never being married is 30%.

In this chart, education and marital status are perfectly independent . The hypothesis of independence tells us which frequencies we should have found in our sample: the expected frequencies.

Expected Frequencies

Expected frequencies are the frequencies we expect in a sample if the null hypothesis holds. If education and marital status are independent in our population, then we expect this in our sample too. This implies the contingency table -holding expected frequencies- shown below.

Chi-Square Test - Expected Frequencies

These expected frequencies are calculated as $$e_{ij} = \frac{o_i \cdot o_j}{N}$$ where

  • \(e_{ij}\) is an expected frequency;
  • \(o_i\) is a marginal column frequency;
  • \(o_j\) is a marginal row frequency;
  • \(N\) is the total sample size.

So for our first cell, that'll be $$e_{ij} = \frac{39 \cdot 90}{300} = 11.7$$ and so on. But let's not bother too much as our software will take care of all this.
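As a quick arithmetic check of that first cell (marginal frequencies taken from the example above):

```python
# Expected frequency for the first cell: marginal frequencies 39 and 90, N = 300
e_first = 39 * 90 / 300
print(e_first)  # 11.7
```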

Note that many expected frequencies are non-integers; for instance, 11.7 respondents with middle school who never married. Although there's no such thing as "11.7 respondents" in the real world, such non-integer frequencies are just fine mathematically. So at this point, we've 2 contingency tables:

  • a contingency table with observed frequencies we found in our sample;
  • a contingency table with expected frequencies we should have found in our sample if the variables are really independent.

The screenshot below shows both tables in this GoogleSheet (read-only). This sheet demonstrates all formulas that are used for this test.

Observed and Expected Frequencies in GoogleSheet

Insofar as the observed and expected frequencies differ, our data deviate more from independence. So how much do they differ? First off, we subtract each expected frequency from its observed frequency, resulting in a residual: $$r_{ij} = o_{ij} - e_{ij}$$ For our example, this results in (5 × 4 =) 20 residuals. Larger (absolute) residuals indicate a larger difference between our data and the null hypothesis. We then basically add up the squared residuals, each divided by its expected frequency, resulting in a single number: the χ² (pronounced "chi-square") test statistic.

The chi-square test statistic is calculated as $$\chi^2 = \sum{\frac{(o_{ij} - e_{ij})^2}{e_{ij}}}$$ so for our data $$\chi^2 = \frac{(18 - 11.7)^2}{11.7} + \frac{(36 - 27)^2}{27} + ... + \frac{(6 - 5.4)^2}{5.4} = 23.57$$ Again, our software will take care of all this. But if you'd like to see the calculations, take a look at this GoogleSheet.

Chi-Square Test Statistic in GoogleSheet

So χ² = 23.57 in our sample. This number summarizes the difference between our data and our independence hypothesis. Is 23.57 a large value? What's the probability of finding this? Well, we can calculate it from its sampling distribution, but this requires a couple of assumptions.

Chi-Square Test Assumptions

The assumptions for a chi-square independence test are

  • independent observations . This usually -not always- holds if each case in SPSS holds a unique person or other statistical unit. Since this is the case for our data, we'll assume this has been met.
  • For a 2 by 2 table , all expected frequencies > 5. However, for a 2 by 2 table, a z-test for 2 independent proportions is preferred over the chi-square test. For a larger table , all expected frequencies > 1 and no more than 20% of all cells may have expected frequencies < 5.

If these assumptions hold, our χ² test statistic follows a χ² distribution. It's this distribution that tells us the probability of finding χ² > 23.57.

Chi-Square Test - Degrees of Freedom

We'll get the p-value we're after from the chi-square distribution if we give it 2 numbers:

  • the χ² value (23.57) and
  • the degrees of freedom (df).

The degrees of freedom is basically a number that determines the exact shape of our distribution. The figure below illustrates this point.

Chi-Square Distributions with Different DF

Right. Now, degrees of freedom -or df- are calculated as $$df = (i - 1) \cdot (j - 1)$$ where

  • \(i\) is the number of rows in our contingency table and
  • \(j\) is the number of columns

so in our example $$df = (5 - 1) \cdot (4 - 1) = 12.$$ And with df = 12, the probability of finding χ² ≥ 23.57 is roughly 0.023. We simply look this up in SPSS or other appropriate software. This is our 1-tailed significance. It basically means there's a 0.023 (or 2.3%) chance of finding this association in our sample if it is zero in our population.
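That 1-tailed significance can be reproduced with SciPy's chi-square survival function (a sketch, assuming SciPy is available):

```python
from scipy.stats import chi2

# Right-tail probability of 23.57 under a chi-square distribution with df = 12
p_value = chi2.sf(23.57, df=12)
print(round(p_value, 3))  # 0.023
```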

Chi-Square Distribution with 1-Tailed P-Value

Since this is a small chance, we no longer believe our null hypothesis of our variables being independent in our population. Conclusion: marital status and education are related in our population. Now, keep in mind that our p-value of 0.023 only tells us that the association between our variables is probably not zero. It doesn't say anything about the strength of this association: the effect size.

For the effect size of a chi-square independence test, consult the appropriate association measure. If at least one nominal variable is involved, that'll usually be Cramér’s V (a sort of Pearson correlation for categorical variables). In our example Cramér’s V = 0.162. Since Cramér’s V takes on values between 0 and 1, 0.162 indicates a very weak association. If both variables had been ordinal, Kendall’s tau or a Spearman correlation would have been suitable as well.

For reporting our results in APA style, we may write something like "An association between education and marital status was observed, χ²(12) = 23.57, p = 0.023."

Chi-Square Independence Test - Software

You can run a chi-square independence test in Excel or Google Sheets, but you probably want to use a more user-friendly package such as SPSS.

The figure below shows the output for our example generated by SPSS.

Chi-Square Test - SPSS Output

For a full tutorial (using a different example), see SPSS Chi-Square Independence Test .

Thanks for reading!


Teach yourself statistics

Chi-Square Test of Independence

This lesson explains how to conduct a chi-square test for independence . The test is applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables.

For example, in an election survey, voters might be classified by gender (male or female) and voting preference (Democrat, Republican, or Independent). We could use a chi-square test for independence to determine whether gender is related to voting preference. The sample problem at the end of the lesson considers this example.

When to Use Chi-Square Test for Independence

The test procedure described in this lesson is appropriate when the following conditions are met:

  • The sampling method is simple random sampling .
  • The variables under study are each categorical .
  • If sample data are displayed in a contingency table , the expected frequency count for each cell of the table is at least 5.

This approach consists of four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the Hypotheses

Suppose that Variable A has r levels, and Variable B has c levels. The null hypothesis states that knowing the level of Variable A does not help you predict the level of Variable B. That is, the variables are independent.

H o : Variable A and Variable B are independent.

H a : Variable A and Variable B are not independent.

The alternative hypothesis is that knowing the level of Variable A can help you predict the level of Variable B.

Note: Support for the alternative hypothesis suggests that the variables are related; but the relationship is not necessarily causal, in the sense that one variable "causes" the other.

Formulate an Analysis Plan

The analysis plan describes how to use sample data to decide whether to reject the null hypothesis. The plan should specify the following elements.

  • Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.
  • Test method. Use the chi-square test for independence to determine whether there is a significant relationship between two categorical variables.

Analyze Sample Data

Using sample data, find the degrees of freedom, expected frequencies, test statistic, and the P-value associated with the test statistic. The approach described in this section is illustrated in the sample problem at the end of this lesson.

DF = (r - 1) * (c - 1)

E_r,c = (n_r * n_c) / n

χ² = Σ [ (O_r,c - E_r,c)² / E_r,c ]

  • P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to assess the probability associated with the test statistic. Use the degrees of freedom computed above.
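The steps above can be sketched in a few lines of standard-library Python. This is an illustrative helper, not part of the lesson; the function name and the small 2 x 2 table are ours.

```python
# Minimal sketch of the chi-square test-of-independence computations,
# using only the standard library. The helper name and the small
# example table are illustrative, not part of the lesson.

def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E = (n_r * n_c) / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# A small hypothetical 2 x 2 table of observed counts:
stat, df = chi_square_independence([[10, 20], [30, 40]])
print(round(stat, 2), df)  # 0.79 1
```

The same function reproduces the sample problem at the end of this lesson when given the 2 x 3 voting-preference table.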

Interpret Results

If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

Test Your Understanding

A public opinion poll surveyed a simple random sample of 1000 voters. Respondents were classified by gender (male or female) and by voting preference (Republican, Democrat, or Independent). Results are shown in the contingency table below.

Is there a gender gap? Do the men's voting preferences differ significantly from the women's preferences? Use a 0.05 level of significance.

The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

H0: Gender and voting preferences are independent.

Ha: Gender and voting preferences are not independent.

  • Formulate an analysis plan. For this analysis, the significance level is 0.05. Using sample data, we will conduct a chi-square test for independence.

DF = (r - 1) * (c - 1) = (2 - 1) * (3 - 1) = 2

E_r,c = (n_r * n_c) / n
E_1,1 = (400 * 450) / 1000 = 180
E_1,2 = (400 * 450) / 1000 = 180
E_1,3 = (400 * 100) / 1000 = 40
E_2,1 = (600 * 450) / 1000 = 270
E_2,2 = (600 * 450) / 1000 = 270
E_2,3 = (600 * 100) / 1000 = 60

χ² = Σ [ (O_r,c - E_r,c)² / E_r,c ]
χ² = (200 - 180)²/180 + (150 - 180)²/180 + (50 - 40)²/40
    + (250 - 270)²/270 + (300 - 270)²/270 + (50 - 60)²/60
χ² = 400/180 + 900/180 + 100/40 + 400/270 + 900/270 + 100/60
χ² = 2.22 + 5.00 + 2.50 + 1.48 + 3.33 + 1.67 = 16.2

where DF is the degrees of freedom, r is the number of levels of gender, c is the number of levels of voting preference, n_r is the number of observations from level r of gender, n_c is the number of observations from level c of voting preference, n is the number of observations in the sample, E_r,c is the expected frequency count when gender is level r and voting preference is level c, and O_r,c is the observed frequency count when gender is level r and voting preference is level c.

The P-value is the probability that a chi-square statistic having 2 degrees of freedom is more extreme than 16.2. We use the Chi-Square Distribution Calculator to find P(χ² > 16.2) = 0.0003.
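The worked numbers above can be checked directly in Python. The script below uses only the standard library; for 2 degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no table or calculator is needed.

```python
import math

# Re-checking the voting-preference example by hand (standard library only).
observed = [200, 150, 50, 250, 300, 50]
expected = [180, 180, 40, 270, 270, 60]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 2 the chi-square survival function reduces to exp(-x/2),
# so the P-value can be computed without tables.
p_value = math.exp(-chi2 / 2)

print(round(chi2, 1))     # 16.2
print(round(p_value, 4))  # 0.0003
```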

  • Interpret results. Since the P-value (0.0003) is less than the significance level (0.05), we reject the null hypothesis. Thus, we conclude that there is a relationship between gender and voting preference.

Note: If you use this approach on an exam, you may also want to mention why this approach is appropriate. Specifically, the approach is appropriate because the sampling method was simple random sampling, the variables under study were categorical, and the expected frequency count was at least 5 in each cell of the contingency table.

Hypothesis Testing - Chi Squared Test

Lisa Sullivan, PhD

Professor of Biostatistics

Boston University School of Public Health


Introduction

This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific tests considered here are called chi-square tests and are appropriate when the outcome is discrete (dichotomous, ordinal or categorical). For example, in some clinical trials the outcome is a classification such as hypertensive, pre-hypertensive or normotensive. We could use the same classification in an observational study such as the Framingham Heart Study to compare men and women in terms of their blood pressure status - again using the classification of hypertensive, pre-hypertensive or normotensive status.  

The technique to analyze a discrete outcome uses what is called a chi-square test. Specifically, the test statistic follows a chi-square probability distribution. We will consider chi-square tests here with one, two and more than two independent comparison groups.

Learning Objectives

After completing this module, the student will be able to:

  • Perform chi-square tests by hand
  • Appropriately interpret results of chi-square tests
  • Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples

Tests with One Sample, Discrete Outcome

Here we consider hypothesis testing with a discrete outcome variable in a single population. Discrete variables are variables that take on more than two distinct responses or categories and the responses can be ordered or unordered (i.e., the outcome can be ordinal or categorical). The procedure we describe here can be used for dichotomous (exactly 2 response options), ordinal or categorical discrete outcomes and the objective is to compare the distribution of responses, or the proportions of participants in each response category, to a known distribution. The known distribution is derived from another study or report and it is again important in setting up the hypotheses that the comparator distribution specified in the null hypothesis is a fair comparison. The comparator is sometimes called an external or a historical control.   

In one sample tests for a discrete outcome, we set up our hypotheses against an appropriate comparator. We select a sample and compute descriptive statistics on the sample data. Specifically, we compute the sample size (n) and the proportions of participants in each response category.

Test Statistic for Testing H0: p_1 = p_10, p_2 = p_20, ..., p_k = p_k0

We find the critical value in a table of probabilities for the chi-square distribution with degrees of freedom (df) = k-1. In the test statistic, O = observed frequency and E = expected frequency in each of the response categories. The observed frequencies are those observed in the sample and the expected frequencies are computed as described below. χ² (chi-square) is another probability distribution and ranges from 0 to ∞. The test statistic formula above is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories.

When we conduct a χ² test, we compare the observed frequencies in each response category to the frequencies we would expect if the null hypothesis were true. These expected frequencies are determined by allocating the sample to the response categories according to the distribution specified in H0. This is done by multiplying the observed sample size (n) by the proportions specified in the null hypothesis (p_10, p_20, ..., p_k0). To ensure that the sample size is appropriate for the use of the test statistic above, we need to check that min(n p_10, n p_20, ..., n p_k0) > 5.
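The expected-frequency calculation and the large-sample check can be sketched as follows. The sample size, null proportions, and observed counts below are hypothetical, chosen only to illustrate the mechanics:

```python
# Sketch of the expected-frequency step in a goodness-of-fit test.
# The sample size, null proportions, and observed counts are hypothetical.
n = 100
p_null = [0.5, 0.3, 0.2]
observed = [55, 25, 20]

expected = [n * p for p in p_null]  # ~ [50, 30, 20]
assert min(expected) > 5            # large-sample condition

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # 1.33
```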

The test of hypothesis with a discrete outcome measured in a single sample, where the goal is to assess whether the distribution of responses follows a known distribution, is called the χ 2 goodness-of-fit test. As the name indicates, the idea is to assess whether the pattern or distribution of responses in the sample "fits" a specified population (external or historical) distribution. In the next example we illustrate the test. As we work through the example, we provide additional details related to the use of this new test statistic.  

A University conducted a survey of its recent graduates to collect demographic and health information for future planning purposes as well as to assess students' satisfaction with their undergraduate experiences. The survey revealed that a substantial proportion of students were not engaging in regular exercise, many felt their nutrition was poor and a substantial number were smoking. In response to a question on regular exercise, 60% of all graduates reported getting no regular exercise, 25% reported exercising sporadically and 15% reported exercising regularly as undergraduates. The next year the University launched a health promotion campaign on campus in an attempt to increase health behaviors among undergraduates. The program included modules on exercise, nutrition and smoking cessation. To evaluate the impact of the program, the University again surveyed graduates and asked the same questions. The survey was completed by 470 graduates and the following data were collected on the exercise question:

Based on the data, is there evidence of a shift in the distribution of responses to the exercise question following the implementation of the health promotion campaign on campus? Run the test at a 5% level of significance.

In this example, we have one sample and a discrete (ordinal) outcome variable (with three response options). We specifically want to compare the distribution of responses in the sample to the distribution reported the previous year (i.e., 60%, 25%, 15% reporting no, sporadic and regular exercise, respectively). We now run the test using the five-step approach.  

  • Step 1. Set up hypotheses and determine level of significance.

The null hypothesis again represents the "no change" or "no difference" situation. If the health promotion campaign has no impact then we expect the distribution of responses to the exercise question to be the same as that measured prior to the implementation of the program.

H0: p_1 = 0.60, p_2 = 0.25, p_3 = 0.15, or equivalently H0: Distribution of responses is 0.60, 0.25, 0.15

H1: H0 is false.    α = 0.05

Notice that the research hypothesis is written in words rather than in symbols. The research hypothesis as stated captures any difference in the distribution of responses from that specified in the null hypothesis. We do not specify a specific alternative distribution; instead, we are testing whether the sample data "fit" the distribution in H0 or not. With the χ² goodness-of-fit test there is no upper- or lower-tailed version of the test.

  • Step 2. Select the appropriate test statistic.  

The test statistic is:

We must first assess whether the sample size is adequate. Specifically, we need to check min(n p_10, n p_20, ..., n p_k0) > 5. The sample size here is n=470 and the proportions specified in the null hypothesis are 0.60, 0.25 and 0.15. Thus, min(470(0.60), 470(0.25), 470(0.15)) = min(282, 117.5, 70.5) = 70.5. The sample size is more than adequate so the formula can be used.

  • Step 3. Set up decision rule.  

The decision rule for the χ² test depends on the level of significance and the degrees of freedom, defined as degrees of freedom (df) = k-1 (where k is the number of response categories). If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ² statistic will be close to zero. If the null hypothesis is false, then the χ² statistic will be large. Critical values can be found in a table of probabilities for the χ² distribution. Here we have df = k-1 = 3-1 = 2 and a 5% level of significance. The appropriate critical value is 5.99, and the decision rule is as follows: Reject H0 if χ² > 5.99.
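The tabled critical value can be verified without a table in this case: for df = 2 the chi-square survival function is exactly exp(-x/2), so the critical value at level α is simply -2 ln(α).

```python
import math

# For df = 2 the chi-square survival function is exp(-x/2), so the
# critical value at significance level alpha can be recovered directly.
alpha = 0.05
critical = -2 * math.log(alpha)
print(round(critical, 2))  # 5.99
```

This closed form holds only for 2 degrees of freedom; for other df, tables or software are used.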

  • Step 4. Compute the test statistic.  

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) and the expected frequencies into the formula for the test statistic identified in Step 2. The computations can be organized as follows.

Notice that the expected frequencies are taken to one decimal place and that the sum of the observed frequencies is equal to the sum of the expected frequencies. The test statistic is computed as follows:

  • Step 5. Conclusion.  

We reject H0 because 8.46 > 5.99. We have statistically significant evidence at α = 0.05 to show that H0 is false, or that the distribution of responses is not 0.60, 0.25, 0.15. The p-value is p ≈ 0.015.
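The computation can be replicated in a few lines. The observed counts come from the percentages quoted in the discussion below (255 reporting no regular exercise, 90 regular); the sporadic count is inferred as 470 - 255 - 90 = 125.

```python
import math

# Re-computing the exercise example. Observed counts are taken from the
# percentages quoted in the discussion (255 no regular, 90 regular);
# the sporadic count is inferred as 470 - 255 - 90 = 125.
n = 470
observed = [255, 125, 90]
expected = [n * p for p in (0.60, 0.25, 0.15)]  # [282.0, 117.5, 70.5]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.exp(-chi2 / 2)  # exact survival function for df = 2

print(round(chi2, 2))     # 8.46
print(round(p_value, 3))  # 0.015
```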

In the χ 2 goodness-of-fit test, we conclude that either the distribution specified in H 0 is false (when we reject H 0 ) or that we do not have sufficient evidence to show that the distribution specified in H 0 is false (when we fail to reject H 0 ). Here, we reject H 0 and concluded that the distribution of responses to the exercise question following the implementation of the health promotion campaign was not the same as the distribution prior. The test itself does not provide details of how the distribution has shifted. A comparison of the observed and expected frequencies will provide some insight into the shift (when the null hypothesis is rejected). Does it appear that the health promotion campaign was effective?  

Consider the following: 

If the null hypothesis were true (i.e., no change from the prior year) we would have expected more students to fall in the "No Regular Exercise" category and fewer in the "Regular Exercise" category. In the sample, 255/470 = 54% reported no regular exercise and 90/470 = 19% reported regular exercise. Thus, there is a shift toward more regular exercise following the implementation of the health promotion campaign. There is evidence of a statistical difference, but is this a meaningful difference? Is there room for improvement?

The National Center for Health Statistics (NCHS) provided data on the distribution of weight (in categories) among Americans in 2002. The distribution was based on specific values of body mass index (BMI) computed as weight in kilograms over height in meters squared. Underweight was defined as BMI< 18.5, Normal weight as BMI between 18.5 and 24.9, overweight as BMI between 25 and 29.9 and obese as BMI of 30 or greater. Americans in 2002 were distributed as follows: 2% Underweight, 39% Normal Weight, 36% Overweight, and 23% Obese. Suppose we want to assess whether the distribution of BMI is different in the Framingham Offspring sample. Using data from the n=3,326 participants who attended the seventh examination of the Offspring in the Framingham Heart Study we created the BMI categories as defined and observed the following:

  • Step 1.  Set up hypotheses and determine level of significance.

H0: p_1 = 0.02, p_2 = 0.39, p_3 = 0.36, p_4 = 0.23, or equivalently

H0: Distribution of responses is 0.02, 0.39, 0.36, 0.23

H1: H0 is false.    α = 0.05

The formula for the test statistic is:

We must assess whether the sample size is adequate. Specifically, we need to check min(n p_10, n p_20, ..., n p_k0) > 5. The sample size here is n=3,326 and the proportions specified in the null hypothesis are 0.02, 0.39, 0.36 and 0.23. Thus, min(3326(0.02), 3326(0.39), 3326(0.36), 3326(0.23)) = min(66.5, 1297.1, 1197.4, 765.0) = 66.5. The sample size is more than adequate, so the formula can be used.

Here we have df = k-1 = 4-1 = 3 and a 5% level of significance. The appropriate critical value is 7.81 and the decision rule is as follows: Reject H0 if χ² > 7.81.

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) into the formula for the test statistic identified in Step 2. We organize the computations in the following table.

The test statistic is computed as follows:

We reject H 0 because 233.53 > 7.81. We have statistically significant evidence at α=0.05 to show that H 0 is false or that the distribution of BMI in Framingham is different from the national data reported in 2002, p < 0.005.  

Again, the χ² goodness-of-fit test allows us to assess whether the distribution of responses "fits" a specified distribution. Here we show that the distribution of BMI in the Framingham Offspring Study is different from the national distribution. To understand the nature of the difference we can compare observed and expected frequencies or observed and expected proportions (or percentages). The frequencies are large because of the large sample size, but the observed percentages of patients in the Framingham sample are as follows: 0.6% underweight, 28% normal weight, 41% overweight and 30% obese. In the Framingham Offspring sample there are higher percentages of overweight and obese persons (41% and 30% in Framingham as compared to 36% and 23% in the national data), and lower proportions of underweight and normal weight persons (0.6% and 28% in Framingham as compared to 2% and 39% in the national data). Are these meaningful differences?

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable in a single population. We presented a test using a test statistic Z to test whether an observed (sample) proportion differed significantly from a historical or external comparator. The chi-square goodness-of-fit test can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square goodness-of-fit test.

The NCHS report indicated that in 2002, 75% of children aged 2 to 17 saw a dentist in the past year. An investigator wants to assess whether use of dental services is similar in children living in the city of Boston. A sample of 125 children aged 2 to 17 living in Boston are surveyed and 64 reported seeing a dentist over the past 12 months. Is there a significant difference in use of dental services between children living in Boston and the national data?

We presented the following approach to the test using a Z statistic. 

  • Step 1. Set up hypotheses and determine level of significance

H0: p = 0.75

H1: p ≠ 0.75    α = 0.05

We must first check that the sample size is adequate. Specifically, we need to check min(np_0, n(1-p_0)) = min(125(0.75), 125(1-0.75)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the following formula can be used.

This is a two-tailed test, using a Z statistic and a 5% level of significance. Reject H 0 if Z < -1.960 or if Z > 1.960.

We now substitute the sample data into the formula for the test statistic identified in Step 2. The sample proportion is:


We reject H0 because -6.15 < -1.960. We have statistically significant evidence at α = 0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001).

We now conduct the same test using the chi-square goodness-of-fit test. First, we summarize our sample data as follows:

H0: p_1 = 0.75, p_2 = 0.25, or equivalently H0: Distribution of responses is 0.75, 0.25

We must assess whether the sample size is adequate. Specifically, we need to check min(n p_10, n p_20, ..., n p_k0) > 5. The sample size here is n=125 and the proportions specified in the null hypothesis are 0.75 and 0.25. Thus, min(125(0.75), 125(0.25)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the formula can be used.

Here we have df = k-1 = 2-1 = 1 and a 5% level of significance. The appropriate critical value is 3.84, and the decision rule is as follows: Reject H0 if χ² > 3.84. (Note that 1.96² = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

(Note that (-6.15)² = 37.8, where -6.15 was the value of the Z statistic in the test for proportions shown above.)

We reject H0 because 37.8 > 3.84. We have statistically significant evidence at α = 0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001). This is the same conclusion we reached when we conducted the test using the Z test above. With a dichotomous outcome, Z² = χ²! In statistics, there are often several approaches that can be used to test hypotheses.
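The equivalence can be demonstrated numerically with the dental-services data, using only the standard library:

```python
import math

# Demonstrating Z^2 = chi-square with the dental-services data:
# n = 125 children surveyed, 64 saw a dentist, null proportion p0 = 0.75.
n, successes, p0 = 125, 64, 0.75

# Z test for a single proportion
p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Chi-square goodness-of-fit with two categories
observed = [successes, n - successes]  # [64, 61]
expected = [n * p0, n * (1 - p0)]      # [93.75, 31.25]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(z, 2))                # -6.15
print(round(chi2, 1))             # 37.8
print(abs(z ** 2 - chi2) < 1e-9)  # True: the two statistics agree exactly
```

The agreement is not approximate: algebraically, the two-category chi-square statistic simplifies to n(p̂ - p0)²/(p0(1 - p0)), which is exactly Z².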

Tests for Two or More Independent Samples, Discrete Outcome

Here we extend that application of the chi-square test to the case with two or more independent comparison groups. Specifically, the outcome of interest is discrete with two or more responses and the responses can be ordered or unordered (i.e., the outcome can be dichotomous, ordinal or categorical). We now consider the situation where there are two or more independent comparison groups and the goal of the analysis is to compare the distribution of responses to the discrete outcome variable among several independent comparison groups.  

The test is called the χ 2 test of independence and the null hypothesis is that there is no difference in the distribution of responses to the outcome across comparison groups. This is often stated as follows: The outcome variable and the grouping variable (e.g., the comparison treatments or comparison groups) are independent (hence the name of the test). Independence here implies homogeneity in the distribution of the outcome among comparison groups.    

The null hypothesis in the χ 2 test of independence is often stated in words as: H 0 : The distribution of the outcome is independent of the groups. The alternative or research hypothesis is that there is a difference in the distribution of responses to the outcome variable among the comparison groups (i.e., that the distribution of responses "depends" on the group). In order to test the hypothesis, we measure the discrete outcome variable in each participant in each comparison group. The data of interest are the observed frequencies (or number of participants in each response category in each group). The formula for the test statistic for the χ 2 test of independence is given below.

Test Statistic for Testing H0: Distribution of outcome is independent of groups

and we find the critical value in a table of probabilities for the chi-square distribution with df=(r-1)*(c-1).

Here O = observed frequency, E=expected frequency in each of the response categories in each group, r = the number of rows in the two-way table and c = the number of columns in the two-way table.   r and c correspond to the number of comparison groups and the number of response options in the outcome (see below for more details). The observed frequencies are the sample data and the expected frequencies are computed as described below. The test statistic is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories in each group.  

The data for the χ 2 test of independence are organized in a two-way table. The outcome and grouping variable are shown in the rows and columns of the table. The sample table below illustrates the data layout. The table entries (blank below) are the numbers of participants in each group responding to each response category of the outcome variable.

Table - Possible outcomes are listed in the columns; the groups being compared are listed in the rows.

In the table above, the grouping variable is shown in the rows of the table; r denotes the number of independent groups. The outcome variable is shown in the columns of the table; c denotes the number of response options in the outcome variable. Each combination of a row (group) and column (response) is called a cell of the table. The table has r*c cells and is sometimes called an r x c ("r by c") table. For example, if there are 4 groups and 5 categories in the outcome variable, the data are organized in a 4 X 5 table. The row and column totals are shown along the right-hand margin and the bottom of the table, respectively. The total sample size, N, can be computed by summing the row totals or the column totals. Similar to ANOVA, N does not refer to a population size here but rather to the total sample size in the analysis. The sample data can be organized into a table like the above. The numbers of participants within each group who select each response option are shown in the cells of the table and these are the observed frequencies used in the test statistic.

The test statistic for the χ 2 test of independence involves comparing observed (sample data) and expected frequencies in each cell of the table. The expected frequencies are computed assuming that the null hypothesis is true. The null hypothesis states that the two variables (the grouping variable and the outcome) are independent. The definition of independence is as follows:

 Two events, A and B, are independent if P(A|B) = P(A), or equivalently, if P(A and B) = P(A) P(B).

The second statement indicates that if two events, A and B, are independent then the probability of their intersection can be computed by multiplying the probability of each individual event. To conduct the χ 2 test of independence, we need to compute expected frequencies in each cell of the table. Expected frequencies are computed by assuming that the grouping variable and outcome are independent (i.e., under the null hypothesis). Thus, if the null hypothesis is true, using the definition of independence:

P(Group 1 and Response Option 1) = P(Group 1) P(Response Option 1).

 The above states that the probability that an individual is in Group 1 and their outcome is Response Option 1 is computed by multiplying the probability that person is in Group 1 by the probability that a person is in Response Option 1. To conduct the χ 2 test of independence, we need expected frequencies and not expected probabilities . To convert the above probability to a frequency, we multiply by N. Consider the following small example.

The data shown above are measured in a sample of size N=150. The frequencies in the cells of the table are the observed frequencies. If Group and Response are independent, then we can compute the probability that a person in the sample is in Group 1 and Response category 1 using:

P(Group 1 and Response 1) = P(Group 1) P(Response 1),

P(Group 1 and Response 1) = (25/150) (62/150) = 0.069.

Thus if Group and Response are independent we would expect 6.9% of the sample to be in the top left cell of the table (Group 1 and Response 1). The expected frequency is 150(0.069) = 10.4.   We could do the same for Group 2 and Response 1:

P(Group 2 and Response 1) = P(Group 2) P(Response 1),

P(Group 2 and Response 1) = (50/150) (62/150) = 0.138.

The expected frequency in Group 2 and Response 1 is 150(0.138) = 20.7.

Thus, the formula for determining the expected cell frequencies in the χ 2 test of independence is as follows:

Expected Cell Frequency = (Row Total * Column Total)/N.

The above computes the expected frequency in one step rather than computing the expected probability first and then converting to a frequency.  
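Applying the one-step formula to the small N = 150 example gives the same answers; note that computing in one step yields 10.33 for the first cell, slightly different from the 10.4 above, which came from rounding the probability to 0.069 first.

```python
# One-step expected cell frequencies for the small N = 150 example.
# Row total for Group 1 is 25, for Group 2 is 50; the column total
# for Response 1 is 62 (all taken from the table described above).
N = 150

e_group1_resp1 = 25 * 62 / N  # 10.33 (the 10.4 above used the rounded 0.069)
e_group2_resp1 = 50 * 62 / N  # 20.67

print(round(e_group1_resp1, 1))  # 10.3
print(round(e_group2_resp1, 1))  # 20.7
```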

In a prior example we evaluated data from a survey of university graduates which assessed, among other things, how frequently they exercised. The survey was completed by 470 graduates. In the prior example we used the χ 2 goodness-of-fit test to assess whether there was a shift in the distribution of responses to the exercise question following the implementation of a health promotion campaign on campus. We specifically considered one sample (all students) and compared the observed distribution to the distribution of responses the prior year (a historical control). Suppose we now wish to assess whether there is a relationship between exercise on campus and students' living arrangements. As part of the same survey, graduates were asked where they lived their senior year. The response options were dormitory, on-campus apartment, off-campus apartment, and at home (i.e., commuted to and from the university). The data are shown below.

Based on the data, is there a relationship between exercise and students' living arrangement? Do you think where a person lives affects their exercise status? Here we have four independent comparison groups (living arrangements) and a discrete (ordinal) outcome variable with three response options. We specifically want to test whether living arrangement and exercise are independent. We will run the test using the five-step approach.

H0: Living arrangement and exercise are independent

H1: H0 is false.    α = 0.05

The null and research hypotheses are written in words rather than in symbols. The research hypothesis is that the grouping variable (living arrangement) and the outcome variable (exercise) are dependent or related.   

  • Step 2.  Select the appropriate test statistic.  

The condition for appropriate use of the above test statistic is that each expected frequency is at least 5. In Step 4 we will compute the expected frequencies and we will ensure that the condition is met.

The decision rule depends on the level of significance and the degrees of freedom, defined as df = (r-1)(c-1), where r and c are the numbers of rows and columns in the two-way data table. The row variable is the living arrangement and there are 4 arrangements considered, thus r=4. The column variable is exercise and 3 responses are considered, thus c=3. For this test, df = (4-1)(3-1) = 3(2) = 6. Again, with χ² tests there are no upper-, lower- or two-tailed versions of the test. If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ² statistic will be close to zero. If the null hypothesis is false, then the χ² statistic will be large. The rejection region for the χ² test of independence is always in the upper (right-hand) tail of the distribution. For df=6 and a 5% level of significance, the appropriate critical value is 12.59 and the decision rule is as follows: Reject H0 if χ² > 12.59.
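Both the degrees of freedom and the tabled critical value 12.59 can be checked in standard-library Python. For even df = 2k the chi-square survival function has the closed form exp(-x/2) · Σ_{j<k} (x/2)^j / j!, which we invert here by bisection; this is a verification sketch, not how one would normally obtain critical values.

```python
import math

# Degrees of freedom for the 4 x 3 table, and a check of the tabled
# critical value 12.59. For even df = 2k the chi-square survival function
# is exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
r, c = 4, 3
df = (r - 1) * (c - 1)

def chi2_sf_even(x, dof):
    """Chi-square survival function P(X > x), valid for even dof only."""
    k = dof // 2
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))

# Invert the (decreasing) survival function by bisection to find the
# critical value where P(X > x) = 0.05.
lo, hi = 0.0, 100.0
for _ in range(100):
    mid = (lo + hi) / 2
    if chi2_sf_even(mid, df) > 0.05:
        lo = mid
    else:
        hi = mid

print(df)            # 6
print(round(lo, 2))  # 12.59
```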

We now compute the expected frequencies using the formula,

Expected Frequency = (Row Total * Column Total)/N.

The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency.   The expected frequencies are shown in parentheses.

Notice that the expected frequencies are taken to one decimal place and that the sums of the observed frequencies are equal to the sums of the expected frequencies in each row and column of the table.  

Recall in Step 2 a condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 9.6) and therefore it is appropriate to use the test statistic.

We reject H0 because 60.5 > 12.59. We have statistically significant evidence at α = 0.05 to show that H0 is false, or that living arrangement and exercise are not independent (i.e., they are dependent or related), p < 0.005.

Again, the χ 2 test of independence is used to test whether the distribution of the outcome variable is similar across the comparison groups. Here we rejected H 0 and concluded that the distribution of exercise is not independent of living arrangement, or that there is a relationship between living arrangement and exercise. The test provides an overall assessment of statistical significance. When the null hypothesis is rejected, it is important to review the sample data to understand the nature of the relationship. Consider again the sample data. 

Because there are different numbers of students in each living situation, comparing exercise patterns on the basis of the frequencies alone is difficult. The following table displays the percentages of students in each exercise category by living arrangement. The percentages sum to 100% in each row of the table. For comparison purposes, percentages are also shown for the total sample along the bottom row of the table.

From the above, it is clear that higher percentages of students living in dormitories and in on-campus apartments reported regular exercise (31% and 23%) as compared to students living in off-campus apartments and at home (10% each).  

Test Yourself

 Pancreaticoduodenectomy (PD) is a procedure that is associated with considerable morbidity. A study was recently conducted on 553 patients who had a successful PD between January 2000 and December 2010 to determine whether their Surgical Apgar Score (SAS) is related to 30-day perioperative morbidity and mortality. The table below gives the number of patients experiencing no, minor, or major morbidity by SAS category.  

Question: What would be an appropriate statistical test to examine whether there is an association between Surgical Apgar Score and patient outcome? Using 14.3 as the value of the test statistic for these data, carry out the appropriate test at a 5% level of significance. Show all parts of your test.

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable and two independent comparison groups. We presented a test using a test statistic Z to test for equality of independent proportions. The chi-square test of independence can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square test of independence.

A randomized trial is designed to evaluate the effectiveness of a newly developed pain reliever designed to reduce pain in patients following joint replacement surgery. The trial compares the new pain reliever to the pain reliever currently in use (called the standard of care). A total of 100 patients undergoing joint replacement surgery agreed to participate in the trial. Patients were randomly assigned to receive either the new pain reliever or the standard pain reliever following surgery and were blind to the treatment assignment. Before receiving the assigned treatment, patients were asked to rate their pain on a scale of 0-10 with higher scores indicative of more pain. Each patient was then given the assigned treatment and after 30 minutes was again asked to rate their pain on the same scale. The primary outcome was a reduction in pain of 3 or more scale points (defined by clinicians as a clinically meaningful reduction). The following data were observed in the trial.

We tested whether there was a significant difference in the proportions of patients reporting a meaningful reduction (i.e., a reduction of 3 or more scale points) using a Z statistic, as follows. 

H0: p1 = p2

H1: p1 ≠ p2                             α = 0.05

Here the new or experimental pain reliever is group 1 and the standard pain reliever is group 2.

We must first check that the sample size is adequate. Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group or that:

In this example, we have

Therefore, the sample size is adequate, so the following test statistic can be used:

Z = (p̂1 - p̂2) / sqrt( p̂(1 - p̂)(1/n1 + 1/n2) ), where p̂ is the overall (pooled) proportion of successes.

Reject H0 if Z < -1.960 or if Z > 1.960.

We now substitute the sample data into the formula for the test statistic identified in Step 2. We first compute the overall proportion of successes:

We now substitute to compute the test statistic.

  • Step 5. Conclusion. We reject H0 because Z = 2.53 > 1.960. We have statistically significant evidence at α = 0.05 to show that there is a difference in the proportions of patients reporting a meaningful reduction in pain.

We now conduct the same test using the chi-square test of independence.  

H0: Treatment and outcome (meaningful reduction in pain) are independent

H1: H0 is false.         α = 0.05

The formula for the test statistic is:  

For this test, df = (2 - 1)(2 - 1) = 1. At a 5% level of significance, the appropriate critical value is 3.84 and the decision rule is as follows: Reject H0 if χ² > 3.84. (Note that 1.96² = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)
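The equivalence of the two critical values can be checked numerically. A quick sketch, assuming SciPy is available:

```python
from scipy.stats import chi2, norm

# Two-sided Z critical value at alpha = 0.05
z_crit = norm.ppf(0.975)

# Chi-square critical value with df = 1 at alpha = 0.05
chi2_crit = chi2.ppf(0.95, df=1)

# A chi-square variable with 1 df is the square of a standard normal,
# so the two critical values satisfy chi2_crit = z_crit**2
print(round(z_crit, 2), round(chi2_crit, 2))  # 1.96 3.84
assert abs(z_crit**2 - chi2_crit) < 1e-8
```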

We now compute the expected frequencies using:

The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency. The expected frequencies are shown in parentheses.

A condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 22.0) and therefore it is appropriate to use the test statistic.

(Note that (2.53)² = 6.4, where 2.53 was the value of the Z statistic in the test for proportions shown above.)
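The equivalence Z² = χ² can also be verified numerically. The trial's observed table is not reproduced above, so the sketch below uses assumed counts (23 of 50 and 11 of 50 patients reporting a meaningful reduction), chosen only because they reproduce the quoted Z = 2.53:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Assumed illustrative counts (not reproduced in the text above):
# rows = treatment (new, standard); columns = (meaningful reduction, none)
table = np.array([[23, 27],
                  [11, 39]])

# correction=False gives the uncorrected Pearson chi-square, which equals Z^2
chi2_stat, p_value, df, expected = chi2_contingency(table, correction=False)

# Two-proportion Z test with the pooled proportion
n1 = n2 = 50
p1_hat, p2_hat = 23 / n1, 11 / n2
p_pooled = (23 + 11) / (n1 + n2)
z = (p1_hat - p2_hat) / np.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

print(round(z, 2), round(chi2_stat, 1))  # 2.53 6.4
assert abs(chi2_stat - z**2) < 1e-9
```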

Chi-Squared Tests in R

The video below by Mike Marin demonstrates how to perform chi-squared tests in the R programming language.

Answer to Problem on Pancreaticoduodenectomy and Surgical Apgar Scores

We have 3 independent comparison groups (Surgical Apgar Score) and a categorical outcome variable (morbidity/mortality). We can run a Chi-Squared test of independence.

H0: Apgar scores and patient outcome are independent of one another.

HA: Apgar scores and patient outcome are not independent.

Chi-squared = 14.3, with df = (3 - 1)(3 - 1) = 4.

Since 14.3 is greater than the critical value of 9.49, we reject H0.

There is an association between Apgar scores and patient outcome. The lowest Apgar score group (0 to 4) experienced the highest percentage of major morbidity or mortality (16 out of 57=28%) compared to the other Apgar score groups.
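The critical-value comparison above can be checked with a short SciPy sketch. The degrees of freedom are an assumption inferred from the quoted critical value of 9.49, which corresponds to df = 4 (three SAS groups by three outcome categories):

```python
from scipy.stats import chi2

chi2_stat = 14.3          # test statistic reported above
df = (3 - 1) * (3 - 1)    # assumed: 3 SAS groups by 3 outcome categories

critical = chi2.ppf(0.95, df)    # critical value at the 5% level
p_value = chi2.sf(chi2_stat, df)

print("critical value:", round(critical, 2))  # 9.49
print("reject H0:", chi2_stat > critical)     # True
```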

How to use the Chi-Square Test for two categorical variables?

Published on May 17, 2022 by Sourabh Mehta, Analytics India Magazine

The Chi-Square test of independence examines whether two nominal (categorical) variables have a significant connection by comparing the frequency of categories for one nominal variable across the categories of the second nominal variable. There are two main uses of the chi-square test: to check the independence of variables and to check the goodness of fit. In this article, we will discuss when and where the chi-square test can be used.

A brief about chi-square

Let’s start by talking about the chi-square test.

The objective is to determine whether the association between two qualitative variables is statistically significant. 

The hypotheses for this statistical analysis are formulated as follows:

  • Null Hypothesis (H0): There is no substantial relationship between the two variables (in case of independence test), or there is no difference in variable distribution (in case of goodness of fit). 
  • Alternative Hypothesis (H1): There is a substantial relationship between variables (in case of independence test) or a significant variation in variable distribution (in case of goodness of fit).

Under the null hypothesis, the expected value for each cell in the table must be specified. The expected values describe what each cell count would be if the two variables were not associated; calculating them requires only the sample size, the row totals, and the column totals.

The chi-square statistic then compares the observed and expected values. This test statistic is used to see whether the discrepancy between observed and expected values is statistically significant.


Uses of Chi-square test

A chi-square test is used to examine whether observed findings are consistent with expected outcomes and to rule out the possibility that the observations are due to chance. The chi-square test is applicable when the data come from a random sample and the variable in question is categorical. These sorts of data are frequently gathered through surveys or questionnaires, so chi-square analysis is often the most effective way to assess them.

There are two main kinds of chi-square tests: the test of independence and the goodness-of-fit test.

Independence

When two categorical variables may be interdependent, a chi-square test for independence can be used to assess the association between them.

Assume there are two variables, gender and degree course, and we need to check whether choice of course depends on gender. Using the chi-square formula on the observed and expected values, we compare the frequency with which male and female students choose each of the available courses.

If there is no relationship between gender and course, implying that they are independent of one another, then the frequencies at which the two genders choose each offered course should be approximately proportional: the gender ratio within any selected degree should be approximately equal to the gender ratio in the sample.

A chi-square test for independence might indicate how probable it is that any observed divergence between the actual frequencies in the data and these theoretical predictions can be explained by random chance.
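A sketch of such a test with SciPy, using hypothetical gender-by-course counts (any real analysis would substitute its own table):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = gender (male, female),
# columns = degree course (arts, science, commerce)
table = np.array([[30, 45, 25],
                  [35, 40, 25]])

chi2_stat, p_value, df, expected = chi2_contingency(table)

print("chi2 =", round(chi2_stat, 3), "df =", df, "p =", round(p_value, 3))
if p_value < 0.05:
    print("Reject H0: course choice appears to depend on gender")
else:
    print("Fail to reject H0: no evidence of a relationship")
```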

Goodness-of-Fit

The chi-square goodness-of-fit test provides a way to test how well a sample of data matches the characteristics of the larger population that the sample is intended to represent. If the sample data do not reflect the target population, conclusions about that population cannot be drawn from the sample. This kind of chi-square test is called the goodness-of-fit test.

Assume a small library has the greatest number of visitors on Fridays and Sundays, an average number on Mondays, Tuesdays, and Saturdays, and the fewest on Wednesdays and Thursdays. Based on these assumptions, the library schedules a set number of employees each day: staff to check in members, staff to clean facilities, guards, and librarians.

However, the library is losing money, and the owner wants to determine whether these attendance assumptions and staffing levels are right. For six weeks, the owner counts the number of library visitors every day. They can then use a chi-square goodness-of-fit test to compare the library’s assumed attendance to its observed attendance. With this additional information, they can better manage the library and increase revenue.
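The library scenario can be sketched with SciPy's chisquare function; the visitor counts and assumed proportions below are hypothetical:

```python
import numpy as np
from scipy.stats import chisquare

# Hypothetical six-week visitor counts, Monday through Sunday
observed = np.array([290, 280, 260, 250, 300, 330, 340])

# Staffing model: busiest Fri/Sun, average Mon/Tue/Sat, quietest Wed/Thu,
# encoded here as assumed expected proportions (they sum to 1)
assumed_props = np.array([0.14, 0.14, 0.11, 0.11, 0.18, 0.14, 0.18])
expected = assumed_props * observed.sum()

# Goodness-of-fit: does observed attendance match the assumed pattern?
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print("chi2 =", round(stat, 2), "p =", round(p_value, 4))
if p_value < 0.05:
    print("The assumed attendance pattern does not fit the observed data")
```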

Let’s implement this test on a dataset and solve some problems using Python.

In Python, the SciPy library offers a stats module that contains all the chi-square test-related functions.

Let’s check the independence of categorical variables


To check the dependency between the categorical variables of interest, we need to create a contingency table, because the chi-square test can only be performed on tabulated counts.

Here we check the dependency between the education level of employees and their graduation degree, so the contingency table is built for those two variables.

This table is then the input to the chi-square function offered by SciPy (scipy.stats.chi2_contingency). The function calculates the test statistic value, the p-value, the degrees of freedom, and the expected values.


The critical value for the chi-square test with 10 degrees of freedom and alpha 0.05 is 18.307. Since the test statistic (18.576) exceeds the critical value, the null hypothesis is rejected. Therefore, education level and graduation degree are dependent on each other.

Let’s check the goodness-of-fit of gender and relevant experience. In this test, the chi-square fits one categorical variable to a distribution. The process is the same as above: create a contingency table and apply the formula, but this time let’s do it from scratch rather than directly applying the contingency chi-square function.

null hypothesis two categorical variables

Now calculate the observed and expected values and the degrees of freedom.

We are all set to calculate the chi-square statistic value.

The critical value for the chi-square test with 2 degrees of freedom and alpha 0.05 is 5.991. Since the test statistic (10.861) exceeds the critical value, the null hypothesis is rejected. Therefore, the distribution of relevant experience varies significantly by gender; according to this data, relevant experience depends substantially on gender.
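The from-scratch procedure described above can be sketched as follows. The counts are hypothetical (the article's actual data appears only as images), but the steps match: compute expected counts, sum (O - E)²/E, and compare against the critical value:

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical counts: rows = gender (male, female),
# columns = relevant experience (none, some, extensive)
observed = np.array([[40, 60, 30],
                     [50, 35, 15]])

# Expected counts under independence: (row total * column total) / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
n = observed.sum()
expected = row_totals * col_totals / n

# Chi-square statistic: sum of (O - E)^2 / E over all cells
stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)
critical = chi2.ppf(0.95, df)   # 5.991 for df = 2

print("chi2 =", round(stat, 3), "df =", df, "critical =", round(critical, 3))
```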

Chi-square is a test for understanding the relationship between two categorical variables: it can assess both the dependency between variables and the goodness of fit of a distribution. With this hands-on implementation, we have seen what the chi-square test is and when to use it.


The University of Texas at Austin

Statistics Online Support

Statistics Online Support provides tutorials and instruction on commonly used statistical techniques. Tutorials are included for both Excel and R.

Chi-Square Test of Independence

This test is used to determine if two categorical variables are independent or if they are in fact related to one another. If two categorical variables are independent, then the value of one variable does not change the probability distribution of the other. If two categorical variables are related, then the distribution of one depends on the level the other. This test measures the differences in the observed conditional distribution of one variable across levels of the other, and compares it to the marginal (overall) distribution of that variable.

Conditional vs. Marginal Distribution

A conditional distribution is the distribution of all levels of one variable given that the other variable equals some value. The marginal distribution is the overall distribution of one variable, ignoring the other.

For example, take the following data:

(contingency table of hair color by gender)

The marginal distribution of hair color is 43% blonde/57% brunette. The conditional distribution of hair color for women is 47% blonde/53% brunette, while the conditional distribution of hair color for men is 36% blonde/64% brunette.

The chi-square test of independence will determine whether the differences between the conditional and marginal distributions are significant, or if they are small enough to be expected simply by random chance.
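As a concrete sketch, both kinds of distribution can be computed directly from a table of counts. The counts below are hypothetical, not the data from the example above:

```python
import numpy as np

# Hypothetical counts: rows = gender (women, men),
# columns = hair color (blonde, brunette)
counts = np.array([[33, 37],
                   [11, 19]])

# Marginal distribution of hair color (ignoring gender)
marginal = counts.sum(axis=0) / counts.sum()

# Conditional distributions of hair color given gender
conditional = counts / counts.sum(axis=1, keepdims=True)

print("marginal:", marginal.round(2))
print("women:   ", conditional[0].round(2))
print("men:     ", conditional[1].round(2))
```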

Assumptions:

  • Random samples
  • Independent observations
  • The sample size is large enough such that all expected frequencies are greater than 1 and at least 80% are greater than 5.

If your data fails the sample size assumption, try collapsing categories to increase the expected frequencies. If that is not possible, an alternative test is Fisher’s Exact test .

Hypotheses:

H o : The variables are independent. H A : The variables are not independent (meaning they are related).

Relevant Equations:

Degrees of freedom : (number of rows – 1)*(number of columns – 1)

Expected counts for each cell: (row total*column total)/grand total

Test statistic: χ² = Σ (observed - expected)²/expected, summed over all cells

Example 1: Hand calculation

This video analyzes if treatment group and symptom status are independent for participants in a randomized drug trial.

Sample conclusion: After checking the assumptions of random sampling and noting that none of the expected counts for our data were less than 5, we completed a chi-square test of independence to determine if treatment group and symptom status are independent. We failed to reject the null hypothesis; the data are consistent with treatment and symptom status being independent (X²(df = 1) = 3.42, p > .05).

Example 2: Performing analysis in Excel 2016

These videos analyze if phone type and beliefs about the impact of social media are independent.

To calculate a chi-square test in Excel, you must first create a contingency table of the data. The first video below describes this process. The second video runs the chi-square test.

Dataset used in video

Creating contingency tables and pie charts: PDF corresponding to video

This video shows how to make a contingency table of phone type and beliefs about the impact of social media.

Creating contingency tables and stacked bar charts: PDF corresponding to video

This video shows how to make a contingency table and stacked bar chart of phone type and beliefs about the impact of social media.

Performing the test of independence: PDF corresponding to video

This video shows how to conduct a test of independence for phone type and beliefs about the impact of social media.

Example 3: Performing analysis in R

This dataset contains information about musicians who have performed on ACL Live, and this video analyzes if the categorical age of these artists (20s, 30s, etc.) is related to whether or not they’ve won a Grammy.

Dataset used in video R script file used in video

This dataset is about musicians who participated in the Austin City Limits music festival. This video analyzes if the categorical age of the artists (20s, 30s, etc) and whether or not they won a Grammy are independent.


Kent State University Libraries SAS Tutorials: Chi-Square Test of Independence

The Chi-Square Test of Independence determines whether there is an association between categorical variables (i.e., whether the variables are independent or related). It is a nonparametric test.

This test is also known as:

  • Chi-Square Test of Association.

This test utilizes a contingency table to analyze the data. A contingency table (also known as a cross-tabulation , crosstab , or two-way table ) is an arrangement in which data is classified according to two categorical variables. The categories for one variable appear in the rows, and the categories for the other variable appear in columns. Each variable must have two or more categories. Each cell reflects the total count of cases for a specific pair of categories.

There are several tests that go by the name "chi-square test" in addition to the Chi-Square Test of Independence. Look for context clues in the data and research question to determine which form of the chi-square test is being used.

Common Uses

The Chi-Square Test of Independence is commonly used to test the following:

  • Statistical independence or association between two categorical variables.

The Chi-Square Test of Independence can only compare categorical variables. It cannot make comparisons between continuous variables or between categorical and continuous variables. Additionally, the Chi-Square Test of Independence only assesses associations between categorical variables, and cannot provide any inferences about causation.

If your categorical variables represent "pre-test" and "post-test" observations, then the chi-square test of independence is not appropriate, because the assumption of independence of observations is violated. In this situation, McNemar's Test is appropriate.

Data Requirements

Your data must meet the following requirements:

  • Two categorical variables.
  • Two or more categories (groups) for each variable.
  • There is no relationship between the subjects in each group.
  • The categorical variables are not "paired" in any way (e.g. pre-test/post-test observations).
  • Expected frequencies for each cell are at least 1.
  • Expected frequencies should be at least 5 for the majority (80%) of the cells.
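The two expected-frequency requirements can be checked directly once expected counts are computed from the table margins (expected count = row total × column total ÷ grand total). A minimal sketch in plain Python, using the 2x2 class rank by campus-living counts from the example later on this page:

```python
# Check the expected-frequency requirements for a chi-square test of
# independence. Observed counts: rows = class rank, cols = off/on campus.
observed = [[79, 148],
            [152, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for cell (i, j) = (row i total * col j total) / grand total
expected = [[row_totals[i] * col_totals[j] / grand_total
             for j in range(len(col_totals))]
            for i in range(len(row_totals))]

cells = [e for row in expected for e in row]
all_at_least_1 = all(e >= 1 for e in cells)
share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
print(all_at_least_1, share_at_least_5)
```

For this table every expected count is well above 5, so both requirements are met.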

The null hypothesis (H0) and alternative hypothesis (H1) of the Chi-Square Test of Independence can be expressed in two different but equivalent ways:

H0: "[Variable 1] is independent of [Variable 2]"
H1: "[Variable 1] is not independent of [Variable 2]"

H0: "[Variable 1] is not associated with [Variable 2]"
H1: "[Variable 1] is associated with [Variable 2]"

Data Set-Up

Your dataset should have the following structure:

  • Each case (row) represents a subject, and each subject appears only once in the dataset; each variable is represented in a column. That is, each row represents an observation from a unique subject.
  • The dataset contains at least two nominal categorical variables (string or numeric). The categorical variables used in the test must have two or more categories; they should also not have too many categories, since tables with many sparsely populated cells tend to violate the expected-frequency requirements.

Example of a dataset structure where each row represents a case or subject. The screenshot shows a SAS VIEWTABLE window with cases 1-11 and 425-435 from the sample dataset, with columns ids, Class rank, Gender, and Athlete.

Test Statistic

The test statistic for the Chi-Square Test of Independence is denoted \(\chi^{2}\), and is computed as:

$$ \chi^{2} = \sum_{i=1}^{R}{\sum_{j=1}^{C}{\frac{(o_{ij} - e_{ij})^{2}}{e_{ij}}}} $$

where

\(o_{ij}\) is the observed cell count in the i th row and j th column of the table, and

\(e_{ij}\) is the expected cell count in the i th row and j th column of the table, computed as

$$ e_{ij} = \frac{(\textrm{row } i \textrm{ total}) \times (\textrm{col } j \textrm{ total})}{\textrm{grand total}} $$

The quantity \((o_{ij} - e_{ij})\) is sometimes referred to as the residual of cell \((i, j)\), denoted \(r_{ij}\).

The calculated \(\chi^{2}\) value is then compared to the critical value from the \(\chi^{2}\) distribution table with degrees of freedom df = (R − 1)(C − 1) and the chosen significance level. If the calculated \(\chi^{2}\) value is greater than the critical \(\chi^{2}\) value, then we reject the null hypothesis.

Run a Chi-Square Test of Independence with PROC FREQ

The general form is
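A minimal sketch of the syntax, with `dataset`, `RowVar`, and `ColVar` as placeholder names:

```sas
PROC FREQ DATA=dataset;
    TABLES RowVar*ColVar / CHISQ;
RUN;
```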

The CHISQ option is added to the TABLES statement after the slash ( / ) character.

Many of PROC FREQ's most useful options have been covered in the tutorials on Frequency Tables and Crosstabs , but there are several additional options that can be useful when conducting a chi-square test of independence:

  • EXPECTED Adds expected cell counts to the cells of the crosstab table.
  • DEVIATION Adds deviation values (i.e., observed minus expected values) to the cells of the crosstab table.

Example: Chi-square Test for 2x2 Table

Problem statement.

Let's continue the row and column percentage example from the Crosstabs tutorial, which described the relationship between the variables RankUpperUnder (upperclassman/underclassman) and LiveOnCampus (lives on campus/lives off campus). Recall that the column percentages of the crosstab appeared to indicate that upperclassmen were less likely than underclassmen to live on campus:

  • The proportion of underclassmen who live off campus is 34.8%, or 79/227.
  • The proportion of underclassmen who live on campus is 65.2%, or 148/227.
  • The proportion of upperclassmen who live off campus is 94.4%, or 152/161.
  • The proportion of upperclassmen who live on campus is 5.6%, or 9/161.

Suppose that we want to test the association between class rank and living on campus using a Chi-Square Test of Independence (using α = 0.05).

The first table in the output is the crosstabulation. If you included the EXPECTED and DEVIATION options in your syntax, you should see the following:

Crosstab produced by PROC FREQ when the EXPECTED, DEVIATION, NOROW, NOCOL, and NOPERCENT options are used.

With the Expected Count values shown, we can confirm that all cells have an expected value greater than 5.

These numbers can be plugged into the chi-square test statistic formula:

$$ \chi^{2} = \sum_{i=1}^{R}{\sum_{j=1}^{C}{\frac{(o_{ij} - e_{ij})^{2}}{e_{ij}}}} = \frac{(-56.147)^{2}}{135.147} + \frac{(56.147)^{2}}{91.853} + \frac{(56.147)^{2}}{95.853} + \frac{(-56.147)^{2}}{65.147} = 138.926 $$

We can confirm this computation with the results in the table labeled Statistics for Table of RankUpperUnder by LiveOnCampus :

Output table generated from the CHISQ option in PROC FREQ.

The row of interest here is Chi-Square .

  • The value of the test statistic is 138.926.
  • Because the crosstabulation is a 2x2 table, the degrees of freedom (df) for the test statistic is \( df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1 \).
  • The corresponding p-value of the test statistic is so small that it is presented as p < 0.001.
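These results can be reproduced with a short computation. The sketch below (standard library only) recomputes the statistic from the observed counts; the p-value uses the closed form \(p = \operatorname{erfc}(\sqrt{\chi^{2}/2})\), which holds only for df = 1 because a chi-square variable with 1 df is a squared standard normal:

```python
import math

# Observed counts from the example crosstab:
# rows = under/upperclassmen, cols = off/on campus
observed = [[79, 148],
            [152, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells
chi2 = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (observed[i][j] - expected) ** 2 / expected

df = (2 - 1) * (2 - 1)  # (R - 1)(C - 1)

# Survival function of chi-square with df = 1
p_value = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 3), df, p_value < 0.001)
```

The statistic matches the 138.926 reported by PROC FREQ, and the p-value is far below 0.001.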

Decision and Conclusions

Since the p-value is less than our chosen significance level α = 0.05, we can reject the null hypothesis, and conclude that there is an association between class rank and whether or not students live on-campus.

Based on the results, we can state the following:

  • There was a significant association between class rank and living on campus (\(\chi^{2}\)(1) = 138.9, p < .001).

  • Last Updated: Dec 18, 2023 12:59 PM
  • URL: https://libguides.library.kent.edu/SAS



Statistics LibreTexts

11.9: Test of Homogeneity

  • Page ID 14199

Learning Objectives

  • Conduct a chi-square test of homogeneity. Interpret the conclusion in context.

We have learned the details for two chi-square tests, the goodness-of-fit test and the test of independence. Now we focus on the third and last chi-square test that we will learn, the test for homogeneity. This test determines if two or more populations (or subgroups of a population) have the same distribution of a single categorical variable.

The test of homogeneity expands the test for a difference in two population proportions, which is the two-proportion Z-test we learned in Inference for Two Proportions. We use the two-proportion Z-test when the response variable has only two outcome categories and we are comparing two populations (or two subgroups). We use the test of homogeneity if the response variable has two or more categories and we wish to compare two or more populations (or subgroups).

We can answer the following research questions with a chi-square test of homogeneity:

  • Does the use of steroids in collegiate athletics differ across the three NCAA divisions?
  • Was the distribution of political views (liberal, moderate, conservative) different for the last three presidential elections in the United States?

The null hypothesis states that the distribution of the categorical variable is the same for the populations (or subgroups). In other words, the proportion with a given response is the same in all of the populations, and this is true for all response categories. The alternative hypothesis says that the distributions differ.

Note: Homogeneous means the same in structure or composition. This test gets its name from the null hypothesis, where we claim that the distribution of the responses is the same (homogeneous) across groups.

To test our hypotheses, we select a random sample from each population and gather data on one categorical variable. As with all chi-square tests, the expected counts reflect the null hypothesis. We must determine what we expect to see in each sample if the distributions are identical. As before, the chi-square test statistic measures the amount that the observed counts in the samples deviate from the expected counts.

Steroid Use in Collegiate Sports

In 2006, the NCAA published a report called “Substance Use: NCAA Study of Substance Use of College Student-Athletes.” We use data from this report to investigate the following question: Does steroid use by student athletes differ for the three NCAA divisions?

The data comes from a random selection of teams in each NCAA division. The sampling plan was somewhat complex, but we can view the data as though it came from a random sample of athletes in each division. The surveys are anonymous to encourage truthful responses.


Step 1: State the hypotheses.

In the test of homogeneity, the null hypothesis says that the distribution of a categorical response variable is the same in each population. In this example, the categorical response variable is steroid use (yes or no). The populations are the three NCAA divisions.

  • H 0 : The proportion of athletes using steroids is the same in each of the three NCAA divisions.
  • H a : The proportion of athletes using steroids is not the same in each of the three NCAA divisions.

Note: These hypotheses imply that the proportion of athletes not using steroids is also the same in each of the three NCAA divisions, so we don’t need to state this explicitly. For example, if 2% of the athletes in each division are using steroids, then 98% are not.

Here is an alternative way we could state the hypotheses for a test of homogeneity.

  • H 0 : For each of the three NCAA divisions, the distribution of “yes” and “no” responses to the question about steroid use is the same.
  • H a : The distribution of responses is not the same.

Step 2: Collect and analyze the data.

We summarized the data from these three samples in a two-way table.

Observed data for the amount of athletes in each division (I, II, and III) who do and do not admit to steroid use

We use percentages to compare the distributions of yes and no responses in the three samples. This step is similar to our data analysis for the test of independence.

Conditional percentages for the number of athletes in each division (I,II, and III) who do and do not admit to steroid use.

We can see that Division I and Division II schools have essentially the same percentage of athletes who admit steroid use (about 1.2%). Not surprisingly, the least competitive division, Division III, has a slightly lower percentage (about 1.0%). Do these results suggest that the proportion of athletes using steroids is the same for the three divisions? Or is the difference seen in the sample of Division III schools large enough to suggest differences in the divisions? After all, the sample sizes are very large. We know that for large samples, a small difference can be statistically significant. Of course, we have to conduct the test of homogeneity to find out.

Note: We decided not to use ribbon charts for visual comparison of the three distributions because the percentage admitting steroid use is too small in each sample to be visible.

Step 3: Assess the evidence.

We need to determine the expected values and the chi-square test statistic so that we can find the P-value.

Calculating Expected Values for a Test of Homogeneity

Expected counts always describe what we expect to see in a sample if the null hypothesis is true. In this situation, we expect the percentage using steroids to be the same for each division. What percentage do we use? We find the percentage using steroids in the combined samples. This calculation is the same as we did when finding expected counts for a test of independence, though the logic of the calculation is subtly different.

Expected counts

Here are the calculations for the response “yes”:

  • Percentage using steroids in combined samples: 220/19,377 = 0.01135 = 1.135%

Expected count of steroid users for Division I is 1.135% of Division I sample:

  • 0.01135(8,543) = 96.96

Expected count of steroid users for Division II is 1.135% of Division II sample:

  • 0.01135(4,341) = 49.27

Expected count of steroid users for Division III is 1.135% of Division III sample:

  • 0.01135(6,493) = 73.70
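The arithmetic above can be sketched in a few lines. The three sample sizes and the combined total of 220 "yes" responses come from the table; small differences from the counts above are due to rounding the pooled rate to 1.135%:

```python
# Expected "yes" counts under the null hypothesis of homogeneity:
# the pooled steroid-use rate applied to each division's sample size.
yes_total = 220
sample_sizes = {"Division I": 8543, "Division II": 4341, "Division III": 6493}
grand_total = sum(sample_sizes.values())

pooled_rate = yes_total / grand_total  # about 0.01135, i.e. 1.135%

# Expected "yes" count in each division = pooled rate * division sample size
expected_yes = {div: pooled_rate * n for div, n in sample_sizes.items()}
for div, count in expected_yes.items():
    print(div, round(count, 2))
```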

Checking Conditions

The conditions for use of the chi-square distribution are the same as we learned previously:

  • A sample is randomly selected from each population.
  • All of the expected counts are 5 or greater.

Since these data meet the conditions, we can proceed with calculating the chi-square test statistic.

Calculating the Chi-Square Test Statistic

There are no changes in the way we calculate the chi-square test statistic.

We use technology to calculate the chi-square value. For this example, we show the calculation. There are six terms, one for each cell in the 3 × 2 table. (We ignore the totals, as always.)

The chi-square value is 1.57

Finding Degrees of Freedom and the P-Value

For chi-square tests based on two-way tables (both the test of independence and the test of homogeneity), the degrees of freedom are ( r − 1)( c − 1), where r is the number of rows and c is the number of columns in the two-way table (not counting row and column totals). In this case, the degrees of freedom are (3 − 1)(2 − 1) = 2.

We use the chi-square distribution with df = 2 to find the P-value. The P-value is large (0.4561), so we fail to reject the null hypothesis.

For 2 degrees of freedom the P-value is 0.4561
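This particular P-value can be checked by hand: a chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2, so its survival function is exp(−x/2). A quick sketch:

```python
import math

chi2 = 1.57  # chi-square statistic from the calculation above

# Survival function of the chi-square distribution with df = 2
p_value = math.exp(-chi2 / 2)
print(round(p_value, 4))  # 0.4561
```

This shortcut is specific to df = 2; other degrees of freedom require a chi-square table or software.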

Step 4: Conclusion.

The data does not provide strong enough evidence to conclude that steroid use differs in the three NCAA divisions (P-value = 0.4561).

First Use of Anabolic Steroids by NCAA Athletes

The NCAA survey includes this question: “When, if ever, did you start using anabolic steroids?” The response options are: have never used, before junior high, junior high, high school, freshman year of college, after freshman year of college. We focused on those who admitted use of steroids and compared the distribution of their responses for the years 1997, 2001, and 2005. (These are the years that the NCAA conducted the survey. Counts are estimates from reported percentages and sample size.) Recall that the NCAA uses random sampling in its sampling design.

Table of collected data that shows age of initial use of anabolic steroids


We now know the details for the chi-square test for homogeneity. We conclude with two activities that will give you practice recognizing when to use this test.

Gender and Politics

Consider these two situations:

  • A: Liberal, moderate, or conservative: Are there differences in political views of men and women in the United States? We survey a random sample of 100 U.S. men and 100 U.S. women.
  • B: Do you plan to vote in the next presidential election? We ask a random sample of 100 U.S. men and 100 U.S. women. We look for differences in the proportion of men and women planning to vote.


Steroid Use for Male Athletes in NCAA Sports

We plan to compare steroid use for male athletes in NCAA baseball, basketball, and football. We design two different sampling plans.

  • A: Survey distinct random samples of NCAA athletes from each sport: 500 baseball players, 400 basketball players, 900 football players.
  • B. Survey a random sample of 1,800 NCAA male athletes and categorize players by sport and admitted steroid use. Responses are anonymous.


Let’s Summarize

In “Chi-Square Tests for Two-Way Tables,” we discussed two different hypothesis tests using the chi-square test statistic:

  • Test of independence for a two-way table
  • Test of homogeneity for a two-way table

Test of Independence for a Two-Way Table

  • In the test of independence, we consider one population and two categorical variables.
  • In Probability and Probability Distribution , we learned that two events are independent if P ( A | B ) = P ( A ), but we did not pay attention to variability in the sample. With the chi-square test of independence, we have a method for deciding whether our observed P ( A | B ) is “too far” from our observed P ( A ) to infer independence in the population.
  • The null hypothesis says the two variables are independent (or not associated). The alternative hypothesis says the two variables are dependent (or associated).
  • To test our hypotheses, we select a single random sample and gather data for two different categorical variables.
  • Example: Do men and women differ in their perception of their weight? Select a random sample of adults. Ask them two questions: (1) Are you male or female? (2) Do you feel that you are overweight, underweight, or about right in weight?

Test of Homogeneity for a Two-Way Table

  • In the test of homogeneity, we consider two or more populations (or two or more subgroups of a population) and a single categorical variable.
  • The test of homogeneity expands on the test for a difference in two population proportions that we learned in Inference for Two Proportions by comparing the distribution of the categorical variable across multiple groups or populations.
  • The null hypothesis says that the distribution of proportions for all categories is the same in each group or population. The alternative hypothesis says that the distributions differ.
  • To test our hypotheses, we select a random sample from each population or subgroup independently. We gather data for one categorical variable.
  • Example: Is the rate of steroid use different for different men’s collegiate sports (baseball, basketball, football, tennis, track/field)? Randomly select a sample of athletes from each sport and ask them anonymously if they use steroids.

The difference between these two tests is subtle. They differ primarily in study design. In the test of independence, we select individuals at random from a population and record data for two categorical variables. The null hypothesis says that the variables are independent. In the test of homogeneity, we select random samples from each subgroup or population separately and collect data on a single categorical variable. The null hypothesis says that the distribution of the categorical variable is the same for each subgroup or population.

Both tests use the same chi-square test statistic.

Chi-Square Test Statistic and Distribution

For all chi-square tests, the chi-square test statistic χ 2 is the same. It measures how far the observed data are from the null hypothesis by comparing observed counts and expected counts. Expected counts are the counts we expect to see if the null hypothesis is true.

The chi-square model is a family of curves that depend on degrees of freedom. For a two-way table, the degrees of freedom equals ( r − 1)( c − 1). All chi-square curves are skewed to the right with a mean equal to the degrees of freedom.

A chi-square model is a good fit for the distribution of the chi-square test statistic only if the following conditions are met:

  • The sample is randomly selected.
  • All expected counts are 5 or greater.

If these conditions are met, we use the chi-square distribution to find the P-value. We use the same logic that we have used in all hypothesis tests to draw a conclusion based on the P-value. If the P-value is at least as small as the significance level, we reject the null hypothesis and accept the alternative hypothesis. The P-value is the likelihood that results from random samples have a χ 2 value equal to or greater than that calculated from the data if the null hypothesis is true.

Contributors and Attributions

  • Concepts in Statistics. Provided by : Open Learning Initiative. Located at : http://oli.cmu.edu . License : CC BY: Attribution

16.1 Two Categorical Distributions

To see how two quantitative variables are related, you could use the correlation coefficient to measure linear association. But how should we decide whether two categorical variables are related? For example, how can we decide whether an attribute is related to an individual's class? It's an important question to answer, because if the attribute is not related to the class then you can leave it out of your classifier.

In the breast cancer data, let's see if mitotic activity is related to the class. We have labeled the classes "Cancer" and "Not Cancer" for ease of reference later.

[Table of Class and Mitoses values; 673 rows omitted]

We can use pivot and proportions (defined in the previous section) to visualize the distribution of Mitoses in the two classes.

[Bar charts: distribution of Mitoses for the 'Cancer' and 'Not Cancer' classes]

The distribution of Mitoses for the 'Cancer' class has a long thin tail compared to the distribution for the 'Not Cancer' class which is overwhelmingly at the lowest rating.

So it looks as though class and mitotic activity are related. But could this be just due to chance?

To understand where chance comes in, remember that the data are like a random sample from a larger population – the population that contains the new individuals whom we might want to classify. It could be that in the population, class and mitosis were independent of each other, and just appear to be related in the sample due to chance.

The Hypotheses

Let's try to answer the question by performing a test of the following hypotheses.

Null Hypothesis. In the population, class and mitosis ratings are independent of each other; in other words, the distribution of mitoses is the same for the two classes. The distributions are different in the sample only due to chance.

Alternative Hypothesis. In the population, class and mitosis ratings are related.

To see how to test this, let's look at the data again.

Random Permutations

If class and mitosis ratings are unrelated, then it doesn't matter in what order the Mitoses values appear – since they are not related to the values in Class , all rearrangements should be equally likely. This is the same as the approach that we took when analyzing the football Deflategate data.

So let's shuffle all the Mitoses values into an array called shuffled_mitoses . You can see its first item below, but it contains 683 items because it is a permutation (that is, a rearrangement) of the entire Mitoses column.

Let's augment the table mitoses with a column containing the shuffled values.

Let's look at the distributions of mitoses for the shuffled data, using the same process that we followed with the original data.

The distributions of the shuffled data in the two classes can be visualized in bar charts just as the original data were.

[Bar charts: distributions of the shuffled Mitoses values in the two classes]

That looks a bit different from the original bar charts, shown below again for convenience.

[Original bar charts of the Mitoses distributions, repeated for comparison]

A Test Statistic: Total Variation Distance

We need a test statistic that measures the difference between the blue distribution and the gold. Recall that total variation distance can be used to quantify how different two categorical distributions are.
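Total variation distance between two categorical distributions is half the sum of the absolute differences of their proportions. A minimal sketch, with two hypothetical distributions over three categories:

```python
def total_variation_distance(dist1, dist2):
    """TVD between two categorical distributions, given as equal-length
    lists of proportions that each sum to 1."""
    return sum(abs(p - q) for p, q in zip(dist1, dist2)) / 2

# Hypothetical distributions over three categories
d1 = [0.60, 0.30, 0.10]
d2 = [0.20, 0.30, 0.50]
print(total_variation_distance(d1, d2))
```

Halving the sum ensures that the distance lies between 0 (identical distributions) and 1 (no overlap at all).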

In the original sample, the total variation distance between the distributions of mitoses in the two classes was about 0.4:

But in the shuffled sample it was quite a bit smaller:

The randomly permuted mitosis ratings and the original ratings don't seem to be behaving the same way. But the random shuffle could come out differently if we run it again. Let's reshuffle and redo the calculation of the total variation distance.

The total variation distance is still quite a bit smaller than the 0.42 we got from the original data. To see how much it could vary, we have to repeat the random shuffling procedure many times, in a process that has by now become familiar.

Empirical Distribution of the TVD, Under the Null Hypothesis

If the null hypothesis were true, all permutations of mitosis ratings would be equally likely. There are large numbers of possible permutations; let's do 5000 of them and see how our test statistic varies. The code is exactly the same as above, except that now we will collect all 5000 distances and draw their empirical histogram.

[Empirical histogram of 5,000 total variation distances computed from shuffled data]

The observed total variation distance of 0.42 is nowhere near the distribution generated assuming the null hypothesis is true. The data support the alternative: the mitosis ratings are related to class.

Permutation Test for the Equality of Two Categorical Distributions

The test that we performed above is called a permutation test of the null hypothesis that the two samples are drawn from the same underlying distribution.

To define a function that performs the test, we can just copy the code from the previous cell and change the names of tables and columns. The function permutation_test_tvd takes the name of the data table, the label of the column containing the categorical variable whose distribution the test is about, the label of the column containing the binary class variable, and the number of random permutations to run.

In our example above, we didn't compute a P-value because the observed value was far away from the null distribution of the statistic. In general, however, we should compute the P-value as the statistic might not be so extreme in other examples. The P-value is the chance, assuming that the null is true, of getting a distance as big as or bigger than the distance that was observed, because the alternative hypothesis predicts larger distances than the null.
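The whole procedure can be sketched without any special library. The labels and ratings below are hypothetical stand-ins for the class and mitosis columns; the empirical P-value is the share of shuffled distances at least as large as the observed one:

```python
import random
from collections import Counter

def distribution(values, categories):
    """Proportion of each category among the given values."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]

def tvd(d1, d2):
    """Total variation distance between two distributions."""
    return sum(abs(p - q) for p, q in zip(d1, d2)) / 2

# Hypothetical data: 100 subjects with a binary class label and a 1-3 rating
random.seed(42)
classes = ["A"] * 50 + ["B"] * 50
ratings = [random.choice([1, 1, 1, 2]) if c == "A" else random.choice([1, 2, 2, 3])
           for c in classes]
categories = sorted(set(ratings))

def tvd_between_classes(labels):
    """TVD between the rating distributions of the two label groups."""
    group_a = [r for r, lab in zip(ratings, labels) if lab == "A"]
    group_b = [r for r, lab in zip(ratings, labels) if lab == "B"]
    return tvd(distribution(group_a, categories),
               distribution(group_b, categories))

observed = tvd_between_classes(classes)

# Build the null distribution by repeatedly shuffling the class labels
shuffled_tvds = []
labels = classes[:]
for _ in range(1000):
    random.shuffle(labels)
    shuffled_tvds.append(tvd_between_classes(labels))

# Empirical P-value: share of shuffled distances >= the observed distance
p_value = sum(t >= observed for t in shuffled_tvds) / len(shuffled_tvds)
print(observed, p_value)
```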

[Empirical histogram of shuffled total variation distances for clump thickness, with the observed distance of 0.64 far to the right]

Once again, the observed distance of 0.64 is very far away from the distribution predicted by the null hypothesis. The empirical P-value is 0, so the exact P-value will be close to zero. Thus if class and clump thickness were unrelated, the observed data would be hugely unlikely.

So the conclusion is that clump thickness is related to class, not just in the sample but in the population.

We have used permutation tests to help decide whether the distribution of a categorical attribute is related to class. In general, permutation tests can be used in this way to decide whether two categorical distributions were randomly sampled from the same underlying distribution.



  12. Chi-Square Test of Independence

    If two categorical variables are independent, then the value of one variable does not change the probability distribution of the other. If two categorical variables are related, then the distribution of one depends on the level the other. ... We failed to reject the null hypothesis and found evidence that treatment and symptoms are independent ...

  13. SAS Tutorials: Chi-Square Test of Independence

    Statistical independence or association between two categorical variables. ... Since the p-value is less than our chosen significance level α = 0.05, we can reject the null hypothesis, and conclude that there is an association between class rank and whether or not students live on-campus.

  14. Chi-squared test

    Chi-squared distribution, showing χ 2 on the x-axis and p-value (right tail probability) on the y-axis.. A chi-squared test (also chi-square or χ 2 test) is a statistical hypothesis test used in the analysis of contingency tables when the sample sizes are large. In simpler terms, this test is primarily used to examine whether two categorical variables (two dimensions of the contingency table ...

  15. Examples of null and alternative hypotheses

    Inference for categorical data: Proportions > The idea of significance tests ... The null hypothesis is often stated as the assumption that there is no change, no difference between two groups, or no relationship between two variables. The alternative hypothesis, on the other hand, is the statement that there is a change, difference, or ...

  16. 9.1: Null and Alternative Hypotheses

    The actual test begins by considering two hypotheses.They are called the null hypothesis and the alternative hypothesis.These hypotheses contain opposing viewpoints. \(H_0\): The null hypothesis: It is a statement of no difference between the variables—they are not related. This can often be considered the status quo and as a result if you cannot accept the null it requires some action.

  17. 11.9: Test of Homogeneity

    Step 1: State the hypotheses. In the test of homogeneity, the null hypothesis says that the distribution of a categorical response variable is the same in each population. In this example, the categorical response variable is steroid use (yes or no). The populations are the three NCAA divisions. H 0: The proportion of athletes using steroids is ...

  18. 16.1 Two Categorical Distributions · GitBook

    The test that we performed above is called a permutation test of the null hypothesis that the two samples are drawn from the same underlying distribution. ... The function permutation_test_tvd takes the name of the data table, the label of the column containing the categorical variable whose distribution the test is about, ...