Hypothesis Testing - Chi Squared Test

Lisa Sullivan, PhD

Professor of Biostatistics

Boston University School of Public Health

Introduction

This module will continue the discussion of hypothesis testing, where a specific statement or hypothesis is generated about a population parameter, and sample statistics are used to assess the likelihood that the hypothesis is true. The hypothesis is based on available information and the investigator's belief about the population parameters. The specific tests considered here are called chi-square tests and are appropriate when the outcome is discrete (dichotomous, ordinal or categorical). For example, in some clinical trials the outcome is a classification such as hypertensive, pre-hypertensive or normotensive. We could use the same classification in an observational study such as the Framingham Heart Study to compare men and women in terms of their blood pressure status - again using the classification of hypertensive, pre-hypertensive or normotensive status.  

The technique to analyze a discrete outcome uses what is called a chi-square test. Specifically, the test statistic follows a chi-square probability distribution. We will consider chi-square tests here with one, two and more than two independent comparison groups.

Learning Objectives

After completing this module, the student will be able to:

  • Perform chi-square tests by hand
  • Appropriately interpret results of chi-square tests
  • Identify the appropriate hypothesis testing procedure based on type of outcome variable and number of samples

Tests with One Sample, Discrete Outcome

Here we consider hypothesis testing with a discrete outcome variable in a single population. Discrete variables are variables that take on more than two distinct responses or categories and the responses can be ordered or unordered (i.e., the outcome can be ordinal or categorical). The procedure we describe here can be used for dichotomous (exactly 2 response options), ordinal or categorical discrete outcomes and the objective is to compare the distribution of responses, or the proportions of participants in each response category, to a known distribution. The known distribution is derived from another study or report and it is again important in setting up the hypotheses that the comparator distribution specified in the null hypothesis is a fair comparison. The comparator is sometimes called an external or a historical control.   

In one sample tests for a discrete outcome, we set up our hypotheses against an appropriate comparator. We select a sample and compute descriptive statistics on the sample data. Specifically, we compute the sample size (n) and the proportions of participants in each response

Test Statistic for Testing H0: p1 = p10, p2 = p20, ..., pk = pk0

χ² = Σ (O - E)² / E

We find the critical value in a table of probabilities for the chi-square distribution with degrees of freedom (df) = k-1. In the test statistic, O = observed frequency and E = expected frequency in each of the response categories. The observed frequencies are those observed in the sample and the expected frequencies are computed as described below. χ² (chi-square) is another probability distribution and ranges from 0 to ∞. The test statistic formula above is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories.

When we conduct a χ² test, we compare the observed frequencies in each response category to the frequencies we would expect if the null hypothesis were true. These expected frequencies are determined by allocating the sample to the response categories according to the distribution specified in H0. This is done by multiplying the observed sample size (n) by the proportions specified in the null hypothesis (p10, p20, ..., pk0). To ensure that the sample size is appropriate for the use of the test statistic above, we need to check the following: min(np10, np20, ..., npk0) ≥ 5.
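As a minimal sketch (Python), the expected frequencies and the large-sample condition can be computed directly from the sample size and the null proportions; the values below are taken from the exercise example that follows:

```python
# Expected frequencies and large-sample condition for the
# chi-square goodness-of-fit test.
n = 470
p_null = [0.60, 0.25, 0.15]  # proportions specified in H0

expected = [n * p for p in p_null]  # E = n * p_k0 for each category
adequate = min(expected) >= 5       # each expected frequency at least 5

print(expected)  # [282.0, 117.5, 70.5]
print(adequate)  # True
```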

The test of hypothesis with a discrete outcome measured in a single sample, where the goal is to assess whether the distribution of responses follows a known distribution, is called the χ 2 goodness-of-fit test. As the name indicates, the idea is to assess whether the pattern or distribution of responses in the sample "fits" a specified population (external or historical) distribution. In the next example we illustrate the test. As we work through the example, we provide additional details related to the use of this new test statistic.  

A University conducted a survey of its recent graduates to collect demographic and health information for future planning purposes as well as to assess students' satisfaction with their undergraduate experiences. The survey revealed that a substantial proportion of students were not engaging in regular exercise, many felt their nutrition was poor and a substantial number were smoking. In response to a question on regular exercise, 60% of all graduates reported getting no regular exercise, 25% reported exercising sporadically and 15% reported exercising regularly as undergraduates. The next year the University launched a health promotion campaign on campus in an attempt to increase health behaviors among undergraduates. The program included modules on exercise, nutrition and smoking cessation. To evaluate the impact of the program, the University again surveyed graduates and asked the same questions. The survey was completed by 470 graduates and the following data were collected on the exercise question:

No Regular Exercise    Sporadic Exercise    Regular Exercise    Total
255                    125                  90                  470

Based on the data, is there evidence of a shift in the distribution of responses to the exercise question following the implementation of the health promotion campaign on campus? Run the test at a 5% level of significance.

In this example, we have one sample and a discrete (ordinal) outcome variable (with three response options). We specifically want to compare the distribution of responses in the sample to the distribution reported the previous year (i.e., 60%, 25%, 15% reporting no, sporadic and regular exercise, respectively). We now run the test using the five-step approach.  

  • Step 1. Set up hypotheses and determine level of significance.

The null hypothesis again represents the "no change" or "no difference" situation. If the health promotion campaign has no impact then we expect the distribution of responses to the exercise question to be the same as that measured prior to the implementation of the program.

H0: p1 = 0.60, p2 = 0.25, p3 = 0.15, or equivalently H0: Distribution of responses is 0.60, 0.25, 0.15

H1: H0 is false.          α = 0.05

Notice that the research hypothesis is written in words rather than in symbols. The research hypothesis as stated captures any difference in the distribution of responses from that specified in the null hypothesis. We do not specify a specific alternative distribution, instead we are testing whether the sample data "fit" the distribution in H 0 or not. With the χ 2 goodness-of-fit test there is no upper or lower tailed version of the test.

  • Step 2. Select the appropriate test statistic.  

The test statistic is:

χ² = Σ (O - E)² / E

We must first assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) ≥ 5. The sample size here is n=470 and the proportions specified in the null hypothesis are 0.60, 0.25 and 0.15. Thus, min(470(0.60), 470(0.25), 470(0.15)) = min(282, 117.5, 70.5) = 70.5. The sample size is more than adequate so the formula can be used.

  • Step 3. Set up decision rule.  

The decision rule for the χ 2 test depends on the level of significance and the degrees of freedom, defined as degrees of freedom (df) = k-1 (where k is the number of response categories). If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ 2 statistic will be close to zero. If the null hypothesis is false, then the χ 2 statistic will be large. Critical values can be found in a table of probabilities for the χ 2 distribution. Here we have df=k-1=3-1=2 and a 5% level of significance. The appropriate critical value is 5.99, and the decision rule is as follows: Reject H 0 if χ 2 > 5.99.

  • Step 4. Compute the test statistic.  

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) and the expected frequencies into the formula for the test statistic identified in Step 2. The computations can be organized as follows.

                          No Regular Exercise    Sporadic Exercise    Regular Exercise    Total
Observed Frequency (O)    255                    125                  90                  470
Expected Frequency (E)    282.0                  117.5                70.5                470

Notice that the expected frequencies are taken to one decimal place and that the sum of the observed frequencies is equal to the sum of the expected frequencies. The test statistic is computed as follows:

χ² = (255 - 282)²/282 + (125 - 117.5)²/117.5 + (90 - 70.5)²/70.5 = 2.59 + 0.48 + 5.39 = 8.46

  • Step 5. Conclusion.  

We reject H 0 because 8.46 > 5.99. We have statistically significant evidence at α=0.05 to show that H 0 is false, or that the distribution of responses is not 0.60, 0.25, 0.15.  The p-value is p < 0.005.  

In the χ² goodness-of-fit test, we conclude that either the distribution specified in H0 is false (when we reject H0) or that we do not have sufficient evidence to show that the distribution specified in H0 is false (when we fail to reject H0). Here, we rejected H0 and concluded that the distribution of responses to the exercise question following the implementation of the health promotion campaign was not the same as the distribution prior to the campaign. The test itself does not provide details of how the distribution has shifted. A comparison of the observed and expected frequencies will provide some insight into the shift (when the null hypothesis is rejected). Does it appear that the health promotion campaign was effective?

Consider the following: 

If the null hypothesis were true (i.e., no change from the prior year) we would have expected more students to fall in the "No Regular Exercise" category and fewer in the "Regular Exercise" category. In the sample, 255/470 = 54% reported no regular exercise and 90/470 = 19% reported regular exercise. Thus, there is a shift toward more regular exercise following the implementation of the health promotion campaign. There is evidence of a statistical difference, but is this a meaningful difference? Is there room for improvement?
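The arithmetic in this example can be verified with a short script. The counts 255 (none) and 90 (regular) appear above; the sporadic-exercise count of 125 is implied by the total of n=470:

```python
# Verify the goodness-of-fit statistic for the exercise example.
observed = [255, 125, 90]        # no, sporadic, regular exercise
p_null = [0.60, 0.25, 0.15]      # prior year's distribution (H0)
n = sum(observed)                # 470

expected = [n * p for p in p_null]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(chi_sq, 2))  # 8.46, which exceeds the critical value 5.99
```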

The National Center for Health Statistics (NCHS) provided data on the distribution of weight (in categories) among Americans in 2002. The distribution was based on specific values of body mass index (BMI) computed as weight in kilograms over height in meters squared. Underweight was defined as BMI< 18.5, Normal weight as BMI between 18.5 and 24.9, overweight as BMI between 25 and 29.9 and obese as BMI of 30 or greater. Americans in 2002 were distributed as follows: 2% Underweight, 39% Normal Weight, 36% Overweight, and 23% Obese. Suppose we want to assess whether the distribution of BMI is different in the Framingham Offspring sample. Using data from the n=3,326 participants who attended the seventh examination of the Offspring in the Framingham Heart Study we created the BMI categories as defined and observed the following:

  • Step 1.  Set up hypotheses and determine level of significance.

H0: p1 = 0.02, p2 = 0.39, p3 = 0.36, p4 = 0.23     or equivalently

H0: Distribution of responses is 0.02, 0.39, 0.36, 0.23

H1: H0 is false.        α = 0.05

The formula for the test statistic is:

χ² = Σ (O - E)² / E

We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20, ..., npk0) ≥ 5. The sample size here is n=3,326 and the proportions specified in the null hypothesis are 0.02, 0.39, 0.36 and 0.23. Thus, min(3326(0.02), 3326(0.39), 3326(0.36), 3326(0.23)) = min(66.5, 1297.1, 1197.4, 765.0) = 66.5. The sample size is more than adequate, so the formula can be used.

Here we have df=k-1=4-1=3 and a 5% level of significance. The appropriate critical value is 7.81 and the decision rule is as follows: Reject H 0 if χ 2 > 7.81.

We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) into the formula for the test statistic identified in Step 2. We organize the computations in the following table.

The test statistic is computed as follows:

We reject H 0 because 233.53 > 7.81. We have statistically significant evidence at α=0.05 to show that H 0 is false or that the distribution of BMI in Framingham is different from the national data reported in 2002, p < 0.005.  

Again, the χ² goodness-of-fit test allows us to assess whether the distribution of responses "fits" a specified distribution. Here we show that the distribution of BMI in the Framingham Offspring Study is different from the national distribution. To understand the nature of the difference we can compare observed and expected frequencies or observed and expected proportions (or percentages). The frequencies are large because of the large sample size; the observed percentages of patients in the Framingham sample are as follows: 0.6% underweight, 28% normal weight, 41% overweight and 30% obese. In the Framingham Offspring sample there are higher percentages of overweight and obese persons (41% and 30% in Framingham as compared to 36% and 23% in the national data), and lower proportions of underweight and normal weight persons (0.6% and 28% in Framingham as compared to 2% and 39% in the national data). Are these meaningful differences?
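The BMI computation can be reproduced as a sketch. The observed counts below are inferred from the percentages reported in the text (0.6%, 28%, 41% and 30% of n=3,326); they are an approximate reconstruction, not the original data:

```python
# Check of the BMI goodness-of-fit test. Counts are reconstructed from
# the reported percentages and are approximate.
observed = [20, 932, 1374, 1000]     # under, normal, over, obese
p_null = [0.02, 0.39, 0.36, 0.23]    # 2002 NCHS distribution (H0)
n = sum(observed)                    # 3326

expected = [n * p for p in p_null]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(min(expected), 1))  # 66.5: the large-sample condition holds
print(round(chi_sq, 1))  # 233.6 with these counts; the text reports 233.53
```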

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable in a single population. We presented a test using a test statistic Z to test whether an observed (sample) proportion differed significantly from a historical or external comparator. The chi-square goodness-of-fit test can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square goodness-of-fit test.

The NCHS report indicated that in 2002, 75% of children aged 2 to 17 saw a dentist in the past year. An investigator wants to assess whether use of dental services is similar in children living in the city of Boston. A sample of 125 children aged 2 to 17 living in Boston are surveyed and 64 reported seeing a dentist over the past 12 months. Is there a significant difference in use of dental services between children living in Boston and the national data?

We presented the following approach to the test using a Z statistic. 

  • Step 1. Set up hypotheses and determine level of significance

H0: p = 0.75

H1: p ≠ 0.75                               α = 0.05

We must first check that the sample size is adequate. Specifically, we need to check min(np0, n(1-p0)) = min(125(0.75), 125(1-0.75)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the following formula can be used:

Z = (p̂ - p0) / √(p0(1 - p0)/n)

This is a two-tailed test, using a Z statistic and a 5% level of significance. Reject H 0 if Z < -1.960 or if Z > 1.960.

We now substitute the sample data into the formula for the test statistic identified in Step 2. The sample proportion is:

p̂ = 64/125 = 0.512

Z = (0.512 - 0.75) / √(0.75(0.25)/125) = -0.238/0.0387 = -6.15

We reject H0 because -6.15 < -1.960. We have statistically significant evidence at α=0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001).

We now conduct the same test using the chi-square goodness-of-fit test. First, we summarize our sample data as follows:

Saw a Dentist in Past 12 Months    Did Not See a Dentist    Total
64                                 61                       125

H0: p1 = 0.75, p2 = 0.25     or equivalently H0: Distribution of responses is 0.75, 0.25

We must assess whether the sample size is adequate. Specifically, we need to check min(np10, np20) ≥ 5. The sample size here is n=125 and the proportions specified in the null hypothesis are 0.75 and 0.25. Thus, min(125(0.75), 125(0.25)) = min(93.75, 31.25) = 31.25. The sample size is more than adequate so the formula can be used.

Here we have df=k-1=2-1=1 and a 5% level of significance. The appropriate critical value is 3.84, and the decision rule is as follows: Reject H0 if χ² > 3.84. (Note that 1.96² = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

The test statistic is computed as follows:

χ² = (64 - 93.75)²/93.75 + (61 - 31.25)²/31.25 = 9.44 + 28.32 = 37.8

(Note that (-6.15)² = 37.8, where -6.15 was the value of the Z statistic in the test for proportions shown above.)

We reject H0 because 37.8 > 3.84. We have statistically significant evidence at α=0.05 to show that there is a statistically significant difference in the use of dental services by children living in Boston as compared to the national data (p < 0.0001). This is the same conclusion we reached when we conducted the test using the Z test above. With a dichotomous outcome, Z² = χ²! In statistics, there are often several approaches that can be used to test hypotheses.
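The Z² = χ² equivalence can be checked numerically with the dental-services data (64 of 125 children, null proportion 0.75):

```python
import math

# One-sample Z test vs. chi-square goodness-of-fit for the dental example.
n = 125
successes = 64   # children who saw a dentist in the past 12 months
p0 = 0.75        # null proportion from the NCHS report

p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

observed = [successes, n - successes]
expected = [n * p0, n * (1 - p0)]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(z, 2))       # -6.15
print(round(chi_sq, 1))  # 37.8
print(abs(z ** 2 - chi_sq) < 1e-6)  # True: Z squared equals chi-square
```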

Tests for Two or More Independent Samples, Discrete Outcome

Here we extend the application of the chi-square test to the case with two or more independent comparison groups. Specifically, the outcome of interest is discrete with two or more responses and the responses can be ordered or unordered (i.e., the outcome can be dichotomous, ordinal or categorical). We now consider the situation where there are two or more independent comparison groups and the goal of the analysis is to compare the distribution of responses to the discrete outcome variable among several independent comparison groups.

The test is called the χ 2 test of independence and the null hypothesis is that there is no difference in the distribution of responses to the outcome across comparison groups. This is often stated as follows: The outcome variable and the grouping variable (e.g., the comparison treatments or comparison groups) are independent (hence the name of the test). Independence here implies homogeneity in the distribution of the outcome among comparison groups.    

The null hypothesis in the χ 2 test of independence is often stated in words as: H 0 : The distribution of the outcome is independent of the groups. The alternative or research hypothesis is that there is a difference in the distribution of responses to the outcome variable among the comparison groups (i.e., that the distribution of responses "depends" on the group). In order to test the hypothesis, we measure the discrete outcome variable in each participant in each comparison group. The data of interest are the observed frequencies (or number of participants in each response category in each group). The formula for the test statistic for the χ 2 test of independence is given below.

Test Statistic for Testing H0: Distribution of outcome is independent of groups

χ² = Σ (O - E)² / E

and we find the critical value in a table of probabilities for the chi-square distribution with df = (r-1)(c-1).

Here O = observed frequency, E=expected frequency in each of the response categories in each group, r = the number of rows in the two-way table and c = the number of columns in the two-way table.   r and c correspond to the number of comparison groups and the number of response options in the outcome (see below for more details). The observed frequencies are the sample data and the expected frequencies are computed as described below. The test statistic is appropriate for large samples, defined as expected frequencies of at least 5 in each of the response categories in each group.  

The data for the χ 2 test of independence are organized in a two-way table. The outcome and grouping variable are shown in the rows and columns of the table. The sample table below illustrates the data layout. The table entries (blank below) are the numbers of participants in each group responding to each response category of the outcome variable.

Table - Possible outcomes are listed in the columns; the groups being compared are listed in the rows.

In the table above, the grouping variable is shown in the rows of the table; r denotes the number of independent groups. The outcome variable is shown in the columns of the table; c denotes the number of response options in the outcome variable. Each combination of a row (group) and column (response) is called a cell of the table. The table has r*c cells and is sometimes called an r x c ("r by c") table. For example, if there are 4 groups and 5 categories in the outcome variable, the data are organized in a 4 X 5 table. The row and column totals are shown along the right-hand margin and the bottom of the table, respectively. The total sample size, N, can be computed by summing the row totals or the column totals. Similar to ANOVA, N does not refer to a population size here but rather to the total sample size in the analysis. The sample data can be organized into a table like the above. The numbers of participants within each group who select each response option are shown in the cells of the table and these are the observed frequencies used in the test statistic.

The test statistic for the χ 2 test of independence involves comparing observed (sample data) and expected frequencies in each cell of the table. The expected frequencies are computed assuming that the null hypothesis is true. The null hypothesis states that the two variables (the grouping variable and the outcome) are independent. The definition of independence is as follows:

 Two events, A and B, are independent if P(A|B) = P(A), or equivalently, if P(A and B) = P(A) P(B).

The second statement indicates that if two events, A and B, are independent then the probability of their intersection can be computed by multiplying the probability of each individual event. To conduct the χ 2 test of independence, we need to compute expected frequencies in each cell of the table. Expected frequencies are computed by assuming that the grouping variable and outcome are independent (i.e., under the null hypothesis). Thus, if the null hypothesis is true, using the definition of independence:

P(Group 1 and Response Option 1) = P(Group 1) P(Response Option 1).

The above states that the probability that an individual is in Group 1 and their outcome is Response Option 1 is computed by multiplying the probability that a person is in Group 1 by the probability that a person is in Response Option 1. To conduct the χ² test of independence, we need expected frequencies and not expected probabilities. To convert the above probability to a frequency, we multiply by N. Consider the following small example.

The data shown above are measured in a sample of size N=150. The frequencies in the cells of the table are the observed frequencies. If Group and Response are independent, then we can compute the probability that a person in the sample is in Group 1 and Response category 1 using:

P(Group 1 and Response 1) = P(Group 1) P(Response 1),

P(Group 1 and Response 1) = (25/150) (62/150) = 0.069.

Thus if Group and Response are independent we would expect 6.9% of the sample to be in the top left cell of the table (Group 1 and Response 1). The expected frequency is 150(0.069) = 10.4.   We could do the same for Group 2 and Response 1:

P(Group 2 and Response 1) = P(Group 2) P(Response 1),

P(Group 2 and Response 1) = (50/150) (62/150) = 0.138.

The expected frequency in Group 2 and Response 1 is 150(0.138) = 20.7.

Thus, the formula for determining the expected cell frequencies in the χ 2 test of independence is as follows:

Expected Cell Frequency = (Row Total * Column Total)/N.

The above computes the expected frequency in one step rather than computing the expected probability first and then converting to a frequency.  
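A quick check (Python) of the shortcut, using the marginal totals from the small example above (row total 25, column total 62, N=150). Note that the one-step formula avoids the small rounding error introduced by truncating the probability to 0.069 first:

```python
# Expected cell frequency two ways for the Group 1 / Response 1 cell.
N = 150
row_total = 25   # Group 1 total
col_total = 62   # Response 1 total

# Two-step: probability first (as in the text), then convert to a frequency.
prob = (row_total / N) * (col_total / N)   # about 0.0689, reported as 0.069
via_prob = N * prob                         # about 10.33

# One-step shortcut: (row total * column total) / N.
direct = row_total * col_total / N          # about 10.33

print(round(prob, 3))    # 0.069
print(round(direct, 1))  # 10.3 (the text's 10.4 reflects rounding 0.069 first)
```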

In a prior example we evaluated data from a survey of university graduates which assessed, among other things, how frequently they exercised. The survey was completed by 470 graduates. In the prior example we used the χ 2 goodness-of-fit test to assess whether there was a shift in the distribution of responses to the exercise question following the implementation of a health promotion campaign on campus. We specifically considered one sample (all students) and compared the observed distribution to the distribution of responses the prior year (a historical control). Suppose we now wish to assess whether there is a relationship between exercise on campus and students' living arrangements. As part of the same survey, graduates were asked where they lived their senior year. The response options were dormitory, on-campus apartment, off-campus apartment, and at home (i.e., commuted to and from the university). The data are shown below.

Based on the data, is there a relationship between exercise and students' living arrangements? Do you think where a person lives affects their exercise status? Here we have four independent comparison groups (living arrangement) and a discrete (ordinal) outcome variable with three response options. We specifically want to test whether living arrangement and exercise are independent. We will run the test using the five-step approach.

  • Step 1. Set up hypotheses and determine level of significance.

H0: Living arrangement and exercise are independent

H1: H0 is false.                α = 0.05

The null and research hypotheses are written in words rather than in symbols. The research hypothesis is that the grouping variable (living arrangement) and the outcome variable (exercise) are dependent or related.   

  • Step 2. Select the appropriate test statistic.

The test statistic is:

χ² = Σ (O - E)² / E

The condition for appropriate use of the above test statistic is that each expected frequency is at least 5. In Step 4 we will compute the expected frequencies and we will ensure that the condition is met.

  • Step 3. Set up decision rule.

The decision rule depends on the level of significance and the degrees of freedom, defined as df = (r-1)(c-1), where r and c are the numbers of rows and columns in the two-way data table. The row variable is the living arrangement and there are 4 arrangements considered, thus r=4. The column variable is exercise and 3 responses are considered, thus c=3. For this test, df=(4-1)(3-1)=3(2)=6. Again, with χ² tests there are no upper, lower or two-tailed tests. If the null hypothesis is true, the observed and expected frequencies will be close in value and the χ² statistic will be close to zero. If the null hypothesis is false, then the χ² statistic will be large. The rejection region for the χ² test of independence is always in the upper (right-hand) tail of the distribution. For df=6 and a 5% level of significance, the appropriate critical value is 12.59 and the decision rule is as follows: Reject H0 if χ² > 12.59.

  • Step 4. Compute the test statistic.

We now compute the expected frequencies using the formula,

Expected Frequency = (Row Total * Column Total)/N.

The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number is the expected frequency.   The expected frequencies are shown in parentheses.

Notice that the expected frequencies are taken to one decimal place and that the sums of the observed frequencies are equal to the sums of the expected frequencies in each row and column of the table.  

Recall in Step 2 a condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 9.6) and therefore it is appropriate to use the test statistic.

  • Step 5. Conclusion.

We reject H0 because 60.5 > 12.59. We have statistically significant evidence at α=0.05 to show that H0 is false or that living arrangement and exercise are not independent (i.e., they are dependent or related), p < 0.005.

Again, the χ 2 test of independence is used to test whether the distribution of the outcome variable is similar across the comparison groups. Here we rejected H 0 and concluded that the distribution of exercise is not independent of living arrangement, or that there is a relationship between living arrangement and exercise. The test provides an overall assessment of statistical significance. When the null hypothesis is rejected, it is important to review the sample data to understand the nature of the relationship. Consider again the sample data. 

Because there are different numbers of students in each living situation, comparing exercise patterns on the basis of frequencies alone is difficult. The following table displays the percentages of students in each exercise category by living arrangement. The percentages sum to 100% in each row of the table. For comparison purposes, percentages are also shown for the total sample along the bottom row of the table.

From the above, it is clear that higher percentages of students living in dormitories and in on-campus apartments reported regular exercise (31% and 23%) as compared to students living in off-campus apartments and at home (10% each).  
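The full computation for this example can be reproduced with a short script. The cell counts below are a reconstruction consistent with the marginal totals (255, 125, 90 across exercise categories) and the row percentages reported in the text; they are approximate, not a verbatim copy of the original table:

```python
# Chi-square test of independence: living arrangement (rows) by
# exercise (columns: none, sporadic, regular). Counts reconstructed
# from the percentages reported in the text.
observed = [
    [32, 30, 28],    # Dormitory
    [74, 64, 42],    # On-campus apartment
    [110, 25, 15],   # Off-campus apartment
    [39, 6, 5],      # At home
]
N = sum(sum(row) for row in observed)                 # 470
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]     # [255, 125, 90]

# Expected frequency in each cell: (row total * column total) / N.
expected = [[r * c / N for c in col_totals] for r in row_totals]

chi_sq = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(4) for j in range(3)
)

print(round(min(min(row) for row in expected), 1))  # 9.6: condition met
print(round(chi_sq, 1))  # 60.4 with these counts; the text reports 60.5
```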

Test Yourself

 Pancreaticoduodenectomy (PD) is a procedure that is associated with considerable morbidity. A study was recently conducted on 553 patients who had a successful PD between January 2000 and December 2010 to determine whether their Surgical Apgar Score (SAS) is related to 30-day perioperative morbidity and mortality. The table below gives the number of patients experiencing no, minor, or major morbidity by SAS category.  

Question: What would be an appropriate statistical test to examine whether there is an association between Surgical Apgar Score and patient outcome? Using 14.13 as the value of the test statistic for these data, carry out the appropriate test at a 5% level of significance. Show all parts of your test.

In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable and two independent comparison groups. We presented a test using a test statistic Z to test for equality of independent proportions. The chi-square test of independence can also be used with a dichotomous outcome and the results are mathematically equivalent.  

In the prior module, we considered the following example. Here we show the equivalence to the chi-square test of independence.

A randomized trial is designed to evaluate the effectiveness of a newly developed pain reliever designed to reduce pain in patients following joint replacement surgery. The trial compares the new pain reliever to the pain reliever currently in use (called the standard of care). A total of 100 patients undergoing joint replacement surgery agreed to participate in the trial. Patients were randomly assigned to receive either the new pain reliever or the standard pain reliever following surgery and were blind to the treatment assignment. Before receiving the assigned treatment, patients were asked to rate their pain on a scale of 0-10 with higher scores indicative of more pain. Each patient was then given the assigned treatment and after 30 minutes was again asked to rate their pain on the same scale. The primary outcome was a reduction in pain of 3 or more scale points (defined by clinicians as a clinically meaningful reduction). The following data were observed in the trial.

We tested whether there was a significant difference in the proportions of patients reporting a meaningful reduction (i.e., a reduction of 3 or more scale points) using a Z statistic, as follows. 

H0: p1 = p2

H1: p1 ≠ p2                             α = 0.05

Here the new or experimental pain reliever is group 1 and the standard pain reliever is group 2.

We must first check that the sample size is adequate. Specifically, we need to ensure that we have at least 5 successes and 5 failures in each comparison group, or that min(n1p̂1, n1(1 - p̂1), n2p̂2, n2(1 - p̂2)) ≥ 5. In this example the condition is met in both groups. Therefore, the sample size is adequate, so the following formula can be used:

Z = (p̂1 - p̂2) / √(p̂(1 - p̂)(1/n1 + 1/n2))

Reject H 0 if Z < -1.960 or if Z > 1.960.

We now substitute the sample data into the formula for the test statistic identified in Step 2. We first compute the overall proportion of successes:

We now substitute to compute the test statistic.

  • Step 5. Conclusion. We reject H0 because Z = 2.53 > 1.960. There is statistically significant evidence at α = 0.05 of a difference in the proportions of patients reporting a meaningful reduction in pain.

We now conduct the same test using the chi-square test of independence.  

H0: Treatment and outcome (meaningful reduction in pain) are independent

H1: H0 is false.          α = 0.05

The formula for the test statistic is:

χ² = Σ (O − E)² / E

where O is the observed frequency and E is the expected frequency in each cell.

For this test, df = (2-1)(2-1) = 1. At a 5% level of significance, the appropriate critical value is 3.84 and the decision rule is as follows: Reject H0 if χ² > 3.84. (Note that 1.96² = 3.84, where 1.96 was the critical value used in the Z test for proportions shown above.)

We now compute the expected frequencies using:

Expected frequency = (row total × column total) / N

The computations can be organized in a two-way table. The top number in each cell of the table is the observed frequency and the bottom number, shown in parentheses, is the expected frequency.
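The bookkeeping described above can be sketched in a few lines of Python. The counts below are hypothetical placeholders (the trial's actual table is not reproduced in the text); the point is the expected-frequency rule E = (row total × column total) / N.

```python
# Sketch of the expected-frequency bookkeeping for a two-way table.
# The counts below are hypothetical placeholders, NOT the trial's actual data.
observed = [
    [28, 22],  # group 1: [meaningful reduction, no meaningful reduction]
    [18, 32],  # group 2
]

n = sum(sum(row) for row in observed)             # grand total
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Under independence, E(cell) = (row total * column total) / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Print each cell as "observed (expected)", matching the table layout
for obs_row, exp_row in zip(observed, expected):
    print("  ".join("%d (%.1f)" % (o, e) for o, e in zip(obs_row, exp_row)))
```

Note that every expected count here is well above 5, which is the same adequacy condition checked in the text.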

A condition for the appropriate use of the test statistic was that each expected frequency is at least 5. This is true for this sample (the smallest expected frequency is 22.0) and therefore it is appropriate to use the test statistic.

(Note that (2.53)² = 6.4, where 2.53 was the value of the Z statistic in the test for proportions shown above.)

Chi-Squared Tests in R

The video below by Mike Marin demonstrates how to perform chi-squared tests in the R programming language.

Answer to Problem on Pancreaticoduodenectomy and Surgical Apgar Scores

We have 3 independent comparison groups (Surgical Apgar Score) and a categorical outcome variable (morbidity/mortality). We can run a Chi-Squared test of independence.

H0: Apgar scores and patient outcome are independent of one another.

HA: Apgar scores and patient outcome are not independent.

Chi-squared = 14.3

Since 14.3 is greater than the critical value of 9.49, we reject H0.

There is an association between Apgar scores and patient outcome. The lowest Apgar score group (0 to 4) experienced the highest percentage of major morbidity or mortality (16 out of 57=28%) compared to the other Apgar score groups.


Lesson 9 of 24 By Avijeet Biswal

A Complete Guide to Chi-Square Test


The world is constantly curious about the Chi-Square test's application in machine learning and how it makes a difference. Feature selection is a critical topic in machine learning, as you will have multiple features in line and must choose the best ones to build the model. By examining the relationships between the features, the chi-square test aids in solving feature selection problems. In this tutorial, you will learn about the chi-square test and its application.

What Is a Chi-Square Test?

The Chi-Square test is a statistical procedure for determining whether there is a significant difference between observed and expected data. It can also be used to assess whether two categorical variables in our data are associated, helping us judge whether a difference between them is due to chance or to a real relationship.

Chi-Square Test Definition

A chi-square test is a statistical test that is used to compare observed and expected results. The goal of this test is to identify whether a disparity between actual and predicted data is due to chance or to a link between the variables under consideration. As a result, the chi-square test is an ideal choice for aiding in our understanding and interpretation of the connection between our two categorical variables.

A chi-square test or comparable nonparametric test is required to test a hypothesis regarding the distribution of a categorical variable. Categorical variables, which indicate categories such as animals or countries, can be nominal or ordinal. They cannot have a normal distribution since they can only have a few particular values.

For example, a meal delivery firm in India wants to investigate the link between gender, geography, and people's food preferences.

It is used to determine whether a difference between two categorical variables is:

  • a result of chance, or
  • due to a relationship between them.


Formula For Chi-Square Test

χ²_c = Σ (O_i − E_i)² / E_i

c = Degrees of freedom

O = Observed Value

E = Expected Value

The degrees of freedom in a statistical calculation represent the number of variables that can vary in a calculation. The degrees of freedom can be calculated to ensure that chi-square tests are statistically valid. These tests are frequently used to compare observed data with data that would be expected to be obtained if a particular hypothesis were true.

The observed values are those you gather yourself.

The expected values are the frequencies expected, based on the null hypothesis. 
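As a quick illustration, the formula above can be written as a small Python function; the observed and expected counts in the example call are made up for demonstration.

```python
# The chi-square formula chi2 = sum over cells of (O - E)^2 / E,
# written as a small function. The example counts are made up for illustration.
def chi_square(observed, expected):
    """Return the chi-square statistic for paired observed/expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Two of the three categories deviate from expectation here:
print(round(chi_square([10, 20, 30], [15, 15, 30]), 2))  # -> 3.33
```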

Fundamentals of Hypothesis Testing

Hypothesis testing is a technique for interpreting and drawing inferences about a population based on sample data. It aids in determining which sample data best support mutually exclusive population claims.

Null Hypothesis (H0) - The Null Hypothesis is the assumption that the event will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected.

H0 is the symbol for it, and it is pronounced H-naught.

Alternate Hypothesis(H1 or Ha) - The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.


What Are Categorical Variables?

Categorical variables belong to a subset of variables that can be divided into discrete categories. Names or labels are the most common categories. These variables are also known as qualitative variables because they depict the variable's quality or characteristics.

Categorical variables can be divided into two categories:

  • Nominal Variable: A nominal variable's categories have no natural ordering. Example: Gender, Blood groups
  • Ordinal Variable: A variable whose categories can be ordered is an ordinal variable. Customer satisfaction (Excellent, Very Good, Good, Average, Bad, and so on) is an example.

Why Do You Use the Chi-Square Test?

Chi-square is a statistical test that examines the differences between categorical variables from a random sample in order to determine whether the expected and observed results are well-fitting.

Here are some of the uses of the Chi-Squared test:

  • The Chi-squared test can be used to see if your data follows a well-known theoretical probability distribution like the Normal or Poisson distribution.
  • The Chi-squared test allows you to assess your trained regression model's goodness of fit on the training, validation, and test data sets.


What Does A Chi-Square Statistic Test Tell You?

A Chi-Square test (symbolically represented as χ²) is fundamentally a data analysis based on the observations of a random set of variables. It computes how a model compares to actual observed data. A Chi-Square statistic is calculated from data which must be raw, random, drawn from independent variables, drawn from a wide-ranging sample, and mutually exclusive. In simple terms, two sets of statistical data are compared - for instance, the results of tossing a fair coin. Karl Pearson introduced this test in 1900 for categorical data analysis and distribution, which is why it is also known as ‘Pearson’s Chi-Squared Test’.

Chi-Squared Tests are most commonly used in hypothesis testing. A hypothesis is an assumption that any given condition might be true, which can be tested afterwards. The Chi-Square test estimates the size of inconsistency between the expected results and the actual results when the size of the sample and the number of variables in the relationship is mentioned. 

These tests use degrees of freedom to determine if a particular null hypothesis can be rejected based on the total number of observations made in the experiments. The larger the sample size, the more reliable the result.

There are two main types of Chi-Square tests, namely:

  • Independence
  • Goodness-of-Fit

The Chi-Square Test of Independence is an inferential (derivable) statistical test which examines whether two sets of variables are likely to be related to each other or not. This test is used when we have counts of values for two nominal or categorical variables and is considered a non-parametric test. A relatively large sample size and independence of observations are the required criteria for conducting this test.

For Example- 

In a movie theatre, suppose we made a list of movie genres. Let us consider this as the first variable. The second variable is whether or not the people who came to watch those genres of movies bought snacks at the theatre. Here the null hypothesis is that the genre of the film and whether people bought snacks are unrelated. If this is true, the movie genres don’t impact snack sales.


Goodness-Of-Fit

In statistical hypothesis testing, the Chi-Square Goodness-of-Fit test determines whether a variable is likely to come from a given distribution or not. We must have a set of data values and an idea of the distribution of this data. We can use this test when we have value counts for categorical variables. This test demonstrates a way of deciding if the data values have a “good enough” fit to our idea, or whether they are a representative sample of the entire population.

Suppose we have bags of balls with five different colours in each bag. The given condition is that the bag should contain an equal number of balls of each colour. The idea we would like to test here is that the proportions of the five colours of balls in each bag are equal.
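A minimal sketch of this goodness-of-fit idea in Python, assuming hypothetical colour counts for one bag:

```python
# Goodness-of-fit sketch for the ball-colour example.
# H0: each of the five colours appears in equal proportion.
# The observed counts are hypothetical.
observed = [44, 56, 70, 50, 30]                             # one bag, five colours
expected = [sum(observed) / len(observed)] * len(observed)  # 250 / 5 = 50 each

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                                      # 5 categories -> df = 4

# At alpha = 0.05 and df = 4 the critical value is 9.488; chi2 = 17.44
# exceeds it, so we would reject H0 of equal proportions for these counts.
print(chi2, df)  # -> 17.44 4
```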

Who Uses Chi-Square Analysis?

Chi-square is most commonly used by researchers who are studying survey response data because it applies to categorical variables. Demography, consumer and marketing research, political science, and economics are all examples of this type of research.

Let's say you want to know if gender has anything to do with political party preference. You poll 440 voters in a simple random sample to find out which political party they prefer. The results of the survey are shown in the table below:

[Table: observed counts of political party preference by gender]

To see if gender is linked to political party preference, perform a Chi-Square test of independence using the steps below.

Step 1: Define the Hypothesis

H0: There is no link between gender and political party preference.

H1: There is a link between gender and political party preference.

Step 2: Calculate the Expected Values

Now you will calculate the expected frequency.

Expected value E = (row total × column total) / total number of observations

For example, the expected value for Male Republicans is: 

E(Male, Republican) = (Male row total × Republican column total) / 440

Similarly, you can calculate the expected value for each of the cells.

[Table: expected counts for each cell]

Step 3: Calculate (O − E)² / E for Each Cell in the Table

Now you will calculate (O − E)² / E for each cell in the table.

[Table: (O − E)² / E values for each cell]

Step 4: Calculate the Test Statistic χ²

χ² is the sum of all the values in the last table:

χ² = 0.743 + 2.05 + 2.33 + 3.33 + 0.384 + 1 ≈ 9.83

Before you can conclude, you must first determine the critical statistic, which requires determining the degrees of freedom. The degrees of freedom here equal the table's number of rows minus one multiplied by the table's number of columns minus one, or (r - 1)(c - 1). We have (3 - 1)(2 - 1) = 2.

Finally, you compare the obtained statistic to the critical statistic found in the chi-square table. For an alpha level of 0.05 and two degrees of freedom, the critical statistic is 5.991, which is less than the obtained statistic of 9.83. You can reject the null hypothesis because the obtained statistic is higher than the critical statistic.

This means you have sufficient evidence to say that there is an association between gender and political party preference.
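The steps above can be sketched end to end in Python. The gender-by-party counts below are hypothetical (the article's actual survey table is an image not reproduced here), so the resulting statistic differs from the 9.83 in the text, but the procedure is identical.

```python
# End-to-end sketch of the independence test above, with HYPOTHETICAL
# gender-by-party counts (the article's actual table is an image and is
# not reproduced here), so the statistic differs from the text's 9.83.
observed = [
    [60, 25, 15],   # Male:   [Republican, Democrat, Independent]
    [40, 35, 25],   # Female
]
n = sum(map(sum, observed))
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Step 2: expected counts under independence
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Steps 3-4: sum (O - E)^2 / E over every cell
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (2-1)(3-1) = 2

# Critical value at alpha = 0.05, df = 2 is 5.991; chi2 ~ 8.17 exceeds it,
# so for these hypothetical counts we would reject H0 of independence.
print(round(chi2, 2), df)  # -> 8.17 2
```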


When to Use a Chi-Square Test?

A Chi-Square Test is used to examine whether the observed results are consistent with the expected values. When the data to be analysed comes from a random sample, and when the variable in question is categorical, the Chi-Square test is the most appropriate choice. A categorical variable consists of selections such as breeds of dogs, types of cars, genres of movies, educational attainment, male vs. female, etc. Survey responses and questionnaires are the primary sources of these types of data, so the Chi-square test is most commonly used for analysing them. This type of analysis is helpful for researchers studying survey response data, in fields ranging from customer and marketing research to political science and economics.


Chi-Square Distribution 

Chi-square distributions (χ²) are a type of continuous probability distribution. They're commonly utilized in hypothesis testing, such as the chi-square goodness of fit and independence tests. The parameter k, which represents the degrees of freedom, determines the shape of a chi-square distribution.

Very few real-world observations follow a chi-square distribution. The objective of chi-square distributions is to test hypotheses, not to describe real-world distributions. In contrast, most other commonly used distributions, such as the normal and Poisson distributions, can describe important things like baby birth weights or illness cases per year.

Because of its close relationship to the standard normal distribution, the chi-square distribution is excellent for hypothesis testing. Many essential statistical tests rely on the standard normal distribution.

In statistical analysis, the Chi-Square distribution is used in many hypothesis tests and is determined by the parameter k, the degrees of freedom. It belongs to the family of continuous probability distributions. The sum of the squares of k independent standard normal random variables follows a Chi-Squared distribution with k degrees of freedom. Pearson's Chi-Square Test formula is:

χ² = Σ (O − E)² / E

Where χ² is the Chi-Square test statistic,

Σ is the summation over all observations,

O is the observed result, and

E is the expected result.

The shape of the distribution graph changes as the value of k, the degrees of freedom, increases.

When k is 1 or 2, the Chi-square distribution curve is shaped like a backwards ‘J’. It means there is a high chance that χ² is close to zero.


When k is greater than 2, the distribution curve is hump-shaped, with a low probability that χ² is very near to 0 or very far from 0. The distribution has a longer tail on the right-hand side than on the left. The most probable value of χ² is k − 2.

When k is greater than ninety, the Chi-square distribution is well approximated by a normal distribution.


Chi-Square P-Values

Here P denotes probability; the Chi-Square statistic is used to calculate a p-value, and different p-values lead to different conclusions about the hypothesis.

  • P ≤ 0.05: the null hypothesis is rejected
  • P > 0.05: the null hypothesis is not rejected

The concepts of probability and statistics are entangled with Chi-Square Test. Probability is the estimation of something that is most likely to happen. Simply put, it is the possibility of an event or outcome of the sample. Probability can understandably represent bulky or complicated data. And statistics involves collecting and organising, analysing, interpreting and presenting the data. 

Finding P-Value

When you run a Chi-square test, you'll get a test statistic called χ². You have two options for determining whether this test statistic is statistically significant at some alpha level:

  • Compare the test statistic χ² to a critical value from the Chi-square distribution table.
  • Compare the p-value of the test statistic χ² to a chosen alpha level.

Test statistics are calculated by taking into account the sampling distribution of the test statistic under the null hypothesis, the sample data, and the approach which is chosen for performing the test. 

The p-value will be as mentioned in the following cases.

  • Lower-tailed test: p-value = P(TS ≤ ts | H0 is true) = cdf(ts)
  • Upper-tailed test: p-value = P(TS ≥ ts | H0 is true) = 1 − cdf(ts)
  • Two-sided test (assuming the distribution of the test statistic under H0 is symmetric about 0): p-value = 2 · P(TS ≥ |ts| | H0 is true) = 2 · (1 − cdf(|ts|))

P: probability of an event

TS: the test statistic; ts is its observed value computed from your sample

cdf(): the cumulative distribution function of the test statistic's distribution under H0
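As a concrete illustration of the cdf relationships above: for a chi-square distribution with 2 degrees of freedom, the cdf has the closed form cdf(x) = 1 − e^(−x/2), so the upper-tail p-value can be computed directly, without tables. This is a sketch for the df = 2 case only.

```python
import math

# For df = 2 the chi-square CDF has the closed form cdf(x) = 1 - exp(-x/2),
# so the upper-tail p-value P(TS >= ts | H0) is simply exp(-ts/2).
# (For other df values you would use a chi-square table or a stats library.)
def chi2_p_value_df2(ts):
    """Upper-tail p-value for a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-ts / 2)

# The alpha = .05 critical value for df = 2 is 5.991, so the p-value there
# should sit right at the 0.05 boundary:
print(round(chi2_p_value_df2(5.991), 3))  # -> 0.05
```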

Types of Chi-square Tests

Pearson's chi-square tests are classified into two types:

  • Chi-square goodness-of-fit analysis
  • Chi-square independence test

Mathematically, these are the same test. However, because they are used for distinct goals, we generally think of them as separate tests.

The chi-square test has the following significant properties:

  • The variance is equal to twice the number of degrees of freedom.
  • As the degrees of freedom increase, the chi-square distribution curve approaches a normal distribution.
  • The mean of the distribution is equal to the number of degrees of freedom.


Limitations of Chi-Square Test

There are two limitations to using the chi-square test that you should be aware of. 

  • The chi-square test, for starters, is extremely sensitive to sample size. Even insignificant relationships can appear statistically significant when a large enough sample is used. Keep in mind that "statistically significant" does not always imply "meaningful" when using the chi-square test.
  • Be mindful that the chi-square can only determine whether two variables are related. It does not necessarily follow that one variable has a causal relationship with the other. It would require a more detailed analysis to establish causality.


Chi-Square Goodness of Fit Test

When there is only one categorical variable, the chi-square goodness of fit test can be used. The frequency distribution of the categorical variable is evaluated to determine whether it differs significantly from what you expected. The usual idea is that the categories will have equal proportions; however, this is not always the case.

When you want to see if there is a link between two categorical variables, you perform the chi-square test of independence. To obtain the test statistic and its related p-value in SPSS, use the chisq option on the statistics subcommand of the crosstabs command. Remember that the chi-square test assumes that each cell's expected value is five or greater.

In this tutorial titled ‘The Complete Guide to Chi-square test’, you explored the concept of the Chi-square distribution and how to find the related values. You also took a look at how the critical value and the chi-square value are related to each other.

If you want to gain more insight and get a work-ready understanding in statistical concepts and learn how to use them to get into a career in Data Analytics , our Post Graduate Program in Data Analytics in partnership with Purdue University should be your next stop. A comprehensive program with training from top practitioners and in collaboration with IBM, this will be all that you need to kickstart your career in the field. 

Was this tutorial on the Chi-square test useful to you? Do you have any doubts or questions for us? Mention them in this article's comments section, and we'll have our experts answer them for you at the earliest!

1) What is the chi-square test used for? 

The chi-square test is a statistical method used to determine if there is a significant association between two categorical variables. It helps researchers understand whether the observed distribution of data differs from the expected distribution, allowing them to assess whether any relationship exists between the variables being studied.

2) What is the chi-square test and its types? 

The chi-square test is a statistical test used to analyze categorical data and assess the independence or association between variables. There are two main types of chi-square tests: a) Chi-square test of independence: This test determines whether there is a significant association between two categorical variables. b) Chi-square goodness-of-fit test: This test compares the observed data to the expected data to assess how well the observed data fit the expected distribution.

3) What is the chi-square test easily explained? 

The chi-square test is a statistical tool used to check if two categorical variables are related or independent. It helps us understand if the observed data differs significantly from the expected data. By comparing the two datasets, we can draw conclusions about whether the variables have a meaningful association.

4) What is the difference between t-test and chi-square? 

The t-test and the chi-square test are two different statistical tests used for different types of data. The t-test is used to compare the means of two groups and is suitable for continuous numerical data. On the other hand, the chi-square test is used to examine the association between two categorical variables. It is applicable to discrete, categorical data. So, the choice between the t-test and chi-square test depends on the nature of the data being analyzed.

5) What are the characteristics of chi-square? 

The chi-square test has several key characteristics:

1) It is non-parametric, meaning it does not assume a specific probability distribution for the data.

2) It is sensitive to sample size; larger samples can result in more significant outcomes.

3) It works with categorical data and is used for hypothesis testing and analyzing associations.

4) The test output provides a p-value, which indicates the level of significance for the observed relationship between variables.

5) It can be used with different levels of significance (e.g., 0.05 or 0.01) to determine statistical significance.


About the author.

Avijeet Biswal

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.


Chi-Square (Χ²) Test & How To Calculate Formula Equation

Benjamin Frimodig

Science Expert

B.A., History and Science, Harvard University

Ben Frimodig is a 2021 graduate of Harvard College, where he studied the History of Science.


Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Chi-square (χ2) is used to test hypotheses about the distribution of observations into categories with no inherent ranking.

What Is a Chi-Square Statistic?

The Chi-square test (pronounced Kai) looks at the pattern of observations and will tell us if certain combinations of the categories occur more frequently than we would expect by chance, given the total number of times each category occurred.

It looks for an association between the variables. We cannot use a correlation coefficient to look for the patterns in this data because the categories often do not form a continuum.

There are three main types of Chi-square tests: the test of goodness of fit, the test of independence, and the test for homogeneity. All three tests rely on the same formula to compute a test statistic.

These tests function by deciphering relationships between observed sets of data and theoretical or “expected” sets of data that align with the null hypothesis.

What is a Contingency Table?

Contingency tables (also known as two-way tables) are grids in which Chi-square data is organized and displayed. They provide a basic picture of the interrelation between two variables and can help find interactions between them.

In contingency tables, one variable and each of its categories are listed vertically, and the other variable and each of its categories are listed horizontally.

Additionally, including column and row totals, also known as “marginal frequencies,” will help facilitate the Chi-square testing process.

In order for the Chi-square test to be considered trustworthy, each cell of your expected contingency table must have a value of at least five.

Each Chi-square test will have one contingency table representing observed counts (see Fig. 1) and one contingency table representing expected counts (see Fig. 2).


Figure 1. Observed table (which contains the observed counts).

To obtain the expected frequencies for any cell in any cross-tabulation in which the two variables are assumed independent, multiply the row and column totals for that cell and divide the product by the total number of cases in the table.


Figure 2. Expected table (what we expect the two-way table to look like if the two categorical variables are independent).

To decide if our calculated value for χ2 is significant, we also need to work out the degrees of freedom for our contingency table using the following formula: df= (rows – 1) x (columns – 1).

Formula Calculation

χ² = Σ (O − E)² / E

Calculate the chi-square statistic (χ²) by completing the following steps:

  • Calculate the expected frequencies and the observed frequencies.
  • For each observed number in the table, subtract the corresponding expected number (O − E).
  • Square the difference: (O − E)².
  • Divide the square obtained for each cell in the table by the expected number for that cell: (O − E)² / E.
  • Sum all the values for (O − E)² / E. This is the chi-square statistic.
  • Calculate the degrees of freedom for the contingency table using the formula df = (rows − 1) × (columns − 1).
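The steps above can be collected into one small function. This is a sketch, not a library implementation; the 2×2 counts and the abbreviated critical-value table (α = .05) below are for illustration.

```python
# The calculation steps above collected into one function. A sketch, not a
# library implementation; the 2x2 counts and the abbreviated critical-value
# table (alpha = .05) below are for illustration.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_square_test(table):
    """Return (chi-square statistic, df, significant at alpha = .05)."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for o_row, e_row in zip(table, expected)
               for o, e in zip(o_row, e_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, chi2 > CRITICAL_05[df]

chi2, df, significant = chi_square_test([[30, 20], [20, 30]])
print(chi2, df, significant)  # -> 4.0 1 True
```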

Once we have calculated the degrees of freedom (df) and the chi-squared value (χ2), we can use the χ2 table (often at the back of a statistics book) to check if our value for χ2 is higher than the critical value given in the table. If it is, then our result is significant at the level given.

Interpretation

The chi-square statistic tells you how much difference exists between the observed count in each table cell and the count you would expect if there were no relationship at all in the population.

Small Chi-Square Statistic: If the chi-square statistic is small and the p-value is large (usually greater than 0.05), this often indicates that the observed frequencies in the sample are close to what would be expected under the null hypothesis.

The null hypothesis usually states no association between the variables being studied or that the observed distribution fits the expected distribution.

In theory, if the observed and expected values were equal (no difference), then the chi-square statistic would be zero — but this is unlikely to happen in real life.

Large Chi-Square Statistic : If the chi-square statistic is large and the p-value is small (usually less than 0.05), then the conclusion is often that the data does not fit the model well, i.e., the observed and expected values are significantly different. This often leads to the rejection of the null hypothesis.

How to Report

To report a chi-square output in an APA-style results section, always rely on the following template:

χ²(degrees of freedom, N = sample size) = chi-square statistic value, p = p value.

chi-squared-spss output

In the case of the above example, the results would be written as follows:

A chi-square test of independence showed that there was a significant association between gender and post-graduation education plans, χ2 (4, N = 101) = 54.50, p < .001.

APA Style Rules

  • Do not use a zero before a decimal when the statistic cannot be greater than 1 (proportion, correlation, level of statistical significance).
  • Report exact p values to two or three decimals (e.g., p = .006, p = .03).
  • However, report p values less than .001 as “ p < .001.”
  • Put a space before and after a mathematical operator (e.g., minus, plus, greater than, less than, equals sign).
  • Do not repeat statistics in both the text and a table or figure.

p-value Interpretation

You test whether a given χ² is statistically significant by checking it against a table of chi-square distributions, according to the number of degrees of freedom for your sample, which is the number of categories minus 1. The chi-square test assumes that you have at least 5 observations per category.

If you are using SPSS, then you will have an exact p-value.

For a chi-square test, a p-value that is less than or equal to the .05 significance level indicates that the observed values are different to the expected values.

Thus, low p-values (p< .05) indicate a likely difference between the theoretical population and the collected sample. You can conclude that a relationship exists between the categorical variables.

Remember that p-values do not indicate the odds that the null hypothesis is true; rather, they provide the probability that one would obtain the sample distribution observed (or a more extreme distribution) if the null hypothesis were true.
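This tail-probability interpretation can be checked numerically. For 2 degrees of freedom the chi-square tail probability happens to have the closed form P(χ² ≥ x) = e^(−x/2), so a standard-library sketch suffices (the test statistic 7.0 is made up for illustration):

```python
import math

# Hypothetical chi-square statistic with df = 2 (made-up for illustration).
chi_sq = 7.0

# For df = 2 the upper-tail probability has the closed form e^(-x/2):
# the probability of a statistic at least this extreme if H0 were true.
p_value = math.exp(-chi_sq / 2)

print(f"p = {p_value:.4f}")  # → p = 0.0302
print("reject H0" if p_value < 0.05 else "fail to reject H0")  # → reject H0
```

A small p-value here says only that data this extreme would be rare under the null hypothesis, not that the null hypothesis itself has a 3% chance of being true.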

A hypothesis test can never establish the null hypothesis with certainty, so the null is never "accepted." Conclusions are therefore stated as either failing to reject the null hypothesis or rejecting it in favor of the alternative, depending on the calculated p-value.

The six steps below show you how to analyze your data using a chi-square goodness-of-fit test in SPSS.

Step 1 : Analyze > Nonparametric Tests > Legacy Dialogs > Chi-square… on the top menu as shown below:

Step 2 : Move the variable indicating categories into the “Test Variable List:” box.

Step 3 : If you want to test the hypothesis that all categories are equally likely, simply click “OK”; the remaining steps are only needed when the expected proportions are unequal.

Step 4 : Otherwise, specify the expected count for each category by first clicking the “Values” button under “Expected Values.”

Step 5 : Then, in the box to the right of “Values,” enter the expected count for category one and click the “Add” button. Now enter the expected count for category two and click “Add.” Continue in this way until all expected counts have been entered.

Step 6 : Then click “OK.”
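For readers without SPSS, the same goodness-of-fit computation can be sketched in plain Python (the counts and the equal-proportions null hypothesis are made up for illustration):

```python
# Observed counts for three categories (hypothetical data).
observed = [25, 35, 40]
n = sum(observed)  # 100

# Null hypothesis: all categories equally likely.
expected = [n / len(observed)] * len(observed)  # [33.33, 33.33, 33.33]

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"chi-square = {chi_sq:.2f}, df = {len(observed) - 1}")
# → chi-square = 3.50, df = 2
```

Compared with the df = 2, α = 0.05 critical value of 5.991, this statistic would not lead to rejecting the hypothesis of equal proportions.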

The five steps below show you how to analyze your data using a chi-square test of independence in SPSS Statistics.

Step 1 : Open the Crosstabs dialog (Analyze > Descriptive Statistics > Crosstabs).

Step 2 : Select the variables you want to compare using the chi-square test. Click one variable in the left window and then click the arrow at the top to move the variable. Select the row variable and the column variable.

Step 3 : Click Statistics (a new pop-up window will appear). Check Chi-square, then click Continue.

Step 4 : (Optional) Check the box for Display clustered bar charts.

Step 5 : Click OK.
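The same crosstabs analysis can be reproduced outside SPSS with SciPy's `chi2_contingency` (a sketch using made-up counts; `correction=False` gives the uncorrected Pearson statistic):

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 cross-tabulation of two categorical variables.
table = [[20, 30],
         [30, 20]]

# Pearson chi-square test of independence (Yates correction disabled).
stat, p, dof, expected = chi2_contingency(table, correction=False)

print(f"chi-square = {stat:.2f}, df = {dof}, p = {p:.4f}")
# → chi-square = 4.00, df = 1, p = 0.0455
```

The function also returns the table of expected counts, which is useful for checking the at-least-5-expected-per-cell assumption.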

Goodness-of-Fit Test

The Chi-square goodness of fit test is used to compare a randomly collected sample containing a single, categorical variable to a larger population.

This test is most commonly used to compare a random sample against the hypothesized distribution of the population from which it was drawn.

The test begins with the creation of a null and alternative hypothesis. In this case, the hypotheses are as follows:

Null Hypothesis (Ho) : The null hypothesis (Ho) is that the observed frequencies are the same (except for chance variation) as the expected frequencies. The collected data is consistent with the population distribution.

Alternative Hypothesis (Ha) : The collected data is not consistent with the population distribution.

The next step is to create a table of expected frequencies that represents how the data would be distributed if the null hypothesis were exactly correct.

The sample’s overall deviation from this theoretical/expected data will allow us to draw a conclusion, with a more severe deviation resulting in smaller p-values.

Test for Independence

The Chi-square test for independence looks for an association between two categorical variables within the same population.

Unlike the goodness of fit test, the test for independence does not compare a single observed variable to a theoretical population but rather two variables within a sample set to one another.

The hypotheses for a Chi-square test of independence are as follows:

Null Hypothesis (Ho) : There is no association between the two categorical variables in the population of interest.

Alternative Hypothesis (Ha) : There is an association between the two categorical variables in the population of interest.

The next step is to create a contingency table of expected values that reflects how a data set that perfectly aligns the null hypothesis would appear.

The simplest way to do this is to calculate the marginal totals of each row and column; the expected frequency of each cell is equal to the product of the row total and the column total that correspond to that cell in the observed contingency table, divided by the total sample size.
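The marginal-totals calculation can be sketched directly (the 2 × 2 observed counts are hypothetical):

```python
# Hypothetical observed contingency table.
observed = [[120, 80],
            [60, 140]]

row_totals = [sum(row) for row in observed]        # [200, 200]
col_totals = [sum(col) for col in zip(*observed)]  # [180, 220]
n = sum(row_totals)                                # 400

# Expected count for each cell: (row total * column total) / sample size.
expected = [[r * c / n for c in col_totals] for r in row_totals]

print(expected)  # → [[90.0, 110.0], [90.0, 110.0]]
```

Note that the expected table preserves the observed marginal totals exactly; only the cell-level association is removed.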

Test for Homogeneity

The Chi-square test for homogeneity is organized and executed in exactly the same way as the test for independence.

The main difference to remember between the two is that the test for independence looks for an association between two categorical variables within the same population, while the test for homogeneity determines if the distribution of a variable is the same in each of several populations (thus allocating population itself as the second categorical variable).

Null Hypothesis (Ho) : There is no difference in the distribution of a categorical variable for several populations or treatments.

Alternative Hypothesis (Ha) : There is a difference in the distribution of a categorical variable for several populations or treatments.

The difference between these two tests can be a bit tricky to determine, especially in the practical applications of a Chi-square test. A reliable rule of thumb is to determine how the data was collected.

If the data consists of only one random sample with the observations classified according to two categorical variables, it is a test for independence. If the data consists of more than one independent random sample, it is a test for homogeneity.

What is the chi-square test?

The Chi-square test is a non-parametric statistical test used to determine if there’s a significant association between two or more categorical variables in a sample.

It works by comparing the observed frequencies in each category of a cross-tabulation with the frequencies expected under the null hypothesis, which assumes there is no relationship between the variables.

This test is often used in fields like biology, marketing, sociology, and psychology for hypothesis testing.

What does chi-square tell you?

The Chi-square test tells you whether there is a significant association between two categorical variables. If the calculated Chi-square value is above the critical value from the Chi-square distribution, it suggests a significant relationship between the variables, and the null hypothesis of no association is rejected.

How to calculate chi-square?

To calculate the Chi-square statistic, follow these steps:

1. Create a contingency table of observed frequencies for each category.

2. Calculate expected frequencies for each category under the null hypothesis.

3. Compute the Chi-square statistic using the formula: Χ² = Σ [ (O_i – E_i)² / E_i ], where O_i is the observed frequency and E_i is the expected frequency.

4. Compare the calculated statistic with the critical value from the Chi-square distribution to draw a conclusion.
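The four steps above can be sketched in plain Python (the counts and hypothesized proportions are made up; the critical value 5.991 is the standard df = 2, α = 0.05 cutoff):

```python
# Step 1: observed frequencies for each category (hypothetical survey).
observed = [18, 55, 27]
n = sum(observed)  # 100

# Step 2: expected frequencies under the null hypothesis.
null_proportions = [0.2, 0.5, 0.3]
expected = [p * n for p in null_proportions]  # [20.0, 50.0, 30.0]

# Step 3: chi-square statistic, sum of (O_i - E_i)^2 / E_i.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 4: compare with the critical value (df = 2, alpha = 0.05).
critical = 5.991
print(f"chi-square = {chi_sq:.2f}; reject H0: {chi_sq > critical}")
# → chi-square = 1.00; reject H0: False
```

Here the statistic (1.00) falls well below the critical value, so these hypothetical data are consistent with the null distribution.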


S.4 Chi-Square Tests: Chi-Square Test of Independence

Do you remember how to test the independence of two categorical variables? This test is performed by using a Chi-square test of independence.

Recall that we can summarize two categorical variables within a two-way table, also called an r × c contingency table, where r = number of rows, c = number of columns. Our question of interest is “Are the two variables independent?” This question is set up using the following hypothesis statements: the null hypothesis is that the two variables are independent, and the alternative hypothesis is that they are not independent. The expected count for each cell is

 \[E=\frac{\text{row total}\times\text{column total}}{\text{sample size}}\]

We will compare the value of the test statistic to the critical value of \(\chi_{\alpha}^2\) with the degree of freedom = ( r - 1) ( c - 1), and reject the null hypothesis if \(\chi^2 \gt \chi_{\alpha}^2\).

Example S.4.1 Section  

Is gender independent of education level? A random sample of 395 people was surveyed and each person was asked to report the highest education level they obtained. The data that resulted from the survey are summarized in the following table:

Question : Are gender and education level dependent at a 5% level of significance? In other words, given the data collected above, is there a relationship between the gender of an individual and the level of education that they have obtained?

Here's the table of expected counts:

So, working this out, \(\chi^2= \dfrac{(60−50.886)^2}{50.886} + \cdots + \dfrac{(57 − 48.132)^2}{48.132} = 8.006\)

The critical value of \(\chi^2\) with 3 degrees of freedom is 7.815. Since 8.006 > 7.815, we reject the null hypothesis and conclude that the education level depends on gender at a 5% level of significance.
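This critical-value comparison can be double-checked with SciPy (computing the df = 3 critical value at α = 0.05 and the tail probability of the reported statistic):

```python
from scipy.stats import chi2

df = 3
critical = chi2.ppf(0.95, df)  # 0.95 quantile = critical value at alpha = 0.05
p_value = chi2.sf(8.006, df)   # upper-tail probability of the observed statistic

print(f"critical = {critical:.3f}, p = {p_value:.3f}")
# critical ≈ 7.815, with p just under 0.05
```

The p-value lands slightly below .05, which is exactly what the critical-value comparison (8.006 > 7.815) implies.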

JMP | Statistical Discovery.™ From SAS.


The Chi-Square Test

What is a chi-square test?

A Chi-square test is a hypothesis testing method. Two common Chi-square tests involve checking if observed frequencies in one or more categories match expected frequencies.

Is a Chi-square test the same as a χ² test?

Yes, χ is the Greek symbol Chi.

What are my choices?

If you have a single categorical variable, you use a Chi-square goodness of fit test. If you have two categorical variables, you use a Chi-square test of independence. There are other Chi-square tests, but these two are the most common.

Types of Chi-square tests

You use a Chi-square test for hypothesis tests about whether your data is as expected. The basic idea behind the test is to compare the observed values in your data to the expected values that you would see if the null hypothesis is true.

There are two commonly used Chi-square tests: the Chi-square goodness of fit test and the Chi-square test of independence . Both tests involve variables that divide your data into categories. As a result, people can be confused about which test to use. The table below compares the two tests.

Visit the individual pages for each type of Chi-square test to see examples along with details on assumptions and calculations.

Table 1: Choosing a Chi-square test

How to perform a chi-square test?

For both the Chi-square goodness of fit test and the Chi-square test of independence , you perform the same analysis steps, listed below. Visit the pages for each type of test to see these steps in action.

  • Define your null and alternative hypotheses before collecting your data.
  • Decide on the alpha value. This involves deciding the risk you are willing to take of drawing the wrong conclusion. For example, suppose you set α=0.05 when testing for independence. Here, you have decided on a 5% risk of concluding the two variables are not independent when in reality they are.
  • Check the data for errors.
  • Check the assumptions for the test. (Visit the pages for each test type for more detail on assumptions.)
  • Perform the test and draw your conclusion.

Both Chi-square tests in the table above involve calculating a test statistic. The basic idea behind the tests is that you compare the actual data values with what would be expected if the null hypothesis is true. The test statistic involves finding the squared difference between actual and expected data values, and dividing that difference by the expected data values. You do this for each data point and add up the values.

Then, you compare the test statistic to a theoretical value from the Chi-square distribution . The theoretical value depends on both the alpha value and the degrees of freedom for your data. Visit the pages for each test type for detailed examples.


Chi-Square Test

A chi-squared test (symbolically represented as χ²) is basically a data analysis based on observations of a random set of variables. Usually, it is a comparison of two statistical data sets. The test was introduced by Karl Pearson in 1900 for categorical data analysis and distribution, so it is also referred to as Pearson’s chi-squared test.

The chi-square test is used to estimate how likely the observed data would be if the null hypothesis were true.

A hypothesis is a statement that a given condition might be true, which we can test afterwards. Chi-squared test statistics are usually constructed from a sum of squared errors, or through the sample variance.

Chi-Square Distribution

When the null hypothesis is true, the sampling distribution of the test statistic is called the chi-squared distribution. The chi-squared test helps to determine whether there is a notable difference between the expected frequencies and the observed frequencies in one or more classes or categories, and is used to assess the independence of categorical variables.

Note: The chi-squared test is applicable only to categorical data, such as men and women falling under the category of Gender.

Finding P-Value

P stands for probability here. In statistics, the chi-square test is one way of obtaining a p-value. The different values of p lead to different conclusions about the hypothesis:

  • P ≤ 0.05: the null hypothesis is rejected
  • P > 0.05: we fail to reject the null hypothesis

Probability is all about chance or risk or uncertainty. It is the possibility of the outcome of the sample or the occurrence of an event. But when we talk about statistics, it is more about how we handle various data using different techniques. It helps to represent complicated data or bulk data in a very easy and understandable way. It describes the collection, analysis, interpretation, presentation, and organization of data. The concept of both probability and statistics is related to the chi-squared test.

Also, read:

The following are the important properties of the chi-square test:

  • The variance is equal to two times the number of degrees of freedom.
  • The mean of the distribution is equal to the number of degrees of freedom.
  • The chi-square distribution curve approaches the normal distribution when the degree of freedom increases.

The chi-squared test is done to check if there is any difference between the observed value and expected value. The formula for chi-square can be written as;

Chi-square Test Formula

χ² = ∑(Oᵢ – Eᵢ)²/Eᵢ

where Oᵢ is the observed value and Eᵢ is the expected value.

Chi-Square Test of Independence

The chi-square test of independence, also known as the chi-square test of association, is used to determine whether two categorical variables are associated. It is considered a non-parametric test and is mostly used to test statistical independence.

The chi-square test of independence is not appropriate when the categorical variables represent the pre-test and post-test observations. For this test, the data must meet the following requirements:

  • Two categorical variables
  • Relatively large sample size
  • Categories of variables (two or more)
  • Independence of observations

Example of Categorical Data

Let us take an example of a categorical data where there is a society of 1000 residents with four neighbourhoods, P, Q, R and S. A random sample of 650 residents of the society is taken whose occupations are doctors, engineers and teachers. The null hypothesis is that each person’s neighbourhood of residency is independent of the person’s professional division. The data are categorised as:

We use the 150 sampled residents living in neighbourhood P to estimate what proportion of the whole 1,000 people live in neighbourhood P. In the same way, we take 349/650 to estimate what proportion of the 1,000 are doctors. Under the independence assumption of the null hypothesis, we should “expect” the number of doctors in neighbourhood P to be:

150 × 349/650 ≈ 80.54

So by  the chi-square test formula for that particular cell in the table, we get;

(Observed − Expected)²/Expected = (90 − 80.54)²/80.54 ≈ 1.11
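The expected count and this cell's chi-square contribution can be checked in a couple of lines (observed count 90 for doctors in neighbourhood P, as in the example):

```python
# Expected doctors in neighbourhood P under independence:
# (row total for P) * (doctors total) / sample size.
expected = 150 * 349 / 650
contribution = (90 - expected) ** 2 / expected  # this cell's chi-square term

print(f"expected = {expected:.2f}, contribution = {contribution:.2f}")
# → expected = 80.54, contribution = 1.11
```

The full test statistic would sum such contributions over every cell of the 4 × 3 table.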

Some of the exciting facts about the Chi-square test are given below:

The Chi-square statistic can only be used on counts. We cannot use it for data expressed as percentages, proportions, means or similar derived statistics. For example, if we have 20% of 400 people, we need to convert it to a count, i.e. 80, before running the test.

A chi-square test will give us a p-value. The p-value will tell us whether our test results are significant or not. 

However, to perform a chi-square test and get the p-value, we require two pieces of information:

(1) Degrees of freedom. That’s just the number of categories minus 1.

(2) The alpha level(α). You or the researcher chooses this. The usual alpha level is 0.05 (5%), but you could also have other levels like 0.01 or 0.10.

In elementary statistics, we usually get questions along with the degrees of freedom(DF) and the alpha level. Thus, we don’t usually have to figure out what they are. To get the degrees of freedom, count the categories and subtract 1.

The chi-square distribution table with three probability levels is provided here. The statistic here is used to examine whether distributions of certain variables vary from one another. The categorical variable will produce data in the categories and numerical variables will produce data in numerical form.

The distribution of χ² with (r − 1)(c − 1) degrees of freedom (DF) is represented in the table given below. Here, r represents the number of rows in the two-way table and c represents the number of columns.
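A row of such a table can be regenerated with SciPy's quantile function (here the upper-tail α = 0.05 critical values for the first five degrees of freedom):

```python
from scipy.stats import chi2

# Critical values at significance level 0.05 (upper tail) for df = 1..5.
for df in range(1, 6):
    print(df, round(chi2.ppf(0.95, df), 3))
# → 3.841, 5.991, 7.815, 9.488, 11.07
```

Reading a printed table and calling `chi2.ppf(1 - alpha, df)` are equivalent; the function simply inverts the chi-square cumulative distribution.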

Solved Problem

A survey on cars was conducted in 2011 and determined that 60% of car owners have only one car, 28% have two cars, and 12% have three or more. Supposing that you have decided to conduct your own survey and have collected the data below, determine whether your data support the results of the 2011 study.

Use a significance level of 0.05. Also, given that, out of the 129 car owners surveyed, 73 had one car and 38 had two cars (so the remaining 18 had three or more).

Let us state the null and alternative hypotheses.

H 0 : The proportion of car owners with one, two or three cars is 0.60, 0.28 and 0.12 respectively.

H 1 : The proportion of car owners with one, two or three cars does not match the proposed model.

A Chi-Square goodness of fit test is appropriate because we are examining the distribution of a single categorical variable. 

Let’s tabulate the given information and calculate the required values.

Therefore, χ² = ∑(Oᵢ – Eᵢ)²/Eᵢ = 0.7533

Let’s compare it to the chi-square value for the significance level 0.05. 

The degrees of freedom = 3 – 1 = 2

Using the table, the critical value for a 0.05 significance level with df = 2 is 5.99. 

That means that 95 times out of 100, a sample drawn from a population that follows the proposed distribution will have a χ² value of 5.99 or less.

The Chi-square statistic is only 0.7533, so we fail to reject the null hypothesis.
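The arithmetic can be verified in plain Python (the third category is inferred as 129 − 73 − 38 = 18 owners with three or more cars; the small difference from the 0.7533 reported above comes from intermediate rounding). Because df = 2, the p-value has the closed form e^(−χ²/2):

```python
import math

observed = [73, 38, 18]                  # one, two, three-or-more cars
n = sum(observed)                        # 129
proportions = [0.60, 0.28, 0.12]         # proportions from the 2011 study
expected = [p * n for p in proportions]  # [77.4, 36.12, 15.48]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.exp(-chi_sq / 2)          # exact chi-square tail for df = 2

print(f"chi-square = {chi_sq:.3f}, p = {p_value:.3f}")
```

The statistic (≈ 0.76) is far below the critical value of 5.99 and the p-value is far above 0.05, so the conclusion is unchanged.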



The chi-square test, a cornerstone of statistical analysis, is used to examine the independence of two categorical variables, offering a method to assess observed versus expected frequencies in categorical data. Rather than operating on continuous measurements directly, it quantifies how much observed counts deviate from expected counts, rooted in probability and discrete mathematics. While it differs from the least squares method used for regression on continuous data, both share the goal of minimizing the deviation between observed data and a fitted model. In statistics, understanding and applying the chi-square test provides crucial insight into relationships between variables, which is essential for robust analytical conclusions in research and real-world applications.

What is Chi Square Test?

Chi-Square Distribution

The chi-square distribution is a fundamental probability distribution in statistics, widely used in hypothesis testing and confidence interval estimation for variance. It arises primarily when summing the squares of independent, standard normal variables, and is characterized by its degrees of freedom, which influence its shape. As the degrees of freedom increase, the distribution becomes more symmetric and approaches a normal distribution. This distribution is crucial in constructing the chi-square test for independence and goodness-of-fit tests, helping to determine whether observed frequencies significantly deviate from expected frequencies under a given hypothesis. It is also integral to the analysis of variance (ANOVA) and other statistical procedures that assess the variability among group means.

Finding P-Value

Step 1: Understand the P-Value

The p-value represents the probability of observing a test statistic as extreme as, or more extreme than, the value calculated from the sample data, under the null hypothesis. A low p-value (typically less than 0.05) suggests that the observed data is inconsistent with the null hypothesis, leading to its rejection.

Step 2: Calculate the Test Statistic

Depending on the statistical test being used (like t-test, chi-square test, ANOVA, etc.), first calculate the appropriate test statistic based on your data. This involves different formulas depending on the test and the data structure.

Step 3: Determine the Distribution

Identify the distribution that the test statistic follows under the null hypothesis. For example, the test statistic in a chi-square test follows a chi-square distribution, while a t-test statistic follows a t-distribution.

Step 4: Find the P-Value

Use the distribution identified in Step 3 to find the probability of obtaining a test statistic as extreme as the one you calculated. This can be done using statistical software, tables, or online calculators. You will compare your test statistic to the critical values from the distribution, calculating the area under the curve that lies beyond the test statistic.

Step 5: Interpret the P-Value

  • If the p-value is less than the chosen significance level (usually 0.05) , reject the null hypothesis, suggesting that the effect observed in the data is statistically significant.
  • If the p-value is greater than the significance level , you do not have enough evidence to reject the null hypothesis, and it is assumed that any observed differences could be due to chance.

Practical Example

For a simpler illustration, suppose you’re conducting a two-tailed t-test with a t-statistic of 2.3, and you’re using a significance level of 0.05. You would:

  • Identify that the t-statistic follows a t-distribution with degrees of freedom dependent on your sample size.
  • Using a t-distribution table or software, find the probability that a t-value is at least as extreme as ±2.3.
  • Sum the probabilities of obtaining a t-value of 2.3 or higher and -2.3 or lower. This sum is your p-value.
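This sum-of-two-tails calculation can be sketched with SciPy (the degrees of freedom, 20, are assumed here for illustration since the text leaves the sample size open):

```python
from scipy.stats import t

t_stat = 2.3
df = 20  # assumed degrees of freedom (not specified in the text)

# Two-tailed p-value: probability of |T| >= 2.3 under the t-distribution,
# i.e. the sum of the upper tail beyond 2.3 and the lower tail beyond -2.3.
p_value = 2 * t.sf(abs(t_stat), df)

print(f"p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant")
```

With these assumed degrees of freedom the p-value falls between .01 and .05, so the result would be significant at α = 0.05 but not at α = 0.01.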

Properties of Chi-Square

1. Non-Negativity

  • The chi-square statistic is always non-negative. This property arises because it is computed as the sum of the squares of standardized differences between observed and expected frequencies.

2. Degrees of Freedom

  • The shape and scale of the chi-square distribution are primarily determined by its degrees of freedom, which in turn depend on the number of categories or variables involved in the analysis. The degrees of freedom for a chi-square test are generally calculated as (r − 1)(c − 1) for an r × c contingency table.

3. Distribution Shape

  • The chi-square distribution is skewed to the right, especially with fewer degrees of freedom. As the degrees of freedom increase, the distribution becomes more symmetric and starts to resemble a normal distribution.

4. Additivity

  • The chi-square distributions are additive. This means that if two independent chi-square variables are added together, their sum also follows a chi-square distribution, with degrees of freedom equal to the sum of their individual degrees of freedom.

5. Dependency on Sample Size

  • The chi-square statistic is sensitive to sample size. Larger sample sizes tend to give more reliable estimates of the chi-square statistic, reducing the influence of sampling variability. This property emphasizes the need for adequate sample sizes in experiments intending to use chi-square tests for valid inference.

Chi-Square Formula

χ² = ∑(Oᵢ – Eᵢ)²/Eᵢ

Components of the Formula:

  • χ ² is the chi-square statistic.
  • 𝑂ᵢ​ represents the observed frequency for each category.
  • 𝐸ᵢ​ represents the expected frequency for each category, based on the hypothesis being tested.
  • The summation (∑) is taken over all categories involved in the test.

Chi-Square Test of Independence

The Chi-Square Test of Independence assesses whether two categorical variables are independent, meaning whether the distribution of one variable differs depending on the value of the other variable.

Assumptions

Before conducting the test, certain assumptions must be met:

  • Sample Size : All expected frequencies should be at least 1, and no more than 20% of expected frequencies should be less than 5.
  • Independence : Observations must be independent of each other, typically achieved by random sampling.
  • Data Level : Both variables should be categorical (nominal or ordinal).

Example of Categorical Data

Breakdown of the Table

  • Rows : Represent different categories of pet ownership (Owns a Pet, Does Not Own a Pet).
  • Columns : Represent preferences for types of pet food (Organic, Non-Organic).
  • Cells : Show the frequency of respondents in each combination of categories (e.g., 120 people own a pet and prefer organic pet food).

Below is the representation of a chi-square distribution table with three probability levels (commonly used significance levels: 0.05, 0.01, and 0.001) for degrees of freedom up to 50. The degrees of freedom (DF) for a chi-square test in a contingency table are calculated as (r-1)(c-1), where r is the number of rows and c is the number of columns. This table is vital for determining critical values when testing hypotheses involving categorical data.

This table provides critical values for various degrees of freedom and significance levels, which can be used to determine the likelihood of observing a chi-square statistic at least as extreme as the test statistic calculated from your data, under the assumption that the null hypothesis is true.

Example of Chi-Square Test for Independence

The Chi-square test for independence is a statistical test commonly used to determine if there is a significant relationship between two categorical variables in a population. Let’s go through a detailed example to understand how to apply this test.

Imagine a researcher wants to investigate whether gender (male or female) affects the choice of a major (science or humanities) among university students.

Data Collection

The researcher surveys a sample of 300 students and compiles the data into the following contingency table:

  • Null Hypothesis (H₀): There is no relationship between gender and choice of major.
  • Alternative Hypothesis (H₁): There is a relationship between gender and choice of major.

1. Calculate Expected Counts:

  • Under the null hypothesis, if there’s no relationship between gender and major, the expected count for each cell of the table is calculated by the formula:

Eᵢⱼ = (Row Total × Column Total) / Total Observations

For the ‘Male & Science’ cell:

E(Male, Science) = (150 × 130)/300 = 65

Repeat this for each cell.

Compute Chi-Square Statistic

The chi-square statistic is calculated using:

χ² = ∑(O − E)²/E

Where 𝑂 is the observed frequency, and 𝐸 is the expected frequency. For each cell:

χ² = (70 − 65)²/65 + (80 − 85)²/85 + (60 − 65)²/65 + (90 − 85)²/85 ≈ 1.36

Determine Significance

With 1 degree of freedom (df = (rows − 1) × (columns − 1)), check the critical value from the chi-square distribution table at the desired significance level (e.g., 0.05). If the calculated χ² is greater than the critical value from the table, reject the null hypothesis.
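The whole example can be checked with SciPy; `correction=False` disables the Yates continuity correction so the result matches the unadjusted hand formula:

```python
from scipy.stats import chi2_contingency

# Observed counts: rows = gender (male, female),
# columns = major (science, humanities).
observed = [[70, 80],
            [60, 90]]

stat, p, dof, expected = chi2_contingency(observed, correction=False)

print(f"chi-square = {stat:.2f}, df = {dof}, p = {p:.3f}")
# The statistic is well below the df = 1 critical value of 3.841,
# so we fail to reject the null hypothesis of independence.
```

The returned expected table reproduces the hand calculation (65 and 85 in each row), which is a quick sanity check on the marginal-totals formula.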

What does the Chi-Square value indicate?

The Chi-Square value indicates how much the observed frequencies deviate from the expected frequencies under the null hypothesis of independence. A higher Chi-Square value suggests a greater deviation, which may lead to the rejection of the null hypothesis if the value exceeds the critical value from the Chi-Square distribution table for the given degrees of freedom and significance level.

How do you interpret the results of a Chi-Square Test?

To interpret the results of a Chi-Square Test, compare the calculated Chi-Square statistic to the critical value from the Chi-Square distribution table at your chosen significance level (commonly 0.05 or 0.01). If the calculated value is greater than the critical value, reject the null hypothesis, suggesting a significant association between the variables. If it is less, fail to reject the null hypothesis, indicating no significant association.

What are the limitations of the Chi-Square Test?

The Chi-Square Test assumes that the data are from a random sample, observations are independent, and expected frequencies are sufficiently large, typically at least 5 in each cell of the table. When these conditions are not met, the test results may not be valid. Additionally, the test does not provide information about the direction or strength of the association, only its existence.



Statistics LibreTexts

11.E: The Chi-Square Distribution (Exercises)



These are homework exercises to accompany the Textmap created for "Introductory Statistics" by OpenStax.

11.1: Introduction

11.2: Facts About the Chi-Square Distribution

Decide whether the following statements are true or false.

As the number of degrees of freedom increases, the graph of the chi-square distribution looks more and more symmetrical.

The standard deviation of the chi-square distribution is twice the mean.

The mean and the median of the chi-square distribution are the same if \(df = 24\).

11.3: Goodness-of-Fit Test

For each problem, use a solution sheet to solve the hypothesis test problem. Go to [link] for the chi-square solution sheet. Round expected frequency to two decimal places.

A six-sided die is rolled 120 times. Fill in the expected frequency column. Then, conduct a hypothesis test to determine if the die is fair. The data in Table are the result of the 120 rolls.
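
The computation this exercise asks for can be sketched in Python. The observed counts below are hypothetical, since the textbook's table is not reproduced here; under the null hypothesis of a fair die, each face is expected 120/6 = 20 times.

```python
# Goodness-of-fit sketch for the fair-die problem (hypothetical counts).
observed = [25, 17, 15, 23, 24, 16]          # hypothetical rolls, sum = 120
expected = [sum(observed) / 6] * 6           # 20.0 per face under H0

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                       # 6 categories - 1 = 5

print(f"chi-square = {chi_sq:.2f}, df = {df}")
# Compare against the table value for df = 5 at alpha = 0.05, which is 11.070.
print("reject H0" if chi_sq > 11.070 else "do not reject H0")
```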

The marital status distribution of the U.S. male population, ages 15 and older, is as shown in Table.

Suppose that a random sample of 400 U.S. young adult males, 18 to 24 years old, yielded the following frequency distribution. We are interested in whether this age group of males fits the distribution of the U.S. adult population. Calculate the frequency one would expect when surveying 400 people. Fill in Table, rounding to two decimal places.

  • \(H_{0}\): The data fits the distribution.
  • \(H_{a}\): The data does not fit the distribution.
  • chi-square distribution with \(df = 3\)
  • Check student’s solution.
  • \(\alpha = 0.05\)
  • Decision: Reject null
  • Reason for decision: \(p\text{-value} < \alpha\)
  • Conclusion: Data does not fit the distribution.

Use the following information to answer the next two exercises: The columns in Table contain the Race/Ethnicity of U.S. Public Schools for a recent year, the percentages for the Advanced Placement Examinee Population for that class, and the Overall Student Population. Suppose the right column contains the result of a survey of 1,000 local students from that year who took an AP Exam.

Perform a goodness-of-fit test to determine whether the local results follow the distribution of the U.S. overall student population based on ethnicity.

Perform a goodness-of-fit test to determine whether the local results follow the distribution of U.S. AP examinee population, based on ethnicity.

  • \(H_{0}\): The local results follow the distribution of the U.S. AP examinee population
  • \(H_{a}\): The local results do not follow the distribution of the U.S. AP examinee population
  • chi-square distribution with \(df = 5\)
  • chi-square test statistic = 13.4
  • \(p\text{-value} = 0.0199\)

The City of South Lake Tahoe, CA, has an Asian population of 1,419 people, out of a total population of 23,609. Suppose that a survey of 1,419 self-reported Asians in the Manhattan, NY, area yielded the data in Table. Conduct a goodness-of-fit test to determine if the self-reported sub-groups of Asians in the Manhattan area fit that of the Lake Tahoe area.

Use the following information to answer the next two exercises: UCLA conducted a survey of more than 263,000 college freshmen from 385 colleges in fall 2005. The results of students' expected majors by gender were reported in The Chronicle of Higher Education (2/2/2006) . Suppose a survey of 5,000 graduating females and 5,000 graduating males was done as a follow-up last year to determine what their actual majors were. The results are shown in the tables for Exercise and Exercise. The second column in each table does not add to 100% because of rounding.

Conduct a goodness-of-fit test to determine if the actual college majors of graduating females fit the distribution of their expected majors.

  • \(H_{0}\): The actual college majors of graduating females fit the distribution of their expected majors
  • \(H_{a}\): The actual college majors of graduating females do not fit the distribution of their expected majors
  • \(df = 10\)
  • chi-square distribution with \(df = 10\)
  • \(\text{test statistic} = 11.48\)
  • \(p\text{-value} = 0.3211\)
  • Decision: Do not reject null when \(\alpha = 0.05\) and \(\alpha = 0.01\)
  • Reason for decision: \(p\text{-value} > \alpha\)
  • Conclusion: There is insufficient evidence to conclude that the distribution of actual college majors of graduating females fits the distribution of their expected majors.

Conduct a goodness-of-fit test to determine if the actual college majors of graduating males fit the distribution of their expected majors.

Read the statement and decide whether it is true or false.

In a goodness-of-fit test, the expected values are the values we would expect if the null hypothesis were true.

In general, if the observed values and expected values of a goodness-of-fit test are not close together, then the test statistic can get very large and on a graph will be way out in the right tail.

Use a goodness-of-fit test to determine if high school principals believe that students are absent equally during the week or not.

The test to use to determine if a six-sided die is fair is a goodness-of-fit test.

In a goodness-of-fit test, if the \(p\text{-value}\) is 0.0113, in general, do not reject the null hypothesis.

A sample of 212 commercial businesses was surveyed for recycling one commodity; a commodity here means any one type of recyclable material such as plastic or aluminum. Table shows the business categories in the survey, the sample size of each category, and the number of businesses in each category that recycle one commodity. Based on the study, on average half of the businesses were expected to be recycling one commodity. As a result, the last column shows the expected number of businesses in each category that recycle one commodity. At the 5% significance level, perform a hypothesis test to determine if the observed number of businesses that recycle one commodity follows the uniform distribution of the expected values.

Table contains information from a survey among 499 participants classified according to their age groups. The second column shows the percentage of obese people per age class among the study participants. The last column comes from a different study at the national level that shows the corresponding percentages of obese people in the same age classes in the USA. Perform a hypothesis test at the 5% significance level to determine whether the survey participants are a representative sample of the USA obese population.

  • \(H_{0}\): Surveyed obese fit the distribution of expected obese
  • \(H_{a}\): Surveyed obese do not fit the distribution of expected obese
  • chi-square distribution with \(df = 4\)
  • \(\text{test statistic} = 54.01\)
  • \(p\text{-value} = 0\)
  • \(\alpha: 0.05\)
  • Decision: Reject the null hypothesis.
  • Conclusion: At the 5% level of significance, from the data, there is sufficient evidence to conclude that the surveyed obese do not fit the distribution of expected obese.

11.4: Test of Independence

For each problem, use a solution sheet to solve the hypothesis test problem. Go to Appendix E for the chi-square solution sheet. Round expected frequency to two decimal places.

A recent debate about where in the United States skiers believe the skiing is best prompted the following survey. Test to see if the best ski area is independent of the level of the skier.

Car manufacturers are interested in whether there is a relationship between the size of car an individual drives and the number of people in the driver’s family (that is, whether car size and family size are independent). To test this, suppose that 800 car owners were randomly surveyed with the results in Table. Conduct a test of independence.

  • \(H_{0}\): Car size is independent of family size.
  • \(H_{a}\): Car size is dependent on family size.
  • chi-square distribution with \(df = 9\)
  • \(\text{test statistic} = 15.8284\)
  • \(p\text{-value} = 0.0706\)
  • Decision: Do not reject the null hypothesis.
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that car size and family size are dependent.
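
The computation behind a test of independence like the one above can be sketched in Python. The 2×3 table here is hypothetical, since the survey tables themselves are not reproduced in this excerpt.

```python
# Test-of-independence computation on a hypothetical 2x3 contingency table.
observed = [[30, 20, 10],
            [20, 30, 10]]

row_totals = [sum(row) for row in observed]            # [60, 60]
col_totals = [sum(col) for col in zip(*observed)]      # [50, 50, 20]
grand_total = sum(row_totals)                          # 120

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected count
        chi_sq += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)      # (2-1)(3-1) = 2
print(f"chi-square = {chi_sq:.2f}, df = {df}")
# Compare to the df = 2 table value at alpha = 0.05, which is 5.991.
print("reject H0" if chi_sq > 5.991 else "do not reject H0")
```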

College students may be interested in whether or not their majors have any effect on starting salaries after graduation. Suppose that 300 recent graduates were surveyed as to their majors in college and their starting salaries after graduation. Table shows the data. Conduct a test of independence.

Some travel agents claim that honeymoon hot spots vary according to age of the bride. Suppose that 280 recent brides were interviewed as to where they spent their honeymoons. The information is given in Table. Conduct a test of independence.

  • \(H_{0}\): Honeymoon locations are independent of bride’s age.
  • \(H_{a}\): Honeymoon locations are dependent on bride’s age.
  • \(\text{test statistic} = 15.7027\)
  • \(p\text{-value} = 0.0734\)
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that honeymoon location and bride age are dependent.

A manager of a sports club keeps information concerning the main sport in which members participate and their ages. To test whether there is a relationship between the age of a member and his or her choice of sport, 643 members of the sports club are randomly selected. Conduct a test of independence.

A major food manufacturer is concerned that the sales for its skinny french fries have been decreasing. As a part of a feasibility study, the company conducts research into the types of fries sold across the country to determine if the type of fries sold is independent of the area of the country. The results of the study are shown in Table. Conduct a test of independence.

  • \(H_{0}\): The types of fries sold are independent of the location.
  • \(H_{a}\): The types of fries sold are dependent on the location.
  • chi-square distribution with \(df = 6\)
  • \(\text{test statistic} =18.8369\)
  • \(p\text{-value} = 0.0044\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that types of fries and location are dependent.

According to Dan Lenard, an independent insurance agent in the Buffalo, N.Y. area, the following is a breakdown of the amount of life insurance purchased by males in the following age groups. He is interested in whether the age of the male and the amount of life insurance purchased are independent events. Conduct a test for independence.

Suppose that 600 thirty-year-olds were surveyed to determine whether or not there is a relationship between the level of education an individual has and salary. Conduct a test of independence.

  • \(H_{0}\): Salary is independent of level of education.
  • \(H_{a}\): Salary is dependent on level of education.
  • \(df = 12\)
  • chi-square distribution with \(df = 12\)
  • \(\text{test statistic} = 255.7704\)

The number of degrees of freedom for a test of independence is equal to the sample size minus one.

The test for independence uses tables of observed and expected data values.

The test to use when determining if the college or university a student chooses to attend is related to his or her socioeconomic status is a test for independence.

In a test of independence, the expected number is equal to the row total multiplied by the column total divided by the total surveyed.
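
That expected-count formula can be written out directly; the 2×2 table below is hypothetical, chosen so every expected count works out to a round number.

```python
# Expected cell counts from the marginals:
# expected = (row total x column total) / grand total.
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]            # [50, 50]
col_totals = [sum(col) for col in zip(*observed)]      # [50, 50]
grand_total = sum(row_totals)                          # 100

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
print(expected)   # every cell expects 50 * 50 / 100 = 25.0
```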

An ice cream maker performs a nationwide survey about favorite flavors of ice cream in different geographic areas of the U.S. Based on Table, do the numbers suggest that geographic location is independent of favorite ice cream flavors? Test at the 5% significance level.

Table provides a recent survey of the youngest online entrepreneurs whose net worth is estimated at one million dollars or more. Their ages range from 17 to 30. Each cell in the table illustrates the number of entrepreneurs who correspond to the specific age group and their net worth. Are the ages and net worth independent? Perform a test of independence at the 5% significance level.

  • \(H_{0}\): Age is independent of the youngest online entrepreneurs’ net worth.
  • \(H_{5}\): Age is dependent on the net worth of the youngest online entrepreneurs.
  • chi-square distribution with \(df = 2\)
  • \(\text{test statistic} = 1.76\)
  • \(p\text{-value} = 0.4144\)
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that age and net worth for the youngest online entrepreneurs are dependent.

A 2013 poll in California surveyed people about taxing sugar-sweetened beverages. The results are presented in Table, and are classified by ethnic group and response type. Are the poll responses independent of the participants’ ethnic group? Conduct a test of independence at the 5% significance level.

11.5: Test for Homogeneity

For each word problem, use a solution sheet to solve the hypothesis test problem. Go to [link] for the chi-square solution sheet. Round expected frequency to two decimal places.

A psychologist is interested in testing whether there is a difference in the distribution of personality types for business majors and social science majors. The results of the study are shown in Table. Conduct a test of homogeneity. Test at a 5% level of significance.

  • \(H_{0}\): The distribution for personality types is the same for both majors
  • \(H_{a}\): The distribution for personality types is not the same for both majors
  • chi-square with \(df = 4\)
  • \(\text{test statistic} = 3.01\)
  • \(p\text{-value} = 0.5568\)
  • Conclusion: There is insufficient evidence to conclude that the distribution of personality types is different for business and social science majors.

Do men and women select different breakfasts? The breakfasts ordered by randomly selected men and women at a popular breakfast place are shown in Table. Conduct a test for homogeneity at a 5% level of significance.

A fisherman is interested in whether the distribution of fish caught in Green Valley Lake is the same as the distribution of fish caught in Echo Lake. Of the 191 randomly selected fish caught in Green Valley Lake, 105 were rainbow trout, 27 were other trout, 35 were bass, and 24 were catfish. Of the 293 randomly selected fish caught in Echo Lake, 115 were rainbow trout, 58 were other trout, 67 were bass, and 53 were catfish. Perform a test for homogeneity at a 5% level of significance.

  • \(H_{0}\): The distribution for fish caught is the same in Green Valley Lake and in Echo Lake.
  • \(H_{a}\): The distribution for fish caught is not the same in Green Valley Lake and in Echo Lake.
  • chi-square with \(df = 3\)
  • \(\text{test statistic} = 11.75\)
  • \(p\text{-value} = 0.0083\)
  • Conclusion: There is evidence to conclude that the distribution of fish caught is different in Green Valley Lake and in Echo Lake.
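
The statistic in the answer above can be reproduced directly from the fish counts given in the problem statement.

```python
# Recomputing the homogeneity statistic from the fish counts in the problem:
# rainbow trout, other trout, bass, catfish, per lake.
observed = [[105, 27, 35, 24],    # Green Valley Lake (n = 191)
            [115, 58, 67, 53]]    # Echo Lake (n = 293)

row_totals = [sum(row) for row in observed]            # [191, 293]
col_totals = [sum(col) for col in zip(*observed)]      # [220, 85, 102, 77]
grand_total = sum(row_totals)                          # 484

chi_sq = 0.0
for i in range(len(observed)):
    for j in range(len(observed[0])):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (observed[i][j] - e) ** 2 / e

print(f"chi-square = {chi_sq:.2f}")    # 11.75, matching the answer above
```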

In 2007, the United States had 1.5 million homeschooled students, according to the U.S. National Center for Education Statistics. In Table you can see that parents decide to homeschool their children for different reasons, and some reasons are ranked by parents as more important than others. According to the survey results shown in the table, is the distribution of applicable reasons the same as the distribution of the most important reason? Provide your assessment at the 5% significance level. Did you expect the result you obtained?

When looking at energy consumption, we are often interested in detecting trends over time and how they correlate among different countries. The information in Table shows the average energy use (in units of kg of oil equivalent per capita) in the USA and the joint European Union countries (EU) for the six-year period 2005 to 2010. Do the energy use values in these two areas come from the same distribution? Perform the analysis at the 5% significance level.

  • \(H_{0}\): The distribution of average energy use in the USA is the same as in Europe between 2005 and 2010.
  • \(H_{a}\): The distribution of average energy use in the USA is not the same as in Europe between 2005 and 2010.
  • \(\text{test statistic} = 2.7434\)
  • \(p\text{-value} = 0.7395\)
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the average energy use values in the US and EU were derived from different distributions for the period from 2005 to 2010.

The Insurance Institute for Highway Safety collects safety information about all types of cars every year, and publishes a report of Top Safety Picks among all cars, makes, and models. Table presents the number of Top Safety Picks in six car categories for the two years 2009 and 2013. Analyze the table data to conclude whether the distribution of cars that earned the Top Safety Picks safety award has remained the same between 2009 and 2013. Derive your results at the 5% significance level.

11.6: Comparison of the Chi-Square Tests

Is there a difference between the distribution of community college statistics students and the distribution of university statistics students in what technology they use on their homework? Of some randomly selected community college students, 43 used a computer, 102 used a calculator with built in statistics functions, and 65 used a table from the textbook. Of some randomly selected university students, 28 used a computer, 33 used a calculator with built in statistics functions, and 40 used a table from the textbook. Conduct an appropriate hypothesis test using a 0.05 level of significance.

  • \(H_{0}\): The distribution for technology use is the same for community college students and university students.
  • \(H_{a}\): The distribution for technology use is not the same for community college students and university students.
  • chi-square with \(df = 2\)
  • \(\text{test statistic} = 7.05\)
  • \(p\text{-value} = 0.0294\)
  • Conclusion: There is sufficient evidence to conclude that the distribution of technology use for statistics homework is not the same for statistics students at community colleges and at universities.

If \(df = 2\), the chi-square distribution has a shape that reminds us of the exponential.

11.7: Test of a Single Variance

Use the following information to answer the next twelve exercises: Suppose an airline claims that its flights are consistently on time with an average delay of at most 15 minutes. It claims that the average delay is so consistent that the variance is no more than 150 minutes. Doubting the consistency part of the claim, a disgruntled traveler calculates the delays for his next 25 flights. The average delay for those 25 flights is 22 minutes with a standard deviation of 15 minutes.

Is the traveler disputing the claim about the average or about the variance?

A sample standard deviation of 15 minutes is the same as a sample variance of __________ minutes.

Is this a right-tailed, left-tailed, or two-tailed test?

\(H_{0}\): __________

\(H_{0}: \sigma^{2} \leq 150\)

\(df =\) ________

chi-square test statistic = ________

\(p\text{-value} =\) ________

Graph the situation. Label and scale the horizontal axis. Mark the mean and test statistic. Shade the \(p\text{-value}\).

Let \(\alpha = 0.05\)

Decision: ________

Conclusion (write out in a complete sentence.): ________

How did you know to test the variance instead of the mean?

The claim is that the variance is no more than 150 minutes.

If an additional test were done on the claim of the average delay, which distribution would you use?

If an additional test were done on the claim of the average delay, but 45 flights were surveyed, which distribution would you use?

a Student's \(t\)- or normal distribution

A plant manager is concerned her equipment may need recalibrating. It seems that the actual weight of the 15 oz. cereal boxes it fills has been fluctuating. The standard deviation should be at most 0.5 oz. In order to determine if the machine needs to be recalibrated, 84 randomly selected boxes of cereal from the next day’s production were weighed. The standard deviation of the 84 boxes was 0.54. Does the machine need to be recalibrated?

Consumers may be interested in whether the cost of a particular calculator varies from store to store. Based on surveying 43 stores, which yielded a sample mean of $84 and a sample standard deviation of $12, test the claim that the standard deviation is greater than $15.

  • \(H_{0}: \sigma = 15\)
  • \(H_{a}: \sigma > 15\)
  • \(df = 42\)
  • chi-square with \(df = 42\)
  • test statistic = 26.88
  • \(p\text{-value} = 0.9663\)
  • \(\alpha = 0.05\)
  • Decision: Do not reject null hypothesis.
  • Conclusion: There is insufficient evidence to conclude that the standard deviation is greater than 15.

Isabella, an accomplished Bay to Breakers runner, claims that the standard deviation for her time to run the 7.5 mile race is at most three minutes. To test her claim, Rupinder looks up five of her race times. They are 55 minutes, 61 minutes, 58 minutes, 63 minutes, and 57 minutes.

Airline companies are interested in the consistency of the number of babies on each flight, so that they have adequate safety equipment. They are also interested in the variation of the number of babies. Suppose that an airline executive believes the average number of babies on flights is six with a variance of nine at most. The airline conducts a survey. The results of the 18 flights surveyed give a sample average of 6.4 with a sample standard deviation of 3.9. Conduct a hypothesis test of the airline executive’s belief.

  • \(H_{0}: \sigma \leq 3\)
  • \(H_{a}: \sigma > 3\)
  • \(df = 17\)
  • chi-square distribution with \(df = 17\)
  • test statistic = 28.73
  • \(p\text{-value} = 0.0371\)
  • Conclusion: There is sufficient evidence to conclude that the standard deviation is greater than three.

The number of births per woman in China is 1.6 down from 5.91 in 1966. This fertility rate has been attributed to the law passed in 1979 restricting births to one per woman. Suppose that a group of students studied whether or not the standard deviation of births per woman was greater than 0.75. They asked 50 women across China the number of births they had had. The results are shown in Table. Does the students’ survey indicate that the standard deviation is greater than 0.75?

According to an avid aquarist, the average number of fish in a 20-gallon tank is 10, with a standard deviation of two. His friend, also an aquarist, does not believe that the standard deviation is two. She counts the number of fish in 15 other 20-gallon tanks. Based on the results that follow, do you think that the standard deviation is different from two? Data: 11; 10; 9; 10; 10; 11; 11; 10; 12; 9; 7; 9; 11; 10; 11

  • \(H_{0}: \sigma = 2\)
  • \(H_{a}: \sigma \neq 2\)
  • \(df = 14\)
  • chi-square distribution with \(df = 14\)
  • chi-square test statistic = 5.2094
  • \(p\text{-value} = 0.0346\)
  • Decision: Reject the null hypothesis
  • Conclusion: There is sufficient evidence to conclude that the standard deviation is different from 2.
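
The answer above can be checked against the fish-count data in the problem using the single-variance statistic χ² = (n − 1)s²/σ₀². Computing from the raw data gives 5.23; the listed 5.2094 appears to come from rounding the sample standard deviation to 1.22 before squaring, and either way the statistic falls in the lower rejection region.

```python
# Recomputing chi^2 = (n - 1) * s^2 / sigma0^2 from the fish-count data.
import statistics

data = [11, 10, 9, 10, 10, 11, 11, 10, 12, 9, 7, 9, 11, 10, 11]
n = len(data)
s2 = statistics.variance(data)     # sample variance, ~1.495
sigma0 = 2                         # H0: sigma = 2

chi_sq = (n - 1) * s2 / sigma0 ** 2
print(f"chi-square = {chi_sq:.4f}, df = {n - 1}")
# Two-tailed test at alpha = 0.05 with df = 14: the table values are
# 5.629 (lower) and 26.119 (upper); the statistic falls below 5.629.
print("reject H0" if chi_sq < 5.629 or chi_sq > 26.119 else "do not reject H0")
```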

The manager of "Frenchies" is concerned that patrons are not consistently receiving the same amount of French fries with each order. The chef claims that the standard deviation for a ten-ounce order of fries is at most 1.5 oz., but the manager thinks that it may be higher. He randomly weighs 49 orders of fries, which yields a mean of 11 oz. and a standard deviation of two oz.

You want to buy a specific computer. A sales representative of the manufacturer claims that retail stores sell this computer at an average price of $1,249 with a very narrow standard deviation of $25. You find a website that has a price comparison for the same computer at a series of stores as follows: $1,299; $1,229.99; $1,193.08; $1,279; $1,224.95; $1,229.99; $1,269.95; $1,249. Can you argue that pricing has a larger standard deviation than claimed by the manufacturer? Use the 5% significance level. As a potential buyer, what would be the practical conclusion from your analysis?

  • \(H_{0}: \sigma^{2} = 25^{2}\)
  • \(H_{a}: \sigma^{2} > 25^{2}\)
  • \(df = n - 1 = 7\)
  • test statistic: \(\chi^{2} = \chi^{2}_{7} = \frac{(n-1)s^{2}}{25^{2}} = \frac{(8-1)(34.29)^{2}}{25^{2}} = 13.169\)
  • \(p\text{-value}: P(\chi^{2}_{7} > 13.169) = 1- P(\chi^{2}_{7} \leq 13.169) = 0.0681\)
  • Decision: Do not reject the null hypothesis
  • Conclusion: At the 5% level, there is insufficient evidence to conclude that the variance is more than 625.
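
The test statistic in the answer above can be reproduced from the eight listed prices.

```python
# Reproducing chi^2 = (n - 1) * s^2 / 25^2 from the listed prices.
import statistics

prices = [1299, 1229.99, 1193.08, 1279, 1224.95, 1229.99, 1269.95, 1249]
n = len(prices)                    # 8, so df = 7
s = statistics.stdev(prices)       # sample standard deviation, ~34.29
chi_sq = (n - 1) * s ** 2 / 25 ** 2
print(f"s = {s:.2f}, chi-square = {chi_sq:.3f}, df = {n - 1}")   # 13.169
```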

A company packages apples by weight. One of the weight grades is Class A apples. Class A apples have a mean weight of 150 g, and there is a maximum allowed weight tolerance of 5% above or below the mean for apples in the same consumer package. A batch of apples is selected to be included in a Class A apple package. Given the following apple weights of the batch, does the fruit comply with the Class A grade weight tolerance requirements? Conduct an appropriate hypothesis test.

  • at the 5% significance level
  • at the 1% significance level

Weights in selected apple batch (in grams): 158; 167; 149; 169; 164; 139; 154; 150; 157; 171; 152; 161; 141; 166; 172;

11.8: Lab 1: Chi-Square Goodness-of-Fit

11.9: Lab 2: Chi-Square Test of Independence

Chi-Square Test

Handling data often involves testing hypotheses to extract useful information, and in categorical analysis the chi-square test is used to determine whether observed patterns are likely to have arisen purely by chance. This section covers the concepts and definitions behind the chi-square test and the procedure for carrying one out correctly.

What Is a Chi-Square Test?

A chi-square test, or χ² test, assesses whether two categorical variables are associated. For example, suppose we record people’s favourite colours and their ice cream preferences. The test tells us whether these two variables are related, for instance whether individuals who prefer the colour blue also tend to favour chocolate ice cream. It does this by checking how far the observed data deviate from what would be expected if no association existed at all; a large deviation is evidence of an association.

When you toss a fair coin repeatedly, you expect heads and tails to appear in roughly equal measure. If you toss it many times and get far more heads than tails, a chi-square test can tell you whether that imbalance is unlikely to be due to mere chance. Essentially, the test compares two sets of figures: the observed frequencies (what actually happened) and the expected frequencies (what should have occurred by chance; for a fair coin, heads and tails each about fifty per cent of the time). It helps you check whether what you are seeing reflects something real or is simply random luck.
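
The coin example can be made concrete in a few lines of Python; the count of 62 heads in 100 flips is hypothetical.

```python
# Is 62 heads in 100 flips consistent with a fair coin?
observed = [62, 38]                # heads, tails (hypothetical)
expected = [50, 50]                # fair-coin expectation

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi_sq:.2f}")
# df = 1; the 0.05 critical value is 3.841, so this result is unlikely
# to be pure chance.
print("reject H0" if chi_sq > 3.841 else "fail to reject H0")
```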

Why Chi-Square Tests Matter

Chi-square tests are important in fields as varied as marketing, biology, medicine and the social sciences. They are valuable for several reasons:

  • Revealing Associations: Chi-square tests help researchers identify significant relationships between different categories, aiding in the understanding of causation and prediction.
  • Validating Assumptions: Chi-square tests check if your observed data matches what you expected. This helps you know if your ideas are on track or if you need to reconsider them.
  • Data-Driven Decisions: Chi-square tests validate our beliefs based on empirical evidence and boost confidence in our inferences.

Formula For Chi-Square Test

χ² = Σ (Oᵢ – Eᵢ)² / Eᵢ

Symbols are broken down as follows:

  • Σ (sigma): Summation; the term is computed for each cell of the contingency table and the results are added together.
  • Oᵢ: The observed frequency, i.e. the actual count in a given cell of the contingency table.
  • Eᵢ: The expected frequency, i.e. the count you would expect in that cell if the null hypothesis of no association were true.
  • (Oᵢ – Eᵢ): The difference between the observed and expected frequencies for a cell.

Steps for Chi-Square Test

The steps of a chi-square test are as follows:

Step 1: Define Hypothesis

  • Null Hypothesis (H₀): Assumes that there is no relationship between the two categorical variables under study; any differences or patterns observed are the result of random chance. Stating this hypothesis explicitly protects the analysis from prejudging the outcome.
  • Alternative Hypothesis (H₁): Suggests that there is a relationship between the two categorical variables, i.e. that the observed association is real rather than mere coincidence.

Step 2: Gather and Organize Data

Gather Information about the Two Category Variables:

Before performing a chi-square test, you should have on hand information about two categorical variables you wish to observe. As an example, in case one wishes look into how sex influences which type of ice-cream a person will choose- it would mean knowing the specific choice they would go for whether it is chocolate or strawberry among others besides their gender which implies both pieces of data have been collected already.

  • Before conducting a chi-square test, it is necessary to get data on two categorical variables you want to analyze. For instance, if you are interested in exploring the relationship between gender and preferred ice cream flavors, then you must collect details on people’s sex (male or female) and their best flavors (e.g., chocolate, vanilla, strawberry).
  • Once this information is collected, it can be inserted into a contingency table.
  • When one is investigating the use of two related variables, it is necessary to use a contingency table to capture all combinations they can possibly be combined in. In this table, the values of one variable show up in the columns across, while values of another variable show up in rows. For instance, one can use it to determine how many females liked diet coke/vanilla flavored ice cream.

Suppose the hypothesis is that men tend to prefer vanilla while women tend to prefer chocolate. We would then record how many male respondents chose vanilla and how many female respondents chose chocolate.

Here’s an example of what a contingency table might look like (the counts are illustrative):

            Chocolate   Vanilla   Strawberry
  Male          20         30         10
  Female        35         15         10

In this table:

  • The table has two dimensions: gender and ice cream flavor. The row headings are the male and female categories, and the column headings are the chocolate, vanilla and strawberry flavors. Each cell contains the count for one combination of categories. Conducting a chi-square test on this table examines the association between the two categorical variables.
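The counting step behind such a table can be sketched in Python; the respondent data below is hypothetical and only illustrates the mechanics:

```python
from collections import Counter

# Hypothetical raw survey responses: (gender, flavor) pairs
responses = [
    ("Male", "Chocolate"), ("Male", "Vanilla"), ("Female", "Chocolate"),
    ("Female", "Strawberry"), ("Male", "Vanilla"), ("Female", "Chocolate"),
]

# Count how often each (gender, flavor) combination occurs
counts = Counter(responses)

genders = ["Male", "Female"]
flavors = ["Chocolate", "Vanilla", "Strawberry"]

# Rows = genders, columns = flavors; each cell is a count
table = [[counts[(g, f)] for f in flavors] for g in genders]
print(table)  # [[1, 2, 0], [2, 0, 1]]
```

Every respondent lands in exactly one cell, so the cell counts always sum to the number of observations.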

Step 3: Calculate Expected Frequencies

  • Expected Frequency: For any cell, the expected frequency is the count we would expect to see if the two variables were independent.
  • Expected Frequency Calculation: To compute the expected frequency for a cell, multiply that cell's row total by its column total, then divide by the total number of observations in the table: E = (row total × column total) / grand total.
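As a sketch, the expected counts for a whole table can be computed in a few lines of Python (the observed counts here are hypothetical):

```python
# Hypothetical observed contingency table: rows = genders, columns = flavors
observed = [
    [20, 30, 10],  # Male
    [35, 15, 10],  # Female
]

row_totals = [sum(row) for row in observed]        # [60, 60]
col_totals = [sum(col) for col in zip(*observed)]  # [55, 45, 20]
grand_total = sum(row_totals)                      # 120

# Expected count under independence: E = row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
print(expected[0][0])  # Male/Chocolate cell: 60 * 55 / 120 = 27.5
```

A quick sanity check: the expected counts always sum to the same grand total as the observed counts.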

Step 4: Perform Chi-Square Test

Use the chi-square formula:

χ² = Σ (Oᵢ − Eᵢ)² / Eᵢ

where Oᵢ is the observed frequency in cell i and Eᵢ is the corresponding expected frequency.
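Given matching tables of observed and expected counts, the sum runs over every cell; a minimal Python sketch (the counts are hypothetical, with expected values computed as row total × column total / grand total):

```python
# Hypothetical observed counts and their expected counts under independence
observed = [[20, 30, 10], [35, 15, 10]]
expected = [[27.5, 22.5, 10.0], [27.5, 22.5, 10.0]]

# Chi-square statistic: sum over all cells of (O - E)^2 / E
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
print(round(chi_square, 2))  # 9.09
```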

Step 5: Determine Degrees of Freedom (df)

df = (number of rows – 1) × (number of columns – 1)

Step 6: Find p-value

  • Use a chi-square distribution table to find the p-value for the calculated chi-square statistic (χ²) at the degrees of freedom (df) found in Step 5. The table lists tail probabilities for various values of the chi-square statistic at different degrees of freedom.
  • The p-value is the probability of observing a chi-square statistic at least as large as the one calculated, assuming the null hypothesis is true, that is, assuming there is no association between the variables.
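A chi-square table can be spot-checked in code. For an even number of degrees of freedom the upper-tail probability has a simple closed form; the sketch below uses it (in practice one would typically call a statistics library such as `scipy.stats.chi2.sf` instead):

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square distribution,
    valid only when df is a positive even integer."""
    assert df > 0 and df % 2 == 0
    k = df // 2
    # P(X >= x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    return math.exp(-x / 2) * sum((x / 2) ** j / math.factorial(j) for j in range(k))

# The familiar df = 4 critical value 9.488 should sit right at p = 0.05
print(round(chi2_sf_even_df(9.488, 4), 3))  # 0.05
```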

Step 7: Interpret Results

  • If the p-value is less than the chosen significance level α (e.g., α = 0.05), we reject the null hypothesis and conclude that there is a statistically significant association between the categorical variables.
  • If the p-value is greater than α, we fail to reject the null hypothesis: there is insufficient evidence of an association between the variables.

Addressing Assumptions and Considerations

  • Chi-square tests assume that the observations are independent of one another.
  • Each cell of the table should have an expected count of at least five for the results to be reliable. If any expected count is below five, consider Fisher’s exact test as an alternative.
  • Chi-square tests do not indicate a causal relationship but they identify association between variables.
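The expected-count rule of thumb from the list above is easy to check programmatically; a small sketch (the helper name and example tables are our own):

```python
def expected_counts_ok(expected, min_expected=5):
    """Return True if every expected cell count meets the common
    rule-of-thumb minimum (5) for a chi-square test."""
    return all(e >= min_expected for row in expected for e in row)

# Hypothetical expected tables
print(expected_counts_ok([[27.5, 22.5, 10.0], [27.5, 22.5, 10.0]]))  # True
print(expected_counts_ok([[8.0, 2.0], [4.0, 6.0]]))  # False: consider Fisher's exact test
```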

What are Categorical Variables?

  • Categorical variables are like sorting things into different groups. But instead of using numbers, we’re talking about categories or labels. For example, colors, types of fruit, or types of cars are all categorical variables.
  • They’re termed “categorical” because they sort observations into separate groups such as “red,” “green” or “blue.” Unlike height or weight, which are measured on a continuous scale, categorical data consists of distinct options with no numerical order between them. If you ask whether someone prefers apples or oranges, you are collecting categorical data.

Characteristics of Categorical Variables

  • Distinct Groups : Categorical variables put things into different groups that don’t overlap. For example, when we talk about hair color, someone can be a redhead, black-haired, blonde, or brunette. Each person falls into just one of these groups.
  • Non-Numerical: The categories are names, not quantities, and there is no inherent ranking among them. It makes no sense to say that blonde hair is “greater than” brunette hair; the categories are merely different.
  • Limited Options : Categorical variables are characterized by a fixed number of possibilities. One may have such choices as red, blonde, brown, black hair color. The number of categories may fluctuate, but they all remain distinct and bounded in scope.

Goodness-Of-Fit

A goodness-of-fit test is used to determine whether a model or hypothesized distribution is consistent with collected data. Suppose your hypothesis is: “People who live in urban areas are taller than those from rural areas.” After collecting data on people’s heights and comparing it with the hypothesis’ prediction, close agreement between the two gives grounds for believing the prediction is correct; if such agreement does not exist, the hypothesis may need to be rethought. The goodness-of-fit test formalizes this comparison.

Key Aspects of a Goodness-of-Fit Test

1. Purpose: To check whether a hypothesized distribution fits the observed data well.

2. Data Requirements: It can be used with both continuous and categorical data.

3. Common Applications:

  • Checking whether a set of values appears to be drawn from a specified theoretical distribution.
  • Comparing the observed frequencies of outcomes with their expected frequencies, as in the chi-square test.
  • Evaluating how closely a set of data points follows a fitted line or curve, as in regression analysis.

4. Benefits:

  • Gives us a way to check if our ideas match up with real data.
  • Helps us spot any weird or unusual data that might cause problems.

5. Limitations:

  • Different tests are more effective in different situations.
  • Results can be affected by the choice of test and by the number of data points.

Types of Goodness-of-Fit Tests

  • Chi-Square Test: It is mainly used for categorical data and helps in comparing the observed frequencies of classes with their expected frequencies based on a theoretical model.
  • Kolmogorov-Smirnov Test: Compares the empirical cumulative distribution of the data against the cumulative distribution of an expected model, without assumptions about the model’s form beyond the hypothesized distribution itself. It applies to continuous data (and, with care, discrete data). This makes it useful in social science research, where we often lack grounds for preferring particular alternative hypotheses in advance.
  • Anderson-Darling Test: A nonparametric test based on the weighted squared differences between the observed and hypothesized cumulative distributions, with extra weight placed on the tails. It is usually more sensitive than the Kolmogorov-Smirnov test to discrepancies in the tails of the distribution.
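To make the Kolmogorov-Smirnov idea concrete: the statistic is simply the largest gap between the sample's empirical CDF and the hypothesized CDF. A sketch, tested against the Uniform(0, 1) distribution with a hypothetical sample:

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the maximum distance
    between the empirical CDF of the sample and the hypothesized CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides
        d = max(d, i / n - cdf(x), cdf(x) - (i - 1) / n)
    return d

# Hypothetical sample tested against Uniform(0, 1), whose CDF is F(x) = x
print(ks_statistic([0.1, 0.4, 0.5, 0.9], lambda x: x))  # 0.25
```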

Solved Examples on Chi-Square Test

Example 1: A study investigates the relationship between eye color (blue, brown, green) and hair color (blonde, brunette, redhead). The following data is collected:

Calculate the chi-square contribution of each cell in the contingency table using the formula χ² = (Oᵢ − Eᵢ)² / Eᵢ. For instance, for the cell of people with brown hair and blue eyes: χ² = (15 − 28.1)² / 28.1 ≈ 6.11. To obtain the total chi-square statistic, compute each cell’s contribution and sum across all nine cells in the table.

Degrees of Freedom (df): df = (number of rows − 1) × (number of columns − 1) = (3 − 1) × (3 − 1) = 2 × 2 = 4.

Finding the p-value: Reference a chi-square distribution table for the calculated statistic (χ²) at the appropriate degrees of freedom. Since most tables do not list every value, look for the closest entry and its corresponding p-value. For illustration, if your chi-square value were 20.5, the nearest entry in the table for df = 4 is 14.86, which corresponds to a p-value of 0.005; since 20.5 exceeds it, p < 0.005.

Interpreting Results: Select a level of significance (α = 0.05 is common); this is the probability of rejecting the null hypothesis when it is in fact true (a Type I error). Compare the p-value to α. If the p-value is less than the significance level (p < 0.05), we reject the null hypothesis: there is sufficient evidence of a statistically significant association between hair color and eye color. If the p-value is greater than the significance level (p > 0.05), we fail to reject the null hypothesis: based on the data at hand, we cannot conclude that there is a statistically significant association between eye and hair color.

Example 2: 100 flips of a coin are performed. The coin is fair, with an equal chance of heads and tails, according to the null hypothesis. 55 heads and 45 tails are the observed findings.

A fair coin has a 50/50 chance of landing heads or tails, so over 100 flips we expect 50 heads and 50 tails. The chi-square test compares these expected counts with the observed counts of 55 heads and 45 tails: χ² = (55 − 50)²/50 + (45 − 50)²/50 = 0.5 + 0.5 = 1.0, with df = 2 − 1 = 1. This is well below the df = 1 critical value of 3.84 at α = 0.05, so we fail to reject the null hypothesis: a 55/45 split is well within what chance alone would produce with a fair coin.
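The coin example can be checked numerically. For df = 1 the chi-square upper-tail probability reduces to the complementary error function, which Python's standard library provides (a sketch):

```python
import math

# Example 2: 100 flips, observed 55 heads and 45 tails; a fair coin expects 50/50
observed = [55, 45]
expected = [50, 50]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 1: P(X >= x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

print(chi_square)         # 1.0
print(round(p_value, 3))  # 0.317: no evidence the coin is unfair
```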

FAQs on Chi-Square Test

What is a chi-square test used for?

A chi-square test is a statistical test used to compare observed results with expected results.

What is p-value in a chi-square test?

The p-value is the area under the chi-square density curve to the right of the value of the test statistic.

What are limitations of chi-square tests?

Chi-square tests can only be applied to categorical variables. They need a large enough sample to give accurate results; if expected cell counts fall below 5, findings can be unreliable. The observations are assumed to be independent. Chi-square tests also do not show how strong an association is or in what direction it goes, and they are unsuitable for relationships involving continuous variables.

What if expected frequencies are low?

If you have a small sample or some expected counts are low, consider using Fisher’s exact test.

How to choose the appropriate level of significance (α)?

Choice of α is based on the tradeoff between minimizing Type I error (rejecting a true null hypothesis) and Type II error (failing to reject a false null hypothesis). Typical choices include α = 0.05 (5%) or α = 0.01 (1%). A lower α means researchers need a stronger statistical signal in order to reject the null hypothesis, which makes the test more conservative.


Statology

Statistics Made Easy

How to Perform a Chi-Square Test by Hand (Step-by-Step)

A Chi-Square goodness of fit test is used to determine whether or not a categorical variable follows a hypothesized distribution.

The following step-by-step example shows how to perform a Chi-Square goodness of fit test by hand.

Chi-Square Goodness of Fit Test By Hand

Suppose we believe that a certain dice is fair. In other words, we believe the dice is equally likely to land on a 1, 2, 3, 4, 5, or 6 on a given roll.

To test this, we roll it 60 times and record the number that it lands on each time. The results are as follows:

  • 1 : 8 times
  • 2 : 12 times
  • 3 : 18 times
  • 4 : 9 times
  • 5 : 7 times
  • 6 : 6 times

Use the following steps to perform a Chi-Square goodness of fit test to determine if the dice is fair.

Step 1: Define the Null and Alternative Hypotheses

  • H₀ (null): The dice is equally likely to land on each number.
  • H₁ (alternative): The dice is not equally likely to land on each number.

Step 2: Calculate the Observed and Expected Frequencies

Next, let’s create a table of observed and expected frequencies for each number on the dice:

  Number   Observed (O)   Expected (E)
  1              8              10
  2             12              10
  3             18              10
  4              9              10
  5              7              10
  6              6              10

Note : If we believe the dice is fair, this means we expect it to land on each number an equal amount of times – in this case, 10 times each. 

Step 3: Calculate the Test Statistic

The Chi-Square test statistic, X², is calculated as:

  • X² = Σ(O−E)² / E

The following table shows how to calculate this test statistic:

  Number    O     E    (O−E)²/E
  1          8    10      0.4
  2         12    10      0.4
  3         18    10      6.4
  4          9    10      0.1
  5          7    10      0.9
  6          6    10      1.6

In this case, X² turns out to be 9.8.

Step 4: Find the Critical Value

Next, we need to find the critical value in the Chi-Square distribution table that corresponds to α = .05 and df = (#categories – 1).

In this case, there are 6 categories, so we will use df = 6 – 1 = 5 .

We can see that the critical value is 11.07 .


Step 5: Reject or Fail to Reject the Null Hypothesis

Since our test statistic is less than the critical value, we fail to reject the null hypothesis. This means we do not have sufficient evidence to say that the dice is unfair.
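The whole goodness-of-fit calculation above fits in a few lines of Python (a sketch; the critical value is the table entry quoted in Step 4):

```python
# Observed roll counts for faces 1 through 6, out of 60 rolls
observed = [8, 12, 18, 9, 7, 6]
expected = [sum(observed) / 6] * 6  # a fair dice: 10 per face

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # 5

critical_value = 11.07  # chi-square table, alpha = 0.05, df = 5
print(round(chi_square, 1))         # 9.8
print(chi_square < critical_value)  # True: fail to reject the null hypothesis
```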

Additional Resources

The following resources offer additional information on the Chi-Square goodness of fit test:

Introduction to the Chi-Square Goodness of Fit Test How to Perform a Chi-Square Goodness of Fit Test in R Chi-Square Goodness of Fit Test Calculator


Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

