To better understand this type of hypothesis test, consider the following example.
Example
A market researcher would like to know the relation between consumers' preferred products and their income. The two preferred products are meat and fruit. Consumer income is classified into 3 categories: high, intermediate, and low. We would like to know whether food preference depends on consumer income, at a confidence level of 95 %.
The survey results are summarized in Table 1, known as a contingency table (also called a crosstab or two-way table).
| | High income | Intermediate income | Low income | Total |
|---|---|---|---|---|
| Meat product | 18 | 42 | 58 | 118 |
| Fruit product | 42 | 28 | 12 | 82 |
| Total | 60 | 70 | 70 | 200 |
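From Table 1, we know that 200 consumers participated in the research: 60 in the high income group, 70 in the intermediate income group, and 70 in the low income group. Of these, 118 prefer meat products and 82 prefer fruit products.
Hence, apart from the last column (Total) and the last row (Total), the value `x_(ij)` in a cell of the contingency table is the number of elements that have the corresponding value of the row variable and the corresponding value of the column variable. For example, the 42 in the Meat product row and the Intermediate income column is the number of consumers in the intermediate income group who prefer meat products.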
For this hypothesis test, the null hypothesis is:
Ho : All income groups have the same preference.
This means that, in every income group, the same share of customers, 59 % (118 out of 200), prefers meat products.
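Based on this hypothesis, we construct Table 2 with the expected values `e_(ij)` in the cells. These values conform to Ho: each one is the row total multiplied by the column total, divided by the grand total, for example `e_(11)=(118 xx 60)/200=35.4`.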
| | High income | Intermediate income | Low income | Total |
|---|---|---|---|---|
| Meat product | 35.4 | 41.3 | 41.3 | 118 |
| Fruit product | 24.6 | 28.7 | 28.7 | 82 |
| Total | 60 | 70 | 70 | 200 |
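The expected counts in Table 2 can be reproduced from the margins of Table 1, since under Ho each expected count is the row total multiplied by the column total, divided by the grand total. The following is a minimal Python sketch; the names `observed` and `expected_counts` are illustrative and not part of the original example.

```python
# Observed counts from Table 1
# (rows: meat, fruit; columns: high, intermediate, low income).
observed = [
    [18, 42, 58],
    [42, 28, 12],
]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(["Meat product", "Fruit product"], expected):
    # Rounded to one decimal place to match the layout of Table 2.
    print(label, [round(e, 1) for e in row])
```

Each row of expected counts sums to the same row total as in Table 1, which is a quick check that the margins are preserved.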
The test statistic for this hypothesis test is:
| `chi^2=sum_(i=1)^r sum_(j=1)^c ((x_(ij)-e_(ij))^2)/e_(ij)` | (23) |
Under Ho, this test statistic approximately follows a chi-square distribution with `nu=(c - 1)(r - 1)` degrees of freedom, in which `c` and `r` are the numbers of columns and rows of the contingency table respectively (not counting the totals). Here `c=3` and `r=2`, so `nu=2`.
This is a one-sided hypothesis test with the rejection region to the right of the critical value and `alpha=0.05`.
Hence `chi^2`*`=chi_(0.05,2)^2=5.991` (percentage point of the chi-square distribution).
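For two degrees of freedom, the critical value can be checked without a table: the chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2, so its upper-tail probability is exp(-x/2). The sketch below relies on that special case only; for any other number of degrees of freedom a table or a statistics library would be needed. The variable names are illustrative.

```python
import math

alpha = 0.05  # significance level used in the example

# Valid only for 2 degrees of freedom:
# P(X > x) = exp(-x / 2), so the critical value is x = -2 * ln(alpha).
critical_value = -2.0 * math.log(alpha)
print(round(critical_value, 3))
```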
Based on the survey data:
`chi_o^2=(18-35.4)^2/35.4+(42-41.3)^2/41.3+(58-41.3)^2/41.3+(42-24.6)^2/24.6`
`+(28-28.7)^2/28.7+(12-28.7)^2/28.7=37.359`
Because `chi_o^2>chi^2`*, we reject Ho.
The two attributes "food preference" and "income" are not independent: the preferred food depends on the income of customers.
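The whole test can be retraced numerically. The following Python sketch recomputes the observed statistic from the counts in Table 1 and compares it with the critical value. It uses only the standard library, and the closed-form critical value and p-value hold only because this example has 2 degrees of freedom; all variable names are illustrative assumptions.

```python
import math

# Observed counts from Table 1
# (rows: meat, fruit; columns: high, intermediate, low income).
observed = [
    [18, 42, 58],
    [42, 28, 12],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum over all cells of (x_ij - e_ij)^2 / e_ij,
# where e_ij = row total * column total / grand total.
chi2_observed = 0.0
for i, row in enumerate(observed):
    for j, x in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi2_observed += (x - e) ** 2 / e

dof = (len(col_totals) - 1) * (len(row_totals) - 1)

# The closed forms below are valid only for 2 degrees of freedom,
# where the upper-tail probability is exp(-x / 2).
assert dof == 2
alpha = 0.05
critical_value = -2.0 * math.log(alpha)
p_value = math.exp(-chi2_observed / 2.0)

print(f"chi2 = {chi2_observed:.3f}, critical value = {critical_value:.3f}")
print(f"p-value = {p_value:.2e}")
print("reject Ho" if chi2_observed > critical_value else "do not reject Ho")
```

Comparing the statistic with the critical value and comparing the p-value with `alpha` are equivalent decisions; the sketch prints both so they can be checked against each other.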
This web page was last updated on 03 December 2018.