Lecture 37
Qi Wang, Department of Statistics
Nov 26, 2018
We want to know whether there is a relationship between two qualitative (categorical) variables. If there is no relationship, the two variables are considered independent.
We will use the crosstab table to test whether there is a relationship. Returning to our example from the previous lecture, suppose we want to do the following test:
Note: $H_0$ is called the null hypothesis, and $H_a$ is called the alternative hypothesis.
We are using the information in the crosstab table to determine whether the data supports the null or alternative hypothesis. We will conduct a hypothesis test!
$$\alpha = 1 - \textit{confidence level}$$ Typical confidence levels are 90%, 95%, and 99%, so $\alpha$ is typically 0.10, 0.05, or 0.01.
To test the null hypothesis, compare the observed cell counts with the expected cell counts calculated under the assumption that the null hypothesis is true. The Chi-square statistic, $\chi^2$, is a measure of how far the observed counts in the two-way table are from the expected counts. The formula for the statistic is $$\chi^2 = \sum{\frac{(\textit{observed cell count} - \textit{expected cell count})^2}{\textit{expected cell count} }}$$ $$\textit{expected cell count} = \frac{\textit{row total}\times \textit{column total}}{\textit{overall total}}$$ $$\textit{observed cell count} = \textit{actual cell count}$$ The sum is over all cells in the table. So to get the overall value of $\chi^2$, calculate each cell’s expected count and each cell’s partial $\chi^2$, then add all the partial $\chi^2$ values.
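The calculation above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `expected_counts` and `chi_square` and the 2×2 table of counts are hypothetical and are not taken from the lecture.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / overall total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    overall = sum(row_totals)
    return [[r * c / overall for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical observed counts (made up for illustration, not real data).
observed = [[10, 20],
            [30, 40]]

expected = expected_counts(observed)
statistic = chi_square(observed)
print(expected)
print(round(statistic, 4))
```

Each term in the sum inside `chi_square` is one cell's partial $\chi^2$, so printing those terms separately shows which cells contribute most to the overall value.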
IMPORTANT NOTE: We NEVER accept $H_0$, and we NEVER say "prove". There is always a chance that we are drawing an incorrect conclusion. Also, the decision to reject or not is ALWAYS stated in terms of the NULL hypothesis (e.g., DO NOT say "reject $H_a$").
This is considered checking the assumptions. It is very important to make this check!
Psychological and social factors can influence the survival of patients with serious diseases. One study examined the relationship between survival of patients with coronary heart disease and pet ownership. Each of 92 patients was classified by whether they had a pet and by whether they survived for one year. The researchers suspected that having a pet might be connected to patient status. Here are the data.
Patient status | Pet: NO | Pet: YES |
---|---|---|
Alive | 28 | 50 |
Dead | 11 | 3 |
Total | 39 | 53 |
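As a worked sketch of the formulas from earlier in the lecture applied to this table, assume the NO and YES columns refer to pet ownership. The row totals are $28 + 50 = 78$ (Alive) and $11 + 3 = 14$ (Dead), and the overall total is 92. The symbol $E$ for an expected cell count is notation introduced here for brevity.

$$E_{\textit{Alive, no pet}} = \frac{78 \times 39}{92} \approx 33.07 \qquad E_{\textit{Alive, pet}} = \frac{78 \times 53}{92} \approx 44.93$$
$$E_{\textit{Dead, no pet}} = \frac{14 \times 39}{92} \approx 5.93 \qquad E_{\textit{Dead, pet}} = \frac{14 \times 53}{92} \approx 8.07$$
$$\chi^2 \approx \frac{(28 - 33.07)^2}{33.07} + \frac{(50 - 44.93)^2}{44.93} + \frac{(11 - 5.93)^2}{5.93} + \frac{(3 - 8.07)^2}{8.07} \approx 0.776 + 0.571 + 4.323 + 3.181 \approx 8.85$$

The two cells in the Dead row contribute most of the total, because their expected counts are small relative to the gap between observed and expected.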