Tests whether two categorical variables are independent or associated, using observed vs expected frequencies in a contingency table.
Test statistic (given in formula booklet)
\( \chi ^2_{calc} = \sum (f_o - f_e)^2 / f_e \)
Expected frequency: \( f_{e} \) = (row total \( \times \) column total) / grand total
Degrees of freedom
\( df = (rows - 1)(columns - 1) \)
Steps: (1) State H0: variables are independent. H1: variables are not independent. (2) Calculate \( \chi ^2_{calc} \) (GDC). (3) Find the \( p \)-value. (4) If \( p \) < significance level \( \to \) reject H0.
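The steps above can be sketched in plain Python to see what the GDC computes behind the scenes. This is a minimal illustration, not a replacement for the calculator: the observed table is hypothetical, and the helper names `chi2_sf` and `chi2_independence` are invented for this sketch.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    k = 2 if df % 2 == 0 else 1
    total = math.exp(-half) if k == 2 else math.erfc(math.sqrt(half))
    # ... then step up two degrees of freedom at a time.
    while k < df:
        total += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return total

def chi2_independence(observed):
    """Return (statistic, df, p-value, expected table) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    # Expected frequency = (row total x column total) / grand total
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for o_row, e_row in zip(observed, expected)
               for o, e in zip(o_row, e_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df, chi2_sf(stat, df), expected

observed = [[20, 15, 15], [10, 20, 20]]  # hypothetical 2 x 3 table
stat, df, p_value, expected = chi2_independence(observed)
all_expected_ok = all(e >= 5 for row in expected for e in row)  # the f_e >= 5 check
print(round(stat, 3), df, round(p_value, 4), all_expected_ok)
```

In an exam the GDC gives these values directly; the sketch only shows where the expected frequencies, the statistic and the degrees of freedom come from.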
GDC: Chi-Squared Test
TI-84 Plus CE
Enter the observed data into a matrix: [2ND][MATRIX] → Edit → [A]
[STAT] → TESTS → \( \chi ^2 \)-Test
Observed: [A], Expected: [B] (the calculator writes the expected frequencies to [B]) → Calculate
Read \( \chi ^2 \) and the \( p \)-value. Check [B] for expected frequencies.
TI-Nspire CX II
Store the observed counts as a matrix (Calculator page: enter the table as a matrix → store to a variable, e.g. obs)
[Menu] → Statistics → Stat Tests → \( \chi ^2 \) 2-way Test
Observed Matrix: obs → read \( \chi ^2 \) and the \( p \)-value
Casio fx-CG50
[MENU] → Statistics → TEST → CHI → 2WAY
Enter observed matrix → Execute
Read \( \chi ^2 \) and \( p \). Press [F6] to see expected frequencies.
✗
Common error: Forgetting to check the expected frequencies. All expected frequencies must be \( \geq 5 \). If any \( f_{e} < 5 \), combine categories or note the limitation. The IB expects you to check this.
Worked example: \( \chi ^2 \) test for independence
A survey records preferred sport by gender (2 × 3 table). \( \chi ^2 \)-Test on the GDC gives \( \chi ^2_{calc} = 7.84 \), \( df = (2-1)(3-1) = 2 \), \( p = 0.020 \). Test at the 5% level.
H0: sport preference is independent of gender. H1: they are not independent.
Compare: \( p = 0.020 < 0.05 \).
Reject H0: there is evidence at the 5% level that sport preference depends on gender.
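As a quick sanity check on the quoted values: when \( df = 2 \) the upper-tail area of the \( \chi ^2 \) distribution has the closed form \( e^{-\chi ^2 / 2} \), so the \( p \)-value can be reproduced without a GDC. This shortcut is specific to \( df = 2 \) and is shown only as a check.

```python
import math

chi2_calc = 7.84                      # statistic quoted in the worked example
p_value = math.exp(-chi2_calc / 2)    # upper-tail area; this shortcut holds only for df = 2
reject_h0 = p_value < 0.05            # 5% significance level
print(round(p_value, 3), reject_h0)   # prints: 0.02 True
```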
2. Chi-Squared Goodness of Fit (SL)
Tests whether observed data fits an expected distribution (uniform, given proportions, Poisson, etc.).
Test statistic (same as independence)
\( \chi ^2_{calc} = \sum (f_o - f_e)^2 / f_e \)
Degrees of freedom
\( df = k - 1 - p \)
\( k = \) number of categories, \( p = \) number of parameters estimated from the data (0 if the distribution is fully specified); this \( p \) is not the \( p \)-value
H0: The data follows the proposed distribution. H1: The data does not follow the proposed distribution.
▶
Degrees of freedom: For a fair die with 6 faces, \( df = 6 - 1 = 5 \). If you estimated the mean from the data for a Poisson fit, \( df = k - 1 - 1 = k - 2 \).
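The fair-die case can be sketched as follows. The counts are hypothetical (60 rolls), and `chi2_sf` is a helper written for this illustration that evaluates the upper tail of the \( \chi ^2 \) distribution; on the GDC the goodness-of-fit test returns the same quantities directly.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    k = 2 if df % 2 == 0 else 1
    total = math.exp(-half) if k == 2 else math.erfc(math.sqrt(half))
    while k < df:
        total += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return total

observed = [8, 12, 9, 11, 10, 10]          # hypothetical counts for faces 1..6
n = sum(observed)
expected = [n / 6] * 6                      # fair die: each face expected n/6 times
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
estimated_parameters = 0                    # nothing estimated from the data
df = len(observed) - 1 - estimated_parameters
p_value = chi2_sf(stat, df)
print(stat, df, round(p_value, 4))
```

With a \( p \)-value this large there would be no evidence against the die being fair at any usual significance level.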
HL note: The GOF test itself is SL. Only the extra rigour of grouping numerical data into classes and choosing \( df \) when parameters are estimated from the data (4.12) is HL.
Decision: compare the \( p \)-value to the significance level \( \alpha \). If \( p < \alpha \to \) reject H0 (the means differ).
GDC: 2-Sample \( t \)-Test
TI-84 Plus CE
[STAT] → TESTS → 2-SampTTest
Choose Data or Stats; set the alternative \( \mu_1 \neq \mu_2 \), \( \mu_1 < \mu_2 \) or \( \mu_1 > \mu_2 \)
Pooled: No (IB default unless told variances are equal) → Calculate → read \( t \) and \( p \)
TI-Nspire CX II
[Menu] → Statistics → Stat Tests → 2-Sample \( t \) Test
Choose Data or Stats; set the alternative tail; Pooled: No
Read \( t \) and the \( p \)-value
Casio fx-CG50
[MENU] → Statistics → TEST → t → 2-Sample
Set the alternative (\( \neq \), \( < \), \( > \)); Pooled: Off
Execute → read \( t \) and \( p \)
Worked example: \( t \)-test
Two classes sit the same test. 2-SampTTest (Pooled: No, two-tailed) gives \( t = 2.31 \), \( p = 0.028 \). Test at the 5% level whether the mean scores differ.
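H0: the two classes have the same mean score (\( \mu_1 = \mu_2 \)). H1: the mean scores differ (\( \mu_1 \neq \mu_2 \)).
Compare: \( p = 0.028 < 0.05 \).
Reject H0: there is evidence at the 5% level that the two class means differ.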
✗
Common error: Choosing the wrong tail or pooling by default. Take the tail from \( H_1 \), and use Pooled: No unless the question explicitly says the population variances are equal; the IB default is non-pooled.
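For reference, the non-pooled setting corresponds to the unequal-variance (Welch) form of the test. The sketch below computes the \( t \) statistic and the approximate degrees of freedom for two hypothetical classes; the data and the function name `welch_t` are assumptions for illustration. The \( p \)-value itself needs the \( t \)-distribution, which is what the GDC supplies.

```python
import math
from statistics import mean, variance

def welch_t(sample1, sample2):
    """t statistic and Welch-Satterthwaite degrees of freedom (variances not pooled)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # sample variance (n - 1 divisor) over n
    v2 = variance(sample2) / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

class_a = [72, 75, 78, 80, 85]   # hypothetical scores
class_b = [65, 70, 68, 74, 73]   # hypothetical scores
t_stat, df = welch_t(class_a, class_b)
print(round(t_stat, 3), round(df, 2))
```

The non-integer degrees of freedom are typical of the non-pooled test and match what the calculator reports when Pooled is set to No.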
4. One- & Two-Tailed Tests (SL)
The alternative hypothesis \( H_1 \) decides the tail. This controls how the \( p \)-value is read off the GDC.
Two-tailed
\( H_1: \mu \neq \mu_0 \)
Tests for any difference (either direction).
One-tailed
\( H_1: \mu > \mu_0 \) or \( \mu < \mu_0 \)
Tests for a difference in one specified direction only.
▶
Set the tail on the GDC. Choose \( \neq \), \( < \) or \( > \) to match \( H_1 \) so the calculator returns the correct \( p \)-value. The \( \chi ^2 \) test has no tail to choose: the GDC always reports the upper-tail \( p \)-value, so read \( p \) directly and compare it to \( \alpha \).
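Why the tail matters can be seen with a small numerical sketch. It assumes a \( z \)-test (known \( \sigma \)) and a hypothetical test statistic of 1.8; for a symmetric distribution the two-tailed \( p \)-value is twice the one-tailed value, so the same data can lead to different decisions.

```python
from statistics import NormalDist

z = 1.8                                   # hypothetical test statistic
alpha = 0.05
upper_tail = 1 - NormalDist().cdf(z)      # H1: mu > mu0
two_tailed = 2 * upper_tail               # H1: mu != mu0
print(round(upper_tail, 4), upper_tail < alpha)   # one-tailed decision
print(round(two_tailed, 4), two_tailed < alpha)   # two-tailed decision
```

Here the one-tailed test rejects H0 at the 5% level while the two-tailed test does not, which is why the tail must be fixed by \( H_1 \) before looking at the data.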
Spearman’s \( r_s \) measures the strength of a monotonic (not necessarily linear) relationship between two ranked variables. \( -1 \leq r_s \leq 1 \).
Spearman’s rank coefficient (given in formula booklet)
\( r_s = 1 - (6 \sum d^2) / (n(n^2 - 1)) \)
\( d = \) difference between ranks, \( n = \) number of data pairs
▶
Tied ranks: If two values are tied, assign the average of the ranks they would occupy. e.g. tied for 3rd and 4th \( \to \) both get rank 3.5.
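The averaging rule for ties can be sketched as below. The scores are hypothetical and `average_ranks` is a helper name invented for this example; it ranks from largest (rank 1) downwards.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share the mean of the ranks they span."""
    ordered = sorted(values, reverse=True)
    ranks = {}
    for v in set(values):
        first = ordered.index(v) + 1      # first rank position occupied by v
        count = ordered.count(v)          # how many values are tied at v
        ranks[v] = first + (count - 1) / 2
    return [ranks[v] for v in values]

scores = [90, 85, 80, 80, 70]             # hypothetical data with a tie for 3rd and 4th
print(average_ranks(scores))              # [1.0, 2.0, 3.5, 3.5, 5.0]
```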
✗
Common error: Spearman’s tests for monotonic association, not linear. It uses ranks, not raw data. Do not confuse with Pearson’s \( r \) (which measures linear correlation).
▶
GDC shortcut: Rank each variable into two lists, then run PMCC / LinReg on the ranks: the value of \( r \) on the ranked data equals \( r_s \). Faster than the \( \sum d^2 \) formula for larger \( n \).
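The shortcut can be checked numerically. The sketch below uses hypothetical ranks with no ties and compares Pearson's \( r \) computed on the ranks with the \( \sum d^2 \) formula; the function names are illustrative only. With no ties the two values agree exactly; with tied ranks they can differ slightly.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_formula(rank_x, rank_y):
    """r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), valid when there are no tied ranks."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

rank_x = [1, 2, 3, 4, 5, 6]   # hypothetical ranks
rank_y = [2, 1, 4, 3, 6, 5]   # hypothetical ranks
print(round(pearson_r(rank_x, rank_y), 4), round(spearman_formula(rank_x, rank_y), 4))
```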
Worked example: Spearman’s \( r_s \)
Five products are ranked by price and by quality. The rank differences are \( d = 1, -1, 0, 2, -2 \), so \( \sum d^2 = 1 + 1 + 0 + 4 + 4 = 10 \), with \( n = 5 \).
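Substituting these values into the booklet formula completes the calculation:
\( r_s = 1 - \dfrac{6 \times 10}{5(5^2 - 1)} = 1 - \dfrac{60}{120} = 0.5 \)
A value of \( r_s = 0.5 \) would usually be described as a moderate positive association between the price and quality rankings.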
Conditions: Events occur singly, independently, at a constant mean rate, and with no upper limit.
Sum of independent Poissons: if \( X \sim Po(\lambda_1) \) and \( Y \sim Po(\lambda_2) \) are independent, then \( X + Y \sim Po(\lambda_1 + \lambda_2) \).
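The additive property can be checked numerically for one case. The sketch uses hypothetical rates \( \lambda_1 = 2 \) and \( \lambda_2 = 3.5 \) and compares \( P(X + Y = 4) \), built up from all the ways the two counts can total 4, with the \( Po(5.5) \) probability.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam1, lam2 = 2.0, 3.5     # hypothetical rates
k = 4
# P(X + Y = k) = sum over i of P(X = i) * P(Y = k - i), using independence
convolved = sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2) for i in range(k + 1))
direct = poisson_pmf(k, lam1 + lam2)
print(round(convolved, 6), round(direct, 6))
```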
A confidence interval for the mean of a normal population gives a range of plausible values for \( \mu \).
Interval for \( \mu \) (known \( \sigma \), or large \( n \))
\( \bar{x} - z \times \dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z \times \dfrac{\sigma}{\sqrt{n}} \)
90%: \( z = 1.645 \) 95%: \( z = 1.960 \) 99%: \( z = 2.576 \)
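The interval and the three \( z \) values can be reproduced with Python's standard library. The sample figures (\( \bar{x} = 50 \), \( \sigma = 4 \), \( n = 64 \)) are hypothetical and `z_interval` is a name chosen for this sketch.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. 0.975 quantile for 95%
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Critical z values for the common confidence levels
z_values = {level: round(NormalDist().inv_cdf(1 - (1 - level) / 2), 3)
            for level in (0.90, 0.95, 0.99)}
print(z_values)

low, high = z_interval(50, 4, 64, 0.95)             # hypothetical sample
print(round(low, 2), round(high, 2))
```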
▶
\( \sigma \) unknown? Use the \( t \)-interval (GDC TI-84 STAT → TESTS → TInterval, input \( \bar{x}, s_x, n \), C-Level). Use the \( z \)-interval only when \( \sigma \) is given.
GDC: Z-Interval
TI-84 Plus CE
[STAT] → TESTS → ZInterval
Input: Stats; \( \sigma = \) known population standard deviation; \( \bar{x} = \) sample mean; \( n = \) sample size
C-Level = 0.95 → Calculate
TI-Nspire CX II
[Menu] → Statistics → Confidence Intervals → z Interval
Enter \( \sigma , \bar{x}, n, \) C-Level → read interval
Casio fx-CG50
[MENU] → Statistics → INTR → Z → 1-Sample
Enter C-Level, \( \sigma , \bar{x}, n \) → Execute
✗
Common error: "95% confident" does NOT mean there is a 95% probability that \( \mu \) lies in this particular interval. It means that if the sampling were repeated many times, about 95% of the intervals constructed this way would contain \( \mu \).
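The repeated-sampling interpretation can be illustrated with a small simulation. All numbers here are hypothetical (a normal population with \( \mu = 50 \), \( \sigma = 4 \), samples of size 25), and the exact proportion will vary with the random seed.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(1)                                   # fixed seed so the run is repeatable
mu, sigma, n, trials = 50, 4, 25, 2000           # hypothetical population and design
margin = NormalDist().inv_cdf(0.975) * sigma / math.sqrt(n)

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    if x_bar - margin < mu < x_bar + margin:     # does this interval contain mu?
        hits += 1

coverage = hits / trials
print(coverage)                                  # should be close to 0.95
```

Each individual interval either contains \( \mu \) or it does not; the 95% describes the long-run proportion of intervals that do.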
Define a critical region from the sampling distribution under H0; reject H0 if the test statistic falls in it. Equivalent to rejecting when \( p < \alpha \).
Mean of a normal population
\( H_0: \mu = \mu_0 \)
GDC Z-Test (\( \sigma \) known) or T-Test (\( \sigma \) unknown) → read \( p \), reject if \( p < \alpha \).
Proportion (binomial)
Under \( H_0 \), \( X \sim B(n, p_0) \)
One-tailed \( p \)-value \( = P(X \geq x) \) or \( P(X \leq x) \) via binomcdf; reject if \( < \alpha \).
Mean (Poisson)
Under \( H_0 \), \( X \sim Po(\lambda_0) \)
\( p \)-value from poissoncdf in the appropriate tail; reject if \( < \alpha \).
Correlation \( \rho = 0 \) (bivariate normal)
\( H_0: \rho = 0 \)
GDC LinRegTTest on the bivariate data → read \( p \); reject “no linear correlation” if \( p < \alpha \).
▶
Critical region vs \( p \)-value: the two approaches agree. The critical value is the boundary of the rejection region; a statistic beyond it is exactly the case \( p < \alpha \).
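The agreement between the two approaches can be checked for an upper-tailed \( z \)-test. The three test statistics are hypothetical; for each one, "beyond the critical value" and "\( p < \alpha \)" give the same decision.

```python
from statistics import NormalDist

alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha)        # boundary of the upper-tail critical region

decisions = []
for z in (1.2, 1.7, 2.4):                         # hypothetical test statistics
    p = 1 - NormalDist().cdf(z)                   # upper-tail p-value
    decisions.append((z, z > critical, p < alpha))

for z, in_region, small_p in decisions:
    print(z, in_region, small_p)
```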
✗
Common error: Match the tail of the binomial/Poisson \( p \)-value to \( H_1 \). For an upper-tail test use \( P(X \geq x) = 1 - P(X \leq x-1) \), not \( 1 - P(X \leq x) \).
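The off-by-one in the upper tail is easy to see numerically. The sketch below uses hypothetical values (\( n = 20 \), \( p_0 = 0.5 \), observed \( x = 15 \) for the binomial; \( \lambda_0 = 3 \), observed \( x = 7 \) for the Poisson), with `binom_cdf` and `poisson_cdf` standing in for the GDC's binomcdf and poissoncdf.

```python
import math

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(x + 1))

n, p0, x = 20, 0.5, 15                            # hypothetical binomial test
p_upper = 1 - binom_cdf(x - 1, n, p0)             # correct: P(X >= 15)
p_wrong = 1 - binom_cdf(x, n, p0)                 # incorrect: this is P(X >= 16)
print(round(p_upper, 4), round(p_wrong, 4))

lam0, x_pois = 3, 7                               # hypothetical Poisson test
p_pois = 1 - poisson_cdf(x_pois - 1, lam0)        # P(X >= 7)
print(round(p_pois, 4))
```

Using \( 1 - P(X \leq x) \) drops the observed value itself from the tail and makes the \( p \)-value too small.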
Type I Error
Rejecting H0 when H0 is true
“False positive”: probability = significance level \( \alpha \)
Type II Error
Failing to reject H0 when H0 is false
“False negative”: probability = \( \beta \)
Power \( = 1 - \beta = \) probability of correctly rejecting a false H0. Higher power is better.
▶
Memory aid: Type I = false positIve (I for “Innocent person convicted”). Type II = false negatIIve (“Guilty person goes free”).
✗
Common error: You can NEVER “accept H0”. The correct phrasing is “fail to reject H0” or “insufficient evidence to reject H0”.
✗
Common error: Forgetting the trade-off between the two error types. For a fixed sample size, decreasing \( \alpha \) (e.g. from 5% to 1%) makes Type I errors less likely but Type II errors more likely.
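The trade-off can be made concrete for an upper-tailed \( z \)-test. Every number here is hypothetical (\( \mu_0 = 100 \), true mean 103, \( \sigma = 10 \), \( n = 25 \)), and `type_ii_error` is a name chosen for this sketch.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu1, sigma, n):
    """beta for an upper-tailed z-test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    se = sigma / math.sqrt(n)
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se   # reject H0 above this sample mean
    # Type II error: the sample mean falls below the critical value although mu = mu1
    return NormalDist(mu1, se).cdf(critical_mean)

beta_5 = type_ii_error(0.05, 100, 103, 10, 25)
beta_1 = type_ii_error(0.01, 100, 103, 10, 25)
print(round(beta_5, 3), round(1 - beta_5, 3))   # beta and power at alpha = 5%
print(round(beta_1, 3), round(1 - beta_1, 3))   # beta and power at alpha = 1%
```

With the sample size held fixed, tightening \( \alpha \) from 5% to 1% raises \( \beta \) and so lowers the power.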
10. Exam Traps & Key Reminders
✗
State hypotheses in context. Do not write generic H0/H1. e.g. “H0: Grade and gender are independent” not just “H0: the variables are independent”.
✗
Expected frequencies \( \geq 5 \). State this check explicitly in \( \chi ^2 \) tests. If violated, combine adjacent categories.
✗
Compare the \( p \)-value to \( \alpha \), not \( \chi ^2 \) to \( \alpha \). Write “\( p = 0.023 < 0.05 \) → reject H0”. The IB awards marks for this comparison step.
▶
Conclusion in context: After rejecting or failing to reject, state the conclusion in the language of the problem. “There is sufficient evidence at the 5% level that grade and gender are not independent.”
▶
Default significance level: If the question does not specify, use \( \alpha = 0.05 \) (5%). This is standard in the IB.
▶
Formula booklet: The \( \chi ^2 \) statistic formula, Spearman’s \( r_{s} \) formula, Poisson probability formula, and confidence interval formula are all given. Know when to use each one.