## MS Mod 22: $\chi^2$ tests – practice problems

Author: NEAS

(The attached PDF file has better formatting.)

### Exercise 22.1: $\chi^2$ When Parameters Are Estimated

The groups of phenotypes, R, S, and T, are in equilibrium if for some $\theta$:

●    $P(R) = p_1 = \theta^2$
●    $P(S) = p_2 = 2\theta(1-\theta)$
●    $P(T) = p_3 = (1-\theta)^2$

A sample from a population has the following number of observations in each group:

●    Group R: $n_1 = 145$
●    Group S: $n_2 = 235$
●    Group T: $n_3 = 120$

The null hypothesis $H_0$ is that the population is in equilibrium for some parameter $\theta$.

A.    What is the maximum likelihood estimate for $\theta$?
B.    What are the expected cell counts?
C.    What is the $\chi^2$ statistic to test the null hypothesis that the population is in equilibrium?
D.    What is the p value to test the null hypothesis that the population is in equilibrium?

Part A: The likelihood of the observed values given $\theta$ is

$$[p_1(\theta)]^{n_1} \times [p_2(\theta)]^{n_2} \times [p_3(\theta)]^{n_3} = [\theta^2]^{n_1} \times [2\theta(1-\theta)]^{n_2} \times [(1-\theta)^2]^{n_3} = 2^{n_2} \times \theta^{2n_1+n_2} \times (1-\theta)^{n_2+2n_3}$$

Maximizing the loglikelihood (the natural logarithm of the likelihood) with respect to $\theta$, that is, setting its derivative $(2n_1+n_2)/\theta - (n_2+2n_3)/(1-\theta)$ to zero, yields

$$\hat{\theta} = \frac{2n_1+n_2}{(2n_1+n_2)+(n_2+2n_3)} = \frac{2n_1+n_2}{2n} = \frac{2 \times 145 + 235}{2 \times 500} = 0.525$$

where $n_1 = 145$, $n_2 = 235$, and $n = n_1 + n_2 + n_3 = 500$.

Part B: The expected cell counts are

●    Group R: $500 \times 0.525^2 = 137.8125$
●    Group S: $500 \times 2 \times 0.525 \times (1 - 0.525) = 249.3750$
●    Group T: $500 \times (1 - 0.525)^2 = 112.8125$

Part C: The contributions of the groups to the $\chi^2$ statistic that tests the null hypothesis that the population is in equilibrium are

●    Group R: $(145 - 137.8125)^2 / 137.8125 = 0.374858$
●    Group S: $(235 - 249.375)^2 / 249.375 = 0.828634$
●    Group T: $(120 - 112.8125)^2 / 112.8125 = 0.457929$

The $\chi^2$ statistic is $0.374858 + 0.828634 + 0.457929 = 1.661421$.

Part D: The p value = 1 – the cumulative distribution function of the $\chi^2$ distribution with $3 - 1 - 1 = 1$ degree of freedom, evaluated at 1.661421, = 0.197411 (table lookup or spreadsheet function).

Jacob: Why are the degrees of freedom = $3 - 1 - 1 = 1$?

Rachel: The scenario has two constraints:

●    The sum of the observations in the groups = the total number of observations.
●    The expected counts by group depend on the single parameter $\theta$, which is estimated from the sample, through the proportions:
     ○    $P(R) = p_1 = \theta^2$
     ○    $P(S) = p_2 = 2\theta(1-\theta)$
     ○    $P(T) = p_3 = (1-\theta)^2$

### Exercise 22.2: Testing for a normal distribution

We draw a sample of 100 points to test whether a population is normally distributed. Before drawing the sample, we assume the population's mean $\mu$ is 8 and its standard deviation $\sigma$ is 2.

We group the sample values into five groups $(-\infty, k_1)$, $(k_1, k_2)$, $(k_2, k_3)$, $(k_3, k_4)$, $(k_4, \infty)$, which have the same expected number of observations if the population $\sim N(8, 2^2)$.

Summary statistics for the 100 sample values are $\sum x_i = 840$ and $\sum x_i^2 = 7{,}535.16$.

The numbers of sample values in the five groups are 16, 18, 19, 21, and 26.

A.    What are the values of $k_1$, $k_2$, $k_3$, and $k_4$?
B.    What is the mean of the sample?
C.    What is the standard deviation of the sample?
D.    What are the percentile bounds for the five groups using the sample mean and standard deviation?
E.    What are the expected numbers of observations in the five groups using the sample mean and the sample standard deviation for the population?
F.    What is the $\chi^2$ value to test the null hypothesis?
G.    How many degrees of freedom does the $\chi^2$ value have?
H.    What is the p value to test the null hypothesis?
Part A: If the population were $\sim N(0, 1)$, the values of $k_1$, $k_2$, $k_3$, and $k_4$ would be the 20th, 40th, 60th, and 80th percentiles of the standard normal distribution:

●    $-0.841621$
●    $-0.253347$
●    $0.253347$
●    $0.841621$

Since the population is assumed to be $\sim N(8, 2^2)$, the values of $k_1$, $k_2$, $k_3$, and $k_4$ are

●    $-0.841621 \times 2 + 8 = 6.316758$
●    $-0.253347 \times 2 + 8 = 7.493306$
●    $0.253347 \times 2 + 8 = 8.506694$
●    $0.841621 \times 2 + 8 = 9.683242$

Part B: The mean of the sample is $\sum x_i / n = 840 / 100 = 8.4$.

Part C: The variance of the sample is

$$\frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1} = \frac{7{,}535.16 - 840^2 / 100}{100 - 1} = 4.84$$

The standard deviation of the sample is $4.84^{0.5} = 2.20$.

Part D: If the population is $\sim N(8.4, 2.2^2)$, the standardized values (z-scores) of $k_1$, $k_2$, $k_3$, and $k_4$ are

●    $(6.316758 - 8.4) / 2.2 = -0.946928$
●    $(7.493306 - 8.4) / 2.2 = -0.412134$
●    $(8.506694 - 8.4) / 2.2 = 0.048497$
●    $(9.683242 - 8.4) / 2.2 = 0.583292$

The bounds for the five groups are

●    $(-\infty, -0.946928)$
●    $(-0.946928, -0.412134)$
●    $(-0.412134, 0.048497)$
●    $(0.048497, 0.583292)$
●    $(0.583292, \infty)$

Part E: The expected numbers of observations in the five groups from the sample of 100 values, where $\Phi$ is the standard normal cumulative distribution function, are

●    $100 \times \Phi(-0.946928) = 17.183763$
●    $100 \times (\Phi(-0.412134) - \Phi(-0.946928)) = 34.012070 - 17.183763 = 16.828307$
●    $100 \times (\Phi(0.048497) - \Phi(-0.412134)) = 51.934007 - 34.012070 = 17.921936$
●    $100 \times (\Phi(0.583292) - \Phi(0.048497)) = 72.015164 - 51.934007 = 20.081157$
●    $100 \times (1 - \Phi(0.583292)) = 100 - 72.015164 = 27.984836$

Part F: The contribution of each group to the $\chi^2$ statistic is

●    $(16 - 17.183763)^2 / 17.183763 = 0.081548$
●    $(18 - 16.828307)^2 / 16.828307 = 0.081581$
●    $(19 - 17.921936)^2 / 17.921936 = 0.064849$
●    $(21 - 20.081157)^2 / 20.081157 = 0.042043$
●    $(26 - 27.984836)^2 / 27.984836 = 0.140775$

The $\chi^2$ statistic used to test the null hypothesis that the population is normally distributed =

$$0.081548 + 0.081581 + 0.064849 + 0.042043 + 0.140775 = 0.410796$$

Part G: The $\chi^2$ value has $5 - 1 = 4$ degrees of freedom: 5 groups – 1 constraint (the sum of the observations in the five groups = the total observations).
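As a numerical check on Parts A through F of Exercise 22.2, the steps can be sketched in Python using only the standard library. This is a minimal sketch, not part of the original solution: the variable names (`cuts`, `fitted`, `expected`, `chi_sq`) are illustrative choices, and `statistics.NormalDist` stands in for the table lookup or spreadsheet function the exercise mentions.

```python
from statistics import NormalDist

# Figures taken from Exercise 22.2.
n = 100
sum_x, sum_x2 = 840.0, 7535.16
observed = [16, 18, 19, 21, 26]

# Part A: boundaries that split the assumed N(8, 2^2) population
# into five groups of equal probability (20th to 80th percentiles).
assumed = NormalDist(mu=8.0, sigma=2.0)
cuts = [assumed.inv_cdf(q) for q in (0.2, 0.4, 0.6, 0.8)]

# Parts B and C: sample mean and sample standard deviation
# from the summary statistics.
mean = sum_x / n
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sd = variance ** 0.5

# Parts D and E: cumulative probabilities of the boundaries under the
# fitted normal, differenced to get the expected count in each group.
fitted = NormalDist(mu=mean, sigma=sd)
cum = [0.0] + [fitted.cdf(k) for k in cuts] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cum, cum[1:])]

# Part F: chi-squared statistic, summed over the five groups.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print("boundaries:", [round(k, 6) for k in cuts])
print("mean, sd:", round(mean, 4), round(sd, 4))
print("expected:", [round(e, 6) for e in expected])
print("chi-squared:", round(chi_sq, 6))
```

The printed values can be compared line by line with the figures worked out by hand; small differences in the last decimal place would come from rounding of the intermediate z-scores.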
Part H: The p value is 1 – the cumulative distribution function of the $\chi^2$ distribution with 4 degrees of freedom at 0.410796 = 0.981584 (table lookup or spreadsheet function).

Question: Why is the p value so high?

Answer: The actual numbers of observations by group are close to the expected numbers of observations. The slight differences presumably stem from rounding and random fluctuations. The total $\chi^2$ is much less than the degrees of freedom, so we do not reject the null hypothesis that the population is normally distributed.

### Exercise 22.3: Phenotypes

The expected proportions of subjects with four phenotypes are 9/16, 3/16, 3/16, and 1/16.

The observed values in an experiment are 895, 280, 305, and 120.

A.    What are the expected values in each cell?
B.    What is the $\chi^2$ value to test the null hypothesis?
C.    What are the degrees of freedom?
D.    What is the p value to test the null hypothesis?

Part A: The total subjects = 895 + 280 + 305 + 120 = 1600

The expected counts in the four groups are

1.    9/16 × 1600 = 900
2.    3/16 × 1600 = 300
3.    3/16 × 1600 = 300
4.    1/16 × 1600 = 100

Part B: The $\chi^2$ value is the sum of the contributions from the four groups, which are

1.    $(895 - 900)^2 / 900 = 0.0278$
2.    $(280 - 300)^2 / 300 = 1.3333$
3.    $(305 - 300)^2 / 300 = 0.0833$
4.    $(120 - 100)^2 / 100 = 4.0000$

The $\chi^2$ value is $0.0278 + 1.3333 + 0.0833 + 4.0000 = 5.4444$.

Part C: The $\chi^2$ test has four cells and one constraint (the total actual values = the total expected values), so the degrees of freedom = $4 - 1 = 3$.

Part D: The p value = 1 – $\chi^2$ cdf(5.4444, 3) = 0.142 (table lookup or spreadsheet function).

Attachments: MS Module 22 chisq tests – practice problems df.pdf (70.00 KB)
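Exercises 22.1 and 22.3 follow the same recipe: expected counts from the hypothesized proportions, a sum of (observed – expected)² / expected, degrees of freedom equal to the number of cells minus one minus the number of estimated parameters, and an upper-tail $\chi^2$ probability. The sketch below is one way to check the arithmetic in Python with only the standard library. The helper names `chi2_sf` and `chi2_gof` and the `n_estimated` parameter are hypothetical, introduced here for illustration; the upper-tail probability is computed from the standard recurrence for the regularized incomplete gamma function at integer and half-integer arguments, in place of the table lookup or spreadsheet function the exercises mention.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(stat, df):
    """Upper-tail probability P(X > stat) for chi-squared X with integer df >= 1."""
    x = stat / 2.0
    # Closed-form starting point: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, exp(-x)
    else:
        a, q = 0.5, erfc(sqrt(x))
    # Recurrence Q(a + 1, x) = Q(a, x) + x^a e^(-x) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2.0:
        q += x ** a * exp(-x) / gamma(a + 1.0)
        a += 1.0
    return q


def chi2_gof(observed, probs, n_estimated=0):
    """Return (expected counts, chi-squared statistic, df, p value)."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_estimated
    return expected, stat, df, chi2_sf(stat, df)


# Exercise 22.3: fully specified proportions, no estimated parameters.
exp3, stat3, df3, p3 = chi2_gof([895, 280, 305, 120],
                                [9 / 16, 3 / 16, 3 / 16, 1 / 16])

# Exercise 22.1: theta is estimated by maximum likelihood first,
# which costs one additional degree of freedom.
n1, n2, n3 = 145, 235, 120
theta = (2 * n1 + n2) / (2 * (n1 + n2 + n3))
probs1 = [theta ** 2, 2 * theta * (1 - theta), (1 - theta) ** 2]
exp1, stat1, df1, p1 = chi2_gof([n1, n2, n3], probs1, n_estimated=1)

print("22.3:", exp3, round(stat3, 4), df3, round(p3, 3))
print("22.1:", theta, exp1, round(stat1, 6), df1, round(p1, 6))
```

The same `chi2_sf` helper can be used to check Part H of Exercise 22.2 by evaluating it at the statistic 0.410796 with 4 degrees of freedom.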