## Module 8: variances and means (practice problems)

Author: NEAS

(The attached PDF file, "Regression analysis variances and means pps df.pdf", has better formatting.)

** Exercise 8.1: Variances and means

A regression equation of Y on X, $Y_i = \alpha + \beta \times X_i + \varepsilon_i$, with N = 5 observations, has

- RSS, the residual sum of squares, = 5.10
- $\sigma^2_B$, the variance of B, the ordinary least squares estimator of $\beta$, = 0.17
- $\sigma^2_A$, the variance of A, the ordinary least squares estimator of $\alpha$, = 1.87

A. What is $\sigma^2_\varepsilon$, the variance of the error term?
B. What is $\sum x_i^2$, the sum of the squared x values?
C. What is $\sum (x_i - \bar{x})^2$, the sum of the squared x residuals (deviations from the mean)?
D. What is $\bar{x}$, the mean of the X values?

Part A: $\sigma^2_\varepsilon$ = RSS / degrees of freedom of the regression equation. The degrees of freedom are N – k – 1, where k is the number of explanatory variables: 5.1 / (5 – 1 – 1) = 1.700.

Part B: The variance of B is $\sigma^2_B = \sigma^2_\varepsilon / \sum (x_i - \bar{x})^2$, and the variance of A is $\sigma^2_A = [\sigma^2_\varepsilon / \sum (x_i - \bar{x})^2] \times [\sum x_i^2 / N] = \sigma^2_B \times \sum x_i^2 / N$. Solving for the sum of squares:

$\sum x_i^2 = N \times \sigma^2_A / \sigma^2_B = 5 \times 1.87 / 0.17 = 55$.

Part C: $\sum (x_i - \bar{x})^2 = \sigma^2_\varepsilon / \sigma^2_B = 1.70 / 0.17 = 10$.

Part D: $\sum (x_i - \bar{x})^2 = \sum x_i^2 - 2\bar{x} \sum x_i + N\bar{x}^2 = \sum x_i^2 - N\bar{x}^2$, since $\sum x_i = N\bar{x}$, so

$\bar{x} = \{ [\sum x_i^2 - \sum (x_i - \bar{x})^2] / N \}^{1/2} = \{ [55 - 10] / 5 \}^{1/2} = 3$.

(These quantities determine only $\bar{x}^2$, so the sign of $\bar{x}$ is not fixed; the answer takes the positive root.)

** Exercise 8.2: Sampling variance

A regression has N observations, with an error variance of $\sigma^2_\varepsilon$ (the square of the standard error of the regression) and a variance of the explanatory variable of $S^2_x$.

Explain how the following affect the sampling variances of the slope estimate B and the intercept estimate A.

A. $\sigma^2_\varepsilon$
B. The sample size N
C. The variance of the explanatory variable, $S^2_x$
D. The closeness of the X values to zero

The formulas for the sampling variances are

$\sigma^2_B = \sigma^2_\varepsilon / \sum (x_i - \bar{x})^2 = \sigma^2_\varepsilon / ((N - 1) \times S^2_x)$

$\sigma^2_A = \sigma^2_B \times (\sum x_i^2 / N)$

Part A: As $\sigma^2_\varepsilon$ increases, $\sigma^2_B$ and $\sigma^2_A$ increase. Intuition: As the standard error of the regression increases, the estimates of the regression coefficients are less certain.

Part B: As N increases, $\sigma^2_B$ and $\sigma^2_A$ decrease. Intuition: With only a few observed values, the estimated regression line is uncertain. Both the slope and the intercept may be distorted by one or two outlying values. With more observed values, the estimated regression line is more certain.

Part C: As $S^2_x$ increases, $\sigma^2_B$ and $\sigma^2_A$ decrease. Intuition: The regression line passes through $(\bar{x}, \bar{y})$, the means of the X and Y values. Think of the regression line as a bar hinged at the point $(\bar{x}, \bar{y})$ but with unknown slope. Random fluctuations in the observed values of the response variable Y may distort the slope. If the X values are widely dispersed, some of them are far from the mean $\bar{x}$. An incorrect slope coefficient causes a large squared error at those points, so incorrect slope coefficients are less likely. An incorrect slope coefficient also causes an error in the intercept, so if the slope coefficient is more accurate, so is the intercept.

Part D: The closeness of the X values to zero has no effect on $\sigma^2_B$. B, the estimate of the slope coefficient $\beta$, depends on $(x_i - \bar{x})$, not on $x_i$, so adding a constant to all the x values doesn't change B. But if $\bar{x}$ is far from zero, an error in the slope coefficient greatly affects the intercept. As $\sum x_i^2 / N$ increases, $\sigma^2_A$ increases. Intuition: If $\bar{x}$ = 0, the intercept is $\bar{y}$, with no uncertainty arising from the slope: no matter what value B has, A is $\bar{y}$, and the only remaining variance is that of $\bar{y}$ itself, $\sigma^2_\varepsilon / N$. If $\bar{x}$ is 100, an error of k in the estimate of B causes an error of 100 × k in the estimate of A.

** Exercise 8.3: Standard errors of ordinary least squares estimators for $\alpha$ (A) and $\beta$ (B)

A statistician uses a regression on the X values {-1, -0.9, -0.8, -0.7, ..., -0.1, 0, 0.1, …, 0.7, 0.8, 0.9, 1} to test null hypotheses that $\alpha$ = 0 and that $\beta$ = 0. The ordinary least squares estimators of $\alpha$ and $\beta$ are both 1.000.

A. Which estimator has the higher standard error?
B. Which estimator has the higher t-value?
C. Which estimator has the higher p-value for the test of the null hypothesis?

Part A: We don't know the standard errors of A or B, since we don't know the standard error of the regression, $s_\varepsilon$. But we know the ratio of these standard errors: $\sigma^2_A = \sigma^2_B \times (\sum x_i^2 / N)$. N (the number of data points) = 21, and $\sum x_i^2$ = 7.7, so $\sum x_i^2 / N$ = 7.7 / 21 = 0.367. The variance of A is 0.367 times the variance of B, so the standard error of A is $\sqrt{0.367} \approx 0.606$ times the standard error of B. B has the higher standard error.

Part B: The t-value is the regression coefficient divided by its standard error. The two estimates are equal (both 1.000) and B has the higher standard error, so A has the higher t-value.

Part C: A higher t-value means a more significant coefficient and so a lower p-value. B has the higher p-value.
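The arithmetic in Exercises 8.1 and 8.3 can be checked numerically. The following is a minimal sketch in plain Python, not part of the original post; the function names `exercise_8_1` and `se_ratio_a_to_b` are hypothetical and chosen only for illustration. It assumes simple linear regression (k = 1 explanatory variable) and uses only the formulas quoted in the worked answers.

```python
import math


def exercise_8_1(n, rss, var_b, var_a, k=1):
    """Back out the quantities asked for in Exercise 8.1.

    Returns (error variance, sum of x^2, sum of squared x deviations, |mean of x|).
    Only the magnitude of the mean is recoverable from these inputs.
    """
    var_eps = rss / (n - k - 1)        # RSS / degrees of freedom
    sum_x2 = n * var_a / var_b         # from var_A = var_B * sum(x^2) / N
    sxx = var_eps / var_b              # from var_B = var_eps / sum((x - xbar)^2)
    abs_x_bar = math.sqrt((sum_x2 - sxx) / n)  # from sxx = sum(x^2) - N * xbar^2
    return var_eps, sum_x2, sxx, abs_x_bar


def se_ratio_a_to_b(xs):
    """Ratio of standard errors sigma_A / sigma_B = sqrt(sum(x^2) / N)."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))


# Exercise 8.1 inputs as stated in the problem.
var_eps, sum_x2, sxx, abs_x_bar = exercise_8_1(n=5, rss=5.10, var_b=0.17, var_a=1.87)
print(round(var_eps, 3), round(sum_x2, 3), round(sxx, 3), round(abs_x_bar, 3))

# Exercise 8.3: X values from -1 to 1 in steps of 0.1 (21 points).
xs_8_3 = [i / 10 for i in range(-10, 11)]
ratio = se_ratio_a_to_b(xs_8_3)
print(len(xs_8_3), round(sum(x * x for x in xs_8_3), 3), round(ratio, 3))
```

A ratio below 1 in the second check means the intercept estimator A has the smaller standard error, which is the conclusion of Exercise 8.3, Part A.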