## Fox Module 16 analysis of variance: explanation of Duncan’s prestige...

NEAS

Fox Module 16 analysis of variance: explanation of Duncan’s prestige data

(The attached PDF file has better formatting.)

This file explains one-way analysis of variance on pages 147-148 of the Fox textbook. Final exam problems compute $R^2$ and F-statistics from the TSS, RSS, and RegSS.

Chapter 8 of the Fox textbook uses Duncan’s prestige data to illustrate a one-way ANOVA analysis. The data shows 45 occupations, with four attributes:

Type of occupation: professional or managerial, white collar, and blue collar.

Average income

Education

Measure of prestige

Income, education, and prestige are scaled from 0 to 100.

The table below shows the data by occupation. The Excel workbook attached to this posting shows all the figures for the one-way analysis of variance. The computations are straightforward; the final exam problems test these computations on a small data set.

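| Occupation | Type | Income | Education | Prestige |
|---|---|---|---|---|
| accountant | prof | 62 | 86 | 82 |
| pilot | prof | 72 | 76 | 83 |
| architect | prof | 75 | 92 | 90 |
| author | prof | 55 | 90 | 76 |
| chemist | prof | 64 | 86 | 90 |
| minister | prof | 21 | 84 | 87 |
| professor | prof | 64 | 93 | 93 |
| dentist | prof | 80 | 100 | 90 |
| reporter | wc | 67 | 87 | 52 |
| engineer | prof | 72 | 86 | 88 |
| undertaker | prof | 42 | 74 | 57 |
| lawyer | prof | 76 | 98 | 89 |
| physician | prof | 76 | 97 | 97 |
| welfare.worker | prof | 41 | 84 | 59 |
| teacher | prof | 48 | 91 | 73 |
| conductor | wc | 76 | 34 | 38 |
| contractor | prof | 53 | 45 | 76 |
| factory.owner | prof | 60 | 56 | 81 |
| store.manager | prof | 42 | 44 | 45 |
| banker | prof | 78 | 82 | 92 |
| bookkeeper | wc | 29 | 72 | 39 |
| mail.carrier | wc | 48 | 55 | 34 |
| insurance.agent | wc | 55 | 71 | 41 |
| store.clerk | wc | 29 | 50 | 16 |
| carpenter | bc | 21 | 23 | 33 |
| electrician | bc | 47 | 39 | 53 |
| RR.engineer | bc | 81 | 28 | 67 |
| machinist | bc | 36 | 32 | 57 |
| auto.repairman | bc | 22 | 22 | 26 |
| plumber | bc | 44 | 25 | 29 |
| gas.stn.attendant | bc | 15 | 29 | 10 |
| coal.miner | bc | 7 | 7 | 15 |
| streetcar.motorman | bc | 42 | 26 | 19 |
| taxi.driver | bc | 9 | 19 | 10 |
| truck.driver | bc | 21 | 15 | 13 |
| machine.operator | bc | 21 | 20 | 24 |
| barber | bc | 16 | 26 | 20 |
| bartender | bc | 16 | 28 | 7 |
| shoe.shiner | bc | 9 | 17 | 3 |
| cook | bc | 14 | 22 | 16 |
| soda.clerk | bc | 12 | 30 | 6 |
| watchman | bc | 17 | 25 | 11 |
| janitor | bc | 7 | 20 | 8 |
| policeman | bc | 34 | 47 | 41 |
| waiter | bc | 8 | 32 | 10 |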

The one-way ANOVA analysis tests whether the type of occupation affects prestige. Professional occupations have higher prestige than white collar or blue collar; we test if the differences are statistically significant.

Jacob: Prestige depends on education and income, not type of occupation. The highest blue collar prestige level (67) is for RR engineer, which also has the highest blue collar income (81). The highest professional prestige level is for physicians (97), who have the third highest education. In Duncan’s study, dentists and lawyers have higher education, but this is probably measurement error: medical school along with internship and residency is longer for doctors than for dentists or lawyers.

Rachel: You are correct; the full ANOVA analysis considers also income and education. One-way ANOVA considers a simpler question: does prestige differ by type of occupation? Our goal is to explain the statistical technique. Education and income affect prestige, and this simple analysis is not complete.

The explanatory variable is type of occupation; the response variable is prestige. Regression analysis assumes the response variable has a normal distribution. But prestige is a value from 0 to 100; it does not have a normal distribution. We transform prestige to logit(prestige / 100). The transformed response variable is closer to a normal distribution.

Jacob: How do we test if the response variable has a normal distribution?

Rachel: We use QQ plots. The QQ plot for prestige is thin-tailed; the QQ plot for logit(prestige / 100) fits better to a normal distribution.
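As a rough illustration of what a normal QQ plot compares, the sketch below pairs the sorted, standardized sample values with standard normal quantiles. The function name `qq_points` and the plotting positions (i - 0.5) / n are assumptions made for this example (one common convention), not necessarily what Fox or the attached workbook uses.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Return (theoretical, sample) quantile pairs for a normal QQ plot.

    Hypothetical helper: sample values are standardized, and the
    theoretical quantiles use plotting positions (i - 0.5) / n.
    """
    n = len(values)
    m, s = mean(values), stdev(values)
    standard_normal = NormalDist()
    # Theoretical standard normal quantile for each ordered observation
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Sorted sample values on the same standardized scale
    sample = [(v - m) / s for v in sorted(values)]
    return list(zip(theoretical, sample))

# The six white collar prestige scores from the data table
points = qq_points([52, 38, 39, 34, 41, 16])
```

Points that fall near the 45-degree line suggest the sample is close to normal; systematic curvature at the ends indicates tails that are thinner or heavier than normal.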

The textbook shows the computations for both prestige and logit(prestige / 100).

The overall mean prestige is 47.68889. The mean prestige by type of occupation is

Professional: 80.44444

White collar: 36.66667

Blue collar: 22.76190

The total sum of squares (TSS) is the sum, over all occupations, of the square of (prestige minus the overall average prestige). For accountants, this term is $(82 - 47.68889)^2$ = 1,177.25.

The residual sum of squares (RSS) is the sum of the square of (prestige minus the average prestige for that type of occupation). For accountants, this term is $(82 - 80.44444)^2$ = 2.42.

The regression sum of squares (RegSS) is the sum of the square of (average prestige for the type of occupation minus the overall average prestige). For accountants, this term is $(80.44444 - 47.68889)^2$ = 1,072.93.
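The three definitions can be written compactly. The notation here is an assumption of this note, not necessarily Fox's: $y_{ij}$ is the prestige of occupation $i$ in type $j$, $n_j$ is the number of occupations in type $j$, $\bar{y}_j$ is the mean prestige for type $j$, and $\bar{y}$ is the overall mean.

$$
\begin{aligned}
\text{TSS} &= \sum_{j} \sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2 \\
\text{RSS} &= \sum_{j} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 \\
\text{RegSS} &= \sum_{j} n_j (\bar{y}_j - \bar{y})^2
\end{aligned}
$$

Each occupation contributes one term to each sum; the RegSS term is the same for every occupation of a given type, which is why the group mean deviation is multiplied by $n_j$.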

| Occupation | Type | Income | Education | TSS | Prestige | Mean prestige (type) | RSS | RegSS |
|---|---|---|---|---|---|---|---|---|
| accountant | prof | 62 | 86 | 1,177.25 | 82 | 80.4444 | 2.42 | 1,072.93 |
| pilot | prof | 72 | 76 | 1,246.87 | 83 | 80.4444 | 6.53 | 1,072.93 |
| architect | prof | 75 | 92 | 1,790.23 | 90 | 80.4444 | 91.31 | 1,072.93 |
| author | prof | 55 | 90 | 801.52 | 76 | 80.4444 | 19.75 | 1,072.93 |
| chemist | prof | 64 | 86 | 1,790.23 | 90 | 80.4444 | 91.31 | 1,072.93 |
| minister | prof | 21 | 84 | 1,545.36 | 87 | 80.4444 | 42.98 | 1,072.93 |
| professor | prof | 64 | 93 | 2,053.10 | 93 | 80.4444 | 157.64 | 1,072.93 |
| dentist | prof | 80 | 100 | 1,790.23 | 90 | 80.4444 | 91.31 | 1,072.93 |
| reporter | wc | 67 | 87 | 18.59 | 52 | 36.6667 | 235.11 | 121.49 |
| engineer | prof | 72 | 86 | 1,624.99 | 88 | 80.4444 | 57.09 | 1,072.93 |
| undertaker | prof | 42 | 74 | 86.70 | 57 | 80.4444 | 549.64 | 1,072.93 |
| lawyer | prof | 76 | 98 | 1,706.61 | 89 | 80.4444 | 73.20 | 1,072.93 |
| physician | prof | 76 | 97 | 2,431.59 | 97 | 80.4444 | 274.09 | 1,072.93 |
| welfare.worker | prof | 41 | 84 | 127.94 | 59 | 80.4444 | 459.86 | 1,072.93 |
| teacher | prof | 48 | 91 | 640.65 | 73 | 80.4444 | 55.42 | 1,072.93 |
| conductor | wc | 76 | 34 | 93.87 | 38 | 36.6667 | 1.78 | 121.49 |
| contractor | prof | 53 | 45 | 801.52 | 76 | 80.4444 | 19.75 | 1,072.93 |
| factory.owner | prof | 60 | 56 | 1,109.63 | 81 | 80.4444 | 0.31 | 1,072.93 |
| store.manager | prof | 42 | 44 | 7.23 | 45 | 80.4444 | 1,256.31 | 1,072.93 |
| banker | prof | 78 | 82 | 1,963.47 | 92 | 80.4444 | 133.53 | 1,072.93 |
| bookkeeper | wc | 29 | 72 | 75.50 | 39 | 36.6667 | 5.44 | 121.49 |
| mail.carrier | wc | 48 | 55 | 187.39 | 34 | 36.6667 | 7.11 | 121.49 |
| insurance.agent | wc | 55 | 71 | 44.74 | 41 | 36.6667 | 18.78 | 121.49 |
| store.clerk | wc | 29 | 50 | 1,004.19 | 16 | 36.6667 | 427.11 | 121.49 |
| carpenter | bc | 21 | 23 | 215.76 | 33 | 22.7619 | 104.82 | 621.35 |
| electrician | bc | 47 | 39 | 28.21 | 53 | 22.7619 | 914.34 | 621.35 |
| RR.engineer | bc | 81 | 28 | 372.92 | 67 | 22.7619 | 1,957.01 | 621.35 |
| machinist | bc | 36 | 32 | 86.70 | 57 | 22.7619 | 1,172.25 | 621.35 |
| auto.repairman | bc | 22 | 22 | 470.41 | 26 | 22.7619 | 10.49 | 621.35 |
| plumber | bc | 44 | 25 | 349.27 | 29 | 22.7619 | 38.91 | 621.35 |
| gas.stn.attendant | bc | 15 | 29 | 1,420.45 | 10 | 22.7619 | 162.87 | 621.35 |
| coal.miner | bc | 7 | 7 | 1,068.56 | 15 | 22.7619 | 60.25 | 621.35 |
| streetcar.motorman | bc | 42 | 26 | 823.05 | 19 | 22.7619 | 14.15 | 621.35 |
| taxi.driver | bc | 9 | 19 | 1,420.45 | 10 | 22.7619 | 162.87 | 621.35 |
| truck.driver | bc | 21 | 15 | 1,203.32 | 13 | 22.7619 | 95.29 | 621.35 |
| machine.operator | bc | 21 | 20 | 561.16 | 24 | 22.7619 | 1.53 | 621.35 |
| barber | bc | 16 | 26 | 766.67 | 20 | 22.7619 | 7.63 | 621.35 |
| bartender | bc | 16 | 28 | 1,655.59 | 7 | 22.7619 | 248.44 | 621.35 |
| shoe.shiner | bc | 9 | 17 | 1,997.10 | 3 | 22.7619 | 390.53 | 621.35 |
| cook | bc | 14 | 22 | 1,004.19 | 16 | 22.7619 | 45.72 | 621.35 |
| soda.clerk | bc | 12 | 30 | 1,737.96 | 6 | 22.7619 | 280.96 | 621.35 |
| watchman | bc | 17 | 25 | 1,346.07 | 11 | 22.7619 | 138.34 | 621.35 |
| janitor | bc | 7 | 20 | 1,575.21 | 8 | 22.7619 | 217.91 | 621.35 |
| policeman | bc | 34 | 47 | 44.74 | 41 | 22.7619 | 332.63 | 621.35 |
| waiter | bc | 8 | 32 | 1,420.45 | 10 | 22.7619 | 162.87 | 621.35 |
| Total / average | | | | 43,687.64 | 47.6889 | | 10,597.59 | 33,090.05 |

The total / average row shows that TSS (43,687.64) = RSS (10,597.59) + RegSS (33,090.05).
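The decomposition can be checked with a short script. This is a minimal sketch: the prestige scores are copied from the data table and grouped by type, and the names `PRESTIGE` and `sums_of_squares` are illustrative, not taken from the attached workbook.

```python
# Prestige scores from Duncan's data, grouped by type of occupation
PRESTIGE = {
    "prof": [82, 83, 90, 76, 90, 87, 93, 90, 88, 57, 89, 97, 59, 73, 76, 81, 45, 92],
    "wc": [52, 38, 39, 34, 41, 16],
    "bc": [33, 53, 67, 57, 26, 29, 10, 15, 19, 10, 13, 24, 20, 7, 3, 16, 6, 11, 8, 41, 10],
}

def sums_of_squares(groups):
    """Return (TSS, RSS, RegSS) for a one-way layout given as {group: [values]}."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)
    # TSS: squared deviations from the overall mean
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    rss = 0.0
    regss = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        # RSS: squared deviations from the group's own mean
        rss += sum((y - group_mean) ** 2 for y in ys)
        # RegSS: one identical term per observation in the group
        regss += len(ys) * (group_mean - grand_mean) ** 2
    return tss, rss, regss

tss, rss, regss = sums_of_squares(PRESTIGE)
print(f"TSS={tss:.2f}  RSS={rss:.2f}  RegSS={regss:.2f}")
```

The printed values should agree with the total row of the table up to rounding in the last digit, and `tss` equals `rss + regss` up to floating-point error.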

The prestige scores do not have a normal distribution. For a better ANOVA analysis, Fox uses the logit of the prestige scores divided by 100. Let Pr = prestige / 100, so $\text{logit}(Pr) = \ln\left(Pr / (1 - Pr)\right)$. We do not show the analysis of variance table for unadjusted prestige levels, though you can compute it easily from the last row of the table.

Logit of (Prestige / 100)

We form the same table using logit(prestige / 100). The attached Excel workbook has the same figures.

| Occupation | Type | Income | Education | TSS | Prestige | logit(Pr) | Mean logit (type) | RegSS | RSS |
|---|---|---|---|---|---|---|---|---|---|
| accountant | prof | 62 | 86 | 2.66960 | 82 | 1.5163 | 1.632114 | 3.06130 | 0.01340 |
| pilot | prof | 72 | 76 | 2.90079 | 83 | 1.5856 | 1.632114 | 3.06130 | 0.00216 |
| architect | prof | 75 | 92 | 5.35815 | 90 | 2.1972 | 1.632114 | 3.06130 | 0.31935 |
| author | prof | 55 | 90 | 1.61347 | 76 | 1.1527 | 1.632114 | 3.06130 | 0.22986 |
| chemist | prof | 64 | 86 | 5.35815 | 90 | 2.1972 | 1.632114 | 3.06130 | 0.31935 |
| minister | prof | 21 | 84 | 4.07435 | 87 | 1.9010 | 1.632114 | 3.06130 | 0.07228 |
| professor | prof | 64 | 93 | 7.31287 | 93 | 2.5867 | 1.632114 | 3.06130 | 0.91121 |
| dentist | prof | 80 | 100 | 5.35815 | 90 | 2.1972 | 1.632114 | 3.06130 | 0.31935 |
| reporter | wc | 67 | 87 | 0.03904 | 52 | 0.0800 | -0.590384 | 0.22358 | 0.44947 |
| engineer | prof | 72 | 86 | 4.45199 | 88 | 1.9924 | 1.632114 | 3.06130 | 0.12983 |
| undertaker | prof | 42 | 74 | 0.15952 | 57 | 0.2819 | 1.632114 | 3.06130 | 1.82321 |
| lawyer | prof | 76 | 98 | 4.87652 | 89 | 2.0907 | 1.632114 | 3.06130 | 0.21034 |
| physician | prof | 76 | 97 | 12.91426 | 97 | 3.4761 | 1.632114 | 3.06130 | 3.40028 |
| welfare.worker | prof | 41 | 84 | 0.23185 | 59 | 0.3640 | 1.632114 | 3.06130 | 1.60820 |
| teacher | prof | 48 | 91 | 1.23691 | 73 | 0.9946 | 1.632114 | 3.06130 | 0.40640 |
| conductor | wc | 76 | 34 | 0.13839 | 38 | -0.4895 | -0.590384 | 0.22358 | 0.01017 |
| contractor | prof | 53 | 45 | 1.61347 | 76 | 1.1527 | 1.632114 | 3.06130 | 0.22986 |
| factory.owner | prof | 60 | 56 | 2.45722 | 81 | 1.4500 | 1.632114 | 3.06130 | 0.03316 |
| store.manager | prof | 42 | 44 | 0.00691 | 45 | -0.2007 | 1.632114 | 3.06130 | 3.35910 |
| banker | prof | 78 | 82 | 6.55304 | 92 | 2.4423 | 1.632114 | 3.06130 | 0.65648 |
| bookkeeper | wc | 29 | 72 | 0.10875 | 39 | -0.4473 | -0.590384 | 0.22358 | 0.02047 |
| mail.carrier | wc | 48 | 55 | 0.29784 | 34 | -0.6633 | -0.590384 | 0.22358 | 0.00532 |
| insurance.agent | wc | 55 | 71 | 0.06072 | 41 | -0.3640 | -0.590384 | 0.22358 | 0.05127 |
| store.clerk | wc | 29 | 50 | 2.37371 | 16 | -1.6582 | -0.590384 | 0.22358 | 1.14029 |
| carpenter | bc | 21 | 23 | 0.34886 | 33 | -0.7082 | -1.482151 | 1.86216 | 0.59902 |
| electrician | bc | 47 | 39 | 0.05650 | 53 | 0.1201 | -1.482151 | 1.86216 | 2.56735 |
| RR.engineer | bc | 81 | 28 | 0.68183 | 67 | 0.7082 | -1.482151 | 1.86216 | 4.79757 |
| machinist | bc | 36 | 32 | 0.15952 | 57 | 0.2819 | -1.482151 | 1.86216 | 3.11171 |
| auto.repairman | bc | 22 | 22 | 0.86197 | 26 | -1.0460 | -1.482151 | 1.86216 | 0.19026 |
| plumber | bc | 44 | 25 | 0.60504 | 29 | -0.8954 | -1.482151 | 1.86216 | 0.34430 |
| gas.stn.attendant | bc | 15 | 29 | 4.32508 | 10 | -2.1972 | -1.482151 | 1.86216 | 0.51133 |
| coal.miner | bc | 7 | 7 | 2.61488 | 15 | -1.7346 | -1.482151 | 1.86216 | 0.06373 |
| streetcar.motorman | bc | 42 | 26 | 1.77547 | 19 | -1.4500 | -1.482151 | 1.86216 | 0.00103 |
| taxi.driver | bc | 9 | 19 | 4.32508 | 10 | -2.1972 | -1.482151 | 1.86216 | 0.51133 |
| truck.driver | bc | 21 | 15 | 3.18057 | 13 | -1.9010 | -1.482151 | 1.86216 | 0.17540 |
| machine.operator | bc | 21 | 20 | 1.07151 | 24 | -1.1527 | -1.482151 | 1.86216 | 0.10855 |
| barber | bc | 16 | 26 | 1.60973 | 20 | -1.3863 | -1.482151 | 1.86216 | 0.00919 |
| bartender | bc | 16 | 28 | 6.09668 | 7 | -2.5867 | -1.482151 | 1.86216 | 1.22000 |
| shoe.shiner | bc | 9 | 17 | 11.27990 | 3 | -3.4761 | -1.482151 | 1.86216 | 3.97583 |
| cook | bc | 14 | 22 | 2.37371 | 16 | -1.6582 | -1.482151 | 1.86216 | 0.03100 |
| soda.clerk | bc | 12 | 30 | 6.93792 | 6 | -2.7515 | -1.482151 | 1.86216 | 1.61134 |
| watchman | bc | 17 | 25 | 3.89351 | 11 | -2.0907 | -1.482151 | 1.86216 | 0.37038 |
| janitor | bc | 7 | 20 | 5.40471 | 8 | -2.4423 | -1.482151 | 1.86216 | 0.92198 |
| policeman | bc | 34 | 47 | 0.06072 | 41 | -0.3640 | -1.482151 | 1.86216 | 1.25034 |
| waiter | bc | 8 | 32 | 4.32508 | 10 | -2.1972 | -1.482151 | 1.86216 | 0.51133 |
| Total / avg | | | | 134.15390 | 47.689 | -0.1175 | | 95.55014 | 38.60376 |

These tables show how Fox calculated the figures on page 148. Fox doesn’t show all the work; the tables here show all the computations.

The logit of Prestige/ 100 for accountants is ln(0.82 / (1 – 0.82) ) = 1.51635.
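The transformation for a single occupation can be sketched as follows; the function name `logit` and the range check are choices made for this example.

```python
from math import log

def logit(p):
    """Log-odds of a proportion p, defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("logit is defined only for 0 < p < 1")
    return log(p / (1.0 - p))

# Accountants have prestige 82, so Pr = 0.82
accountant_logit = logit(82 / 100)
print(round(accountant_logit, 5))  # 1.51635
```

A prestige score of exactly 0 or 100 would have no finite logit; no occupation in Duncan’s data reaches either bound (the scores range from 3 to 97).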

The average logit for all occupations is –0.11754.

The average logit by type of occupation is 1.6321 for professional, –0.5904 for white collar, and –1.4822 for blue collar.

The regression sum of squares (RegSS) term for accountants is $(1.6321 - (-0.11754))^2$ = 3.06124.

The residual sum of squares (RSS) term for accountants is $(1.6321 - 1.5163)^2$ = 0.01341.
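The averages and totals on the logit scale can be reproduced from the prestige scores alone. This is a sketch under the same assumptions as the tables: Pr = prestige / 100, with the scores grouped by type; the variable names are illustrative.

```python
from math import log

# Prestige scores from Duncan's data, grouped by type of occupation
PRESTIGE = {
    "prof": [82, 83, 90, 76, 90, 87, 93, 90, 88, 57, 89, 97, 59, 73, 76, 81, 45, 92],
    "wc": [52, 38, 39, 34, 41, 16],
    "bc": [33, 53, 67, 57, 26, 29, 10, 15, 19, 10, 13, 24, 20, 7, 3, 16, 6, 11, 8, 41, 10],
}

# Transform every score to logit(prestige / 100)
logits = {
    kind: [log((y / 100) / (1 - y / 100)) for y in scores]
    for kind, scores in PRESTIGE.items()
}

n_obs = sum(len(values) for values in logits.values())
grand_mean = sum(sum(values) for values in logits.values()) / n_obs
group_means = {kind: sum(values) / len(values) for kind, values in logits.items()}

# RegSS: group mean minus overall mean, squared, once per occupation
regss = sum(
    len(values) * (group_means[kind] - grand_mean) ** 2
    for kind, values in logits.items()
)
# RSS: individual logit minus its group mean, squared
rss = sum(
    (x - group_means[kind]) ** 2
    for kind, values in logits.items()
    for x in values
)

print(round(grand_mean, 5), {k: round(m, 4) for k, m in group_means.items()})
print(round(regss, 3), round(rss, 3))
```

The two totals correspond to the Groups and Residuals sums of squares in Fox’s analysis of variance table.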

The total row in the table forms the analysis of variance. Fox shows the following table:

| Source | Sum of Squares | Degrees of Freedom | Mean Square | F | p |
|---|---|---|---|---|---|
| Groups | 95.550 | 2 | 47.775 | 51.98 | << 0.001 |
| Residuals | 38.604 | 42 | 0.919 | | |
| Total | 134.154 | 44 | | | |

The mean square is the sum of squares divided by the degrees of freedom.

95.550 / 2 = 47.775; 38.604 / 42 = 0.9191.

The F-statistic is the mean square for the groups divided by the mean square of the residuals.

47.775 / 0.9191 = 51.98

The $R^2$ is the sum of squares for the groups (RegSS) divided by the total sum of squares (TSS).

95.550 / 134.154 = 71.22%

Jacob: How do we get the degrees of freedom?

Rachel: 45 data points minus 1 parameter (the mean) = 44 degrees of freedom for the total sum of squares.

Three occupation types (groups) minus one relation = 2 degrees of freedom for the groups. The relation is that an occupation is either professional, white collar, or blue collar.

Degrees of freedom for TSS – degrees of freedom for RegSS = degrees of freedom for RSS: 44 – 2 = 42.
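The degrees of freedom, mean squares, F-statistic, and $R^2$ all follow from the two sums of squares and the counts of observations and groups, which is the form the final exam problems take. The function below is an illustrative sketch; its name and the layout of its return value are assumptions made for this example.

```python
def one_way_anova(regss, rss, n_obs, n_groups):
    """Build the one-way ANOVA summary from RegSS, RSS, and the counts."""
    df_groups = n_groups - 1           # groups minus one relation
    df_total = n_obs - 1               # observations minus the overall mean
    df_resid = df_total - df_groups    # what is left for the residuals
    ms_groups = regss / df_groups
    ms_resid = rss / df_resid
    return {
        "df": (df_groups, df_resid, df_total),
        "mean_squares": (ms_groups, ms_resid),
        "F": ms_groups / ms_resid,
        "R2": regss / (regss + rss),   # RegSS / TSS, since TSS = RegSS + RSS
    }

# Sums of squares for logit(prestige / 100) from Fox's table
summary = one_way_anova(95.550, 38.604, n_obs=45, n_groups=3)
print(summary["df"], round(summary["F"], 2), round(summary["R2"], 4))
```

The p-value is not computed here; it would require the F distribution with 2 and 42 degrees of freedom.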

Edited 11 Years Ago by NEAS
scomurphy
Going from the occupational table to the type of occupation table, why is the sum of the residual sum of squares used as the RegSS for the group mean square, and on the other side the RegSS of the occupational table is used as the RSS for the Residual Mean Square?

[NEAS: Thank you for noticing the typo: the RSS and RegSS column headings were reversed on one of the exhibits, though all the figures were correct and the final F-statistic was correctly computed.

The regression sum of squares is the sum of the squares of (the group mean minus the overall mean).

The residual sum of squares is the sum of the squares of (the individual value minus the group mean).

The post has been corrected and re-posted.]
