NEAS


Group: Administrators
Posts: 4.3K,
Visits: 1.3K

Module 14: Modeling interactions (The attached PDF file has better formatting.) Homework assignment: rating territories and mileage This homework assignment continues the exercise in Module 13. An insurer examines claim frequencies for 15 territories: 5 urban, 5 suburban, and 5 rural. The insurer also has the average miles driven per car in each territory (in thousands). If you live outside the U.S., replace mileage with thousands of kilometers. Urban  Suburban  Rural  Territory  Mileage  Claim Freq’y  Territory  Mileage  Claim Freq’y  Territory  Mileage  Claim Freq’y  1  5  8.45%  6  20  6.99%  11  10  3.83%  2  10  10.90%  7  40  12.94%  12  20  5.06%  3  15  13.45%  8  60  19.01%  13  30  6.00%  4  20  16.04%  9  80  25.06%  14  40  6.94%  5  25  18.49%  10  100  31.11%  15  50  7.92% 
How many dummy variables does this regression use? What are the values of the dummy variables for urban, suburban, and rural? Assume rural is the base territory, with dummy variables equal to zero. Write the regression equation with all interactions. You should have six terms. Use Excel or other statistical software to run the regression. What are the values of the six regression parameters? The claim frequencies are chosen so that the standard error of the regression is small. The observed values are very close to the fitted values, so you can tell if your solution is right. Jacob: Do we have separate dummy variables for each territory? Rachel: This homework assignment replicates the scenario in the textbook. It deals with three regions: urban, suburban, and rural, not 15 separate territories. The territories within each region just differ by average distance driven. The solution has three intercepts and three slopes, giving six regression parameters. Look at the slopes first. The exercise says that the stochasticity of the observed values is small. The claim frequency increases about 2.5 percentage points for each five units of mileage in the urban region, about 6 percentage points for each 20 units of mileage in the suburban region, and about 1 percentage point for each 10 units of mileage in the rural region. The intercepts (where mileage = 0) also differ by region; they are about 6 percentage points in the urban region, about 1 percentage point in the suburban region, and about 3 percentage points in the rural region. Casual observation shows the formulas: claim frequency is 6% + 5% × mileage for urban; 3% + 1% × mileage for rural; 1% + 3% × mileage for suburban. The homework assignment has you solve for the precise figure using Excel (or R or SAS or Mathlab). Rural is the base, so the rural intercept and slope applies to urban and suburban as well. But urban and suburban (the two dummy variables) has different intercepts and slopes. Excel shows the differences are additions or subtractions to the intercept and slopes: +3 and –2 for the intercepts and +4 and +2 for the slopes. For the homework assignment, set up the equations and solve them in Excel. The answers differ from the round numbers above by small amounts, and the p values are all significant at the 0.1% level.



oliver


Group: Forum Members
Posts: 3,
Visits: 1

Can someone provide instructions on how to do this regression in Excel? I've never used the statistics package and am unsure how to get the values of the regression parameters. Thanks!!! [NEAS: The attached PDF file should get you started.]



wangxy


Group: Forum Members
Posts: 7,
Visits: 1

Which addin funtions did you guys use for dummy variables? I downloaded a couple of addins for the dummy variable, non of them worked well. [NEAS: See next post. Code the dummy variable as 0 or 1.]



CalLadyQED


Group: Forum Members
Posts: 62,
Visits: 2

For Excel? You have to add in the Analysis ToolPak. Then go to Data> Anaylsis> Data Anaylsis > Regression > OK. Select the column with your Yi's for "Input Y Range" and the column(s) with your 1s and 0s for "Input X Range." If you include the headers (e.g. D1, D2, Y, etc.) select "Labels". Usually easiest to choose "New Worksheet Ply:" for the output. Click "OK." [NEAS: Correct. The analysis tookpak will help you with many actuarial tasks, not just this course. It does regressions and provides much output. For instructions on loading the Analyis Tookpak for Excel 2010, see the HW discussion thread for Module 13.]



wangxy


Group: Forum Members
Posts: 7,
Visits: 1

Am I misunderstanding something? For this module 14, I thought we got the Xi values as the milleage driven, and Yi values as claim frequency, therefore, my regression model is Y=a+ b1*D1+ b2*D2+ c*X+ c1*D1*X+ c2*D2*X which would give me different intercepts and slopes for each category. If I just put Yi's for "Input Y Range" and the column(s) with 1s and 0s for "Input X Range." this only give me a model the looks like Y=a+ b1*D1+ b2*D2 which was the answer for Module 13 homework. Please help. P.S. The only way I think is easy enough to do is to model the three categories seperately then calculate each coefficients by hand. Am I correct? or I misunderstood what the question is asking. Thank you. [NEAS: Yes, the difference is whether the slopes now differ by category. Excel with the Regression addin will do the modeling.]



CalLadyQED


Group: Forum Members
Posts: 62,
Visits: 2

Did anyone else get this answer for part (D)? Is this correct? alpha = 0.02932 (intercept) gamma1 = 0.02968 (coefficient on urban dummy) gamma2 = 0.02018 (coefficient on suburban dummy) beta = 0.001006 (coefficient on mileage) delta1 = 0.004038 (coefficient on mileageurban interaction) delta2 = 0.002012 (coefficient on mileagesuburban interaction)
[NEAS: We normally do not verify solutions to homework assignments. This assignment has confused some candidates, so we provide more guidance in our comments; see the JacobRachel dialogue below. This solution is correct; make sure you can set up the variables so that Excel gives this solution.]



rcoffman


Group: Forum Members
Posts: 10,
Visits: 1

CalLadyQED: I got the same answers as you. [NEAS: This homework assignment asks you to do a multiple regression in Excel using dummy variables. This problem is simple; use the Analysis Tookpak to check your work.]



bubba gump


Group: Forum Members
Posts: 6,
Visits: 1




CalLadyQED


Group: Forum Members
Posts: 62,
Visits: 2

The main idea of this module is interactions, so you can bet that they need to be included in the model.



Pandia


Group: Forum Members
Posts: 1,
Visits: 1

The method for doing this in excel is hinted in the second footnote at the bottom of page 132 where they say think of a new variable Z=XD.
To set this up, enter in your X and Y values as given. Create your Dummy variable(s), then create another column Z=DX and set its values equal to X*D.
EG. Given y=a + gD + bX + dDX with 4 data points, you would have
Y X D DX 1 2 1 2 2 5 1 5 3 6 0 0 4 7 0 0


