The following data were obtained in a study of the relation between diastolic blood pressure and age for boys 5 to 13 years old:
i: 1 2 3 4 5 6 7 8
x: 5 8 11 7 13 12 12 6
y: 63 67 74 64 75 69 90 60
a) Assuming normal error regression model (2.1) is appropriate, obtain the estimated regression function and plot the residuals ei against Xi. What does your residual plot show? (Please show R code.)
b) Omit case 7 from the data and obtain the estimated regression function based on the remaining seven cases. Compare the estimated regression function to that obtained in part a. What can you conclude about the effect of case 7?
c) Using your fitted regression function in part b, obtain a 99 percent prediction interval for a new Y observation at X = 12. Does observation Y7 fall outside this prediction interval? What is the significance of this?
Solution
The equation of the regression line is:
y = 48.667 + 2.333x
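[Figure: graph of the regression line]
Step 1: For each case, compute XY and X² (listed in the table at the end of this solution).
Step 2: Find the sum of every column:
ΣX = 74, ΣY = 562, ΣXY = 5356, ΣX² = 752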
Step 3: Use the following equations to find a and b:
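$$a = \frac{\sum Y \sum X^2 - \sum X \sum XY}{n \sum X^2 - (\sum X)^2} = \frac{562 \cdot 752 - 74 \cdot 5356}{8 \cdot 752 - 74^2} \approx 48.667$$

$$b = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - (\sum X)^2} = \frac{8 \cdot 5356 - 74 \cdot 562}{8 \cdot 752 - (74)^2} \approx 2.333$$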
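The arithmetic in Steps 2 and 3 can be checked with a short script, which also produces the residuals ei that part a asks for. The question requests R; this is a plain-Python sketch instead, and the helper name `fit_line` is made up for illustration. In R, `lm(y ~ x)` should give the same coefficients and `plot(x, resid(lm(y ~ x)))` the residual plot.

```python
# Data from the problem statement (cases 1..8).
x = [5, 8, 11, 7, 13, 12, 12, 6]
y = [63, 67, 74, 64, 75, 69, 90, 60]


def fit_line(xs, ys):
    """Return (a, b) for y = a + b*x using the column-sum formulas of Step 3.

    Hypothetical helper, named here only for illustration.
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(xi * yi for xi, yi in zip(xs, ys))
    sxx = sum(xi * xi for xi in xs)
    denom = n * sxx - sx ** 2
    b = (n * sxy - sx * sy) / denom
    a = (sy * sxx - sx * sxy) / denom
    return a, b


a, b = fit_line(x, y)

# Residual for each case: observed Y minus fitted value.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"a = {a:.3f}, b = {b:.3f}")
for i, (xi, ei) in enumerate(zip(x, residuals), start=1):
    print(f"case {i}: X = {xi:2d}, e = {ei:7.3f}")
```

The residual for case 7 (X = 12, Y = 90) is about 13.33, far larger in magnitude than any other, so in a plot of ei against Xi that point stands well apart from the rest.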
-----------------------------------------------------------------------------------
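b) If case 7 (X = 12, Y = 90) is omitted, the remaining seven cases give:
ΣX = 62, ΣY = 472, ΣXY = 4276, ΣX² = 608
b = (7(4276) − 62(472)) / (7(608) − 62²) = 668/412 = 1.621
a = (472 − 1.621(62)) / 7 = 53.068
y = 53.068 + 1.621x
Compared with part a, the slope falls from 2.333 to 1.621 and the intercept rises from 48.667 to 53.068, so case 7 has a strong influence on the fitted line.
------------------------------------------
c) When x = 12, y = 53.068 + 1.621(12)
= 72.524
SSE = 34.990, MSE = 34.990/5 = 6.998
s{pred} = sqrt(6.998[1 + 1/7 + (12 − 8.857)²/58.857]) = 3.029
t(0.995; 5) = 4.032
99% prediction interval: 72.524 ± 4.032(3.029) = (60.31, 84.74)
Y7 = 90 lies outside the interval, so case 7 is not consistent with the regression relation fitted to the other seven cases and can be regarded as an outlier.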
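The refit without case 7 and the 99 percent prediction interval at X = 12 can be sketched as follows. This is a plain-Python illustration rather than the R code the question asks for; the critical value `T_CRIT = 4.032` is taken from a t table for 5 degrees of freedom and hard-coded as an assumption, and all variable names are illustrative.

```python
import math

x = [5, 8, 11, 7, 13, 12, 12, 6]
y = [63, 67, 74, 64, 75, 69, 90, 60]

# Drop case 7 (list index 6): X = 12, Y = 90.
x7 = x[:6] + x[7:]
y7 = y[:6] + y[7:]

n = len(x7)
x_bar = sum(x7) / n
y_bar = sum(y7) / n
sxx = sum((xi - x_bar) ** 2 for xi in x7)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x7, y7))

# Least-squares slope and intercept on the seven remaining cases.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Error mean square with n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x7, y7))
mse = sse / (n - 2)

# Prediction of a single new observation at X = 12.
x_new = 12
y_hat = b0 + b1 * x_new
s_pred = math.sqrt(mse * (1 + 1 / n + (x_new - x_bar) ** 2 / sxx))

T_CRIT = 4.032  # assumption: t(0.995; 5) read from a t table
lower = y_hat - T_CRIT * s_pred
upper = y_hat + T_CRIT * s_pred

print(f"fit: y = {b0:.3f} + {b1:.3f}x")
print(f"99% prediction interval at X = 12: ({lower:.2f}, {upper:.2f})")
print("Y7 = 90 inside interval:", lower <= y[6] <= upper)
```

Hard-coding the critical value keeps the sketch free of third-party dependencies; with a statistics library available, the quantile could be computed instead of looked up.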
Table of products and squares for part a:
| X | Y | XY | X² |
| 5 | 63 | 315 | 25 |
| 8 | 67 | 536 | 64 |
| 11 | 74 | 814 | 121 |
| 7 | 64 | 448 | 49 |
| 13 | 75 | 975 | 169 |
| 12 | 69 | 828 | 144 |
| 12 | 90 | 1080 | 144 |
| 6 | 60 | 360 | 36 |

