1. In an article in Statistics and Computing, Carlin and Gelfand investigated the age (x) and length (y) of 27 captured dugongs (sea cows).
x =1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5, 10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5
y =1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19, 2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57
(a) Find the least squares estimates of the slope and the intercept in the simple linear regression model. Find an estimate of σ². Hint: Use R to load the data and the regression function lm to find the least squares estimates and an estimate of σ². The R code is listed below:
x=c(1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5, 10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5)
y=c(1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19,2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57)
fit=lm(y~x)
summary(fit)
(b) Estimate the mean length of dugongs at age 11.
(c) Based on the R output, test the hypotheses H0: β1 = 0 versus H1: β1 ≠ 0.
(d) What is the coefficient of determination, R²? (Hint: in R, it is called “Multiple R-squared”.)
(e) What is the correlation between the response (y) and predictor (x) variables? R code: cor(x, y)
Solution
x=c(1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5, 10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5)
y=c(1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19,2.26, 2.40, 2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56, 2.70, 2.72, 2.57)
fit=lm(y~x)
summary(fit)
matplot(x,cbind(y,predict(fit)),type="o")
The R code is given above. The matplot call also plots the observed and the fitted values of y against x.
a)
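The least squares estimate of the slope is 0.028718.
The least squares estimate of the intercept is 2.019769.
The estimate of σ² is the mean squared error, i.e. the square of the residual standard error reported by summary(fit): about 0.159² ≈ 0.0253, on 25 degrees of freedom.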
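As a cross-check, the estimate of σ² can be computed from the residuals of the fitted model. This is a minimal sketch assuming the same x and y as in the problem; the names sse and sigma2_hat are illustrative and not part of the original solution.

```r
x <- c(1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5,
       10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5)
y <- c(1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19, 2.26, 2.40,
       2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56,
       2.70, 2.72, 2.57)
fit <- lm(y ~ x)

# Residual sum of squares divided by its degrees of freedom (n - 2 = 25)
sse <- sum(resid(fit)^2)
sigma2_hat <- sse / df.residual(fit)

# The same quantity is the square of the "Residual standard error"
sigma2_hat            # approximately 0.0253
summary(fit)$sigma^2  # same value
```

Note that sd(y), about 0.27, measures the spread of y around its mean and is not an estimate of σ².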
b)
The fitted regression equation of y on x is
y = 2.019769 + 0.028718*x
For x = 11,
the estimated mean length is 2.019769 + 0.028718*11 ≈ 2.336.
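The same estimate can be obtained with predict instead of substituting by hand. The sketch below assumes the data from the problem; pred and b are illustrative names.

```r
x <- c(1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5,
       10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5)
y <- c(1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19, 2.26, 2.40,
       2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56,
       2.70, 2.72, 2.57)
fit <- lm(y ~ x)

# Point estimate of the mean length at age 11
pred <- predict(fit, newdata = data.frame(x = 11))
pred  # approximately 2.336

# Same value computed directly from the coefficients
b <- coef(fit)
b[["(Intercept)"]] + b[["x"]] * 11
```

Adding interval = "confidence" to the predict call would also return a confidence interval for the mean length, which the question does not ask for.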
d)
The coefficient of determination is the proportion of the variance in the dependent variable that is predictable from the independent variable, i.e. the share of the variation in the observed values that is explained by the regression model.
Here the Multiple R-squared is R² ≈ 0.677 (0.6642 is the Adjusted R-squared), so about 67.7% of the total variation is explained by the regression model.
e)
The correlation between the response and the predictor is cor(x, y) ≈ 0.823, which is the positive square root of R² because the slope is positive.
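Parts d) and e) are linked: in simple linear regression the squared correlation equals the Multiple R-squared. The sketch below, assuming the data from the problem, pulls both R-squared values out of the summary and checks that identity; s, r2, adj_r2 and r are illustrative names.

```r
x <- c(1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5,
       10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5)
y <- c(1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19, 2.26, 2.40,
       2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56,
       2.70, 2.72, 2.57)
fit <- lm(y ~ x)
s <- summary(fit)

r2     <- s$r.squared      # "Multiple R-squared", approximately 0.677
adj_r2 <- s$adj.r.squared  # "Adjusted R-squared", approximately 0.664
r      <- cor(x, y)        # approximately 0.823

# With a single predictor, the squared correlation is the multiple R-squared
all.equal(r^2, r2)
```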
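c) For testing H0: β1 = 0 versus H1: β1 ≠ 0:
The p-value corresponding to the slope is
1.378e-07
Since this is far below any usual significance level (e.g. 0.05), we reject H0 and conclude that length has a significant linear relationship with age.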
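The t statistic and p-value for the slope can be extracted from the coefficient table rather than read off the printed summary. This is a sketch assuming the data from the problem; coefs, t_stat, p_val and p_manual are illustrative names, and the 5% level is an assumed significance level.

```r
x <- c(1.0, 1.5, 1.5, 1.5, 2.5, 4.0, 5.0, 5.0, 7.0, 8.0, 8.5, 9.0, 9.5, 9.5,
       10.0, 12.0, 12.0, 13.0, 13.0, 14.5, 15.5, 15.5, 16.5, 17.0, 22.5, 29.0, 31.5)
y <- c(1.80, 1.85, 1.87, 1.77, 2.02, 2.27, 2.15, 2.26, 2.47, 2.19, 2.26, 2.40,
       2.39, 2.41, 2.50, 2.32, 2.32, 2.43, 2.47, 2.56, 2.65, 2.47, 2.64, 2.56,
       2.70, 2.72, 2.57)
fit <- lm(y ~ x)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coefs  <- coef(summary(fit))
t_stat <- coefs["x", "t value"]
p_val  <- coefs["x", "Pr(>|t|)"]

# Two-sided p-value recomputed from the t distribution with n - 2 df
p_manual <- 2 * pt(abs(t_stat), df = df.residual(fit), lower.tail = FALSE)

p_val < 0.05  # TRUE, so H0 is rejected at the 5% level
```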

