The following data represent the age (in years) of various people and the number of days per week they exercise. We are interested in doing a regression analysis on this data to see if age affects how many days per week someone works out. Please show your work for your answers.
Age Days they Exercise
24 4
18 6
29 6
17 4
61 5
51 2
30 6
24 4
18 2
21 1
a. Which one is the independent variable and which one is the dependent variable?
b. Find b1.
c. Find b0.
d. Find SST.
e. Find SSR.
f. Find SSE.
g. Find the coefficient of determination.
h. Find the correlation coefficient.
i. Find s2.
j. Find the test statistic for testing if b1 is significant or not.
k. What conclusion would you make based on the test statistic found above?
l. Find the 95% confidence interval for B1.
m. Assuming a person is 40 years old, how many days per week are they expected/predicted to exercise?
n. Assuming a person is 40 years old, what is the 90% confidence interval for the expected number of days of exercise for them?
o. Assuming a person is 40 years old, what is the 90% prediction interval for the expected number of days of exercise for them?
Solution
The independent variable X is age,
and the dependent variable Y is the number of days per week they exercise, because we are asking whether age affects how often someone works out.
We can obtain the regression output using MINITAB.
Steps:
STAT --> Regression --> Regression --> Response: Y --> Predictors: X --> Result: second option --> OK
This gives the following output.
Regression Analysis: Y versus X
The regression equation is
 Y = 3.80 + 0.0070 X
 Predictor   Coef      SE Coef   T      P
 Constant    3.796     1.404     2.70   0.027
 X           0.00697   0.04314   0.16   0.876
 S = 1.93334   R-Sq = 0.3%   R-Sq(adj) = 0.0%
 Analysis of Variance
 Source           DF   SS       MS      F      P
 Regression        1    0.098   0.098   0.03   0.876
 Residual Error    8   29.902   3.738
 Total             9   30.000
b1 = slope = coefficient of X = 0.0070
b0 = constant = 3.80
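As a cross-check on the MINITAB coefficients, the slope and intercept can be sketched by hand from the least-squares formulas b1 = SSxy / SSx and b0 = ybar - b1 * xbar. The variable names below are illustrative and do not come from the original solution.

```python
# Data from the problem statement.
ages = [24, 18, 29, 17, 61, 51, 30, 24, 18, 21]
days = [4, 6, 6, 4, 5, 2, 6, 4, 2, 1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(days) / n

# Corrected sums of squares and cross-products.
ss_x = sum((x - x_bar) ** 2 for x in ages)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, days))

# Least-squares slope and intercept.
b1 = ss_xy / ss_x
b0 = y_bar - b1 * x_bar

print(f"SSx = {ss_x:.1f}, SSxy = {ss_xy:.1f}")
print(f"b1 = {b1:.6f}, b0 = {b0:.6f}")
```

The printed values should agree with the Coef column of the output up to rounding.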
SST = total sum of squares = 30.000
SSR = regression sum of squares = 0.098
SSE = error sum of squares = 29.902
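coefficient of determination (R2) = SSR / SST = 0.098 / 30.000 = 0.0033 (reported as 0.3% in the output)
correlation coefficient (r) = sqrt(R2) = sqrt(0.0033) = 0.057, taken as positive because the slope b1 is positive
s2 = MSE = SSE / (n - 2) = 29.902 / 8 = 3.738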
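The sums of squares, R2, and r can also be recomputed directly from the data, which avoids the rounding in the printed output. This is a minimal sketch with illustrative variable names; it takes the sign of r from the sign of the slope.

```python
import math

# Data from the problem statement.
ages = [24, 18, 29, 17, 61, 51, 30, 24, 18, 21]
days = [4, 6, 6, 4, 5, 2, 6, 4, 2, 1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(days) / n
ss_x = sum((x - x_bar) ** 2 for x in ages)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, days))
b1 = ss_xy / ss_x
b0 = y_bar - b1 * x_bar

# SST: total variation of y around its mean.
sst = sum((y - y_bar) ** 2 for y in days)
# SSE: variation left over around the fitted line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, days))
# SSR: variation explained by the regression.
ssr = sst - sse

r2 = ssr / sst
# r carries the sign of the slope.
r = math.copysign(math.sqrt(r2), b1)

print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"R2 = {r2:.4f}, r = {r:.4f}")
```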
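The hypotheses are
H0: B1 = 0 vs. H1: B1 not= 0
The test statistic is t = b1 / SE(b1) = 0.00697 / 0.04314 = 0.16, with n - 2 = 8 degrees of freedom.
P-value = 0.876
Alpha = 0.05
p-value > alpha
Fail to reject H0 at the 5% level of significance.
Conclusion: there is not sufficient evidence that the slope differs from 0, so these data do not show that age affects the number of days per week a person exercises.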
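The test statistic can be reproduced from s2 and SSx, since SE(b1) = sqrt(s2 / SSx). The sketch below uses only the standard library, so it compares |t| with the two-sided 5% critical value 2.3060 (8 degrees of freedom) instead of computing a p-value. Variable names are illustrative.

```python
import math

# Data from the problem statement.
ages = [24, 18, 29, 17, 61, 51, 30, 24, 18, 21]
days = [4, 6, 6, 4, 5, 2, 6, 4, 2, 1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(days) / n
ss_x = sum((x - x_bar) ** 2 for x in ages)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, days))
b1 = ss_xy / ss_x
b0 = y_bar - b1 * x_bar

# s2 (MSE) is SSE divided by its n - 2 degrees of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, days))
s2 = sse / (n - 2)

# Standard error of the slope and the t statistic for H0: B1 = 0.
se_b1 = math.sqrt(s2 / ss_x)
t_stat = b1 / se_b1

# Two-sided critical value for alpha = 0.05 with 8 df (t-table value).
t_crit = 2.3060
reject_h0 = abs(t_stat) > t_crit

print(f"s2 = {s2:.4f}, SE(b1) = {se_b1:.5f}, t = {t_stat:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```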
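SEb = standard error of b1 = 0.04314 (the SE Coef for X in the output)
critical value (tc) = 2.3060 (t with n - 2 = 8 degrees of freedom, 95% confidence)
The confidence interval for B1 is b - E < B1 < b + E
where E is the margin of error.
E = tc * SEb = 2.3060 * 0.04314 = 0.0995
lower limit = b - E = 0.0070 - 0.0995 = -0.0925
upper limit = b + E = 0.0070 + 0.0995 = 0.1065
The interval contains 0, which agrees with the result of the test.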
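The 95% interval for the slope can be checked with a short sketch that uses b1 plus or minus tc * SE(b1), where SE(b1) = sqrt(s2 / SSx) and tc = 2.3060 is the t-table value for 8 degrees of freedom. Variable names are illustrative.

```python
import math

# Data from the problem statement.
ages = [24, 18, 29, 17, 61, 51, 30, 24, 18, 21]
days = [4, 6, 6, 4, 5, 2, 6, 4, 2, 1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(days) / n
ss_x = sum((x - x_bar) ** 2 for x in ages)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, days))
b1 = ss_xy / ss_x
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, days))
se_b1 = math.sqrt(sse / (n - 2) / ss_x)

# t critical value for 95% confidence with 8 df (t-table value).
t_crit = 2.3060
margin = t_crit * se_b1
lower = b1 - margin
upper = b1 + margin

print(f"E = {margin:.4f}")
print(f"95% CI for B1: ({lower:.4f}, {upper:.4f})")
```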
assume that x = 40,
y = 3.80 + 0.0070* x
y = 3.80 + (0.0070*40) = 4.08
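Assuming a person is 40 years old, what is the 90% confidence interval for the expected number of days of exercise for them?
When x = 40, Yp = 4.08.
critical value (tc) = 1.8595 (t with n - 2 = 8 degrees of freedom, 90% confidence)
se = sqrt[ (SSy - b*SSxy) / (n - 2) ] = sqrt[ (30 - 0.006972*14) / 8 ] = 1.9333
(x - xbar)^2 / SSx = (40 - 29.3)^2 / 2008.1 = 0.0570
The 90% confidence interval for the mean of y at x = 40 is
Yp - E < mean of y < Yp + E
where E = tc * se * sqrt[ 1/n + (x - xbar)^2 / SSx ]
E = 1.8595 * 1.9333 * sqrt(0.1 + 0.0570) = 1.425
lower limit = Yp - E = 4.08 - 1.425 = 2.655
upper limit = Yp + E = 4.08 + 1.425 = 5.505
Assuming a person is 40 years old, what is the 90% prediction interval for the number of days of exercise for them?
The 90% prediction interval for an individual y at x = 40 is
Yp - E < y < Yp + E
where E = tc * se * sqrt[ 1 + 1/n + (x - xbar)^2 / SSx ]
E = 1.8595 * 1.9333 * sqrt(1 + 0.1 + 0.0570) = 3.867
lower limit = Yp - E = 4.08 - 3.867 = 0.213
upper limit = Yp + E = 4.08 + 3.867 = 7.947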
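Both 90% intervals at x = 40 can be sketched as follows. The confidence interval for the mean response uses sqrt[1/n + (x - xbar)^2/SSx], and the prediction interval for an individual adds 1 under the square root. The critical value 1.8595 is an assumed t-table value (8 degrees of freedom, 5% in each tail) and the variable names are illustrative. The code uses unrounded coefficients, so its point estimate is about 4.07, not the 4.08 obtained from the rounded equation.

```python
import math

# Data from the problem statement.
ages = [24, 18, 29, 17, 61, 51, 30, 24, 18, 21]
days = [4, 6, 6, 4, 5, 2, 6, 4, 2, 1]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(days) / n
ss_x = sum((x - x_bar) ** 2 for x in ages)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, days))
b1 = ss_xy / ss_x
b0 = y_bar - b1 * x_bar

# Standard error of the estimate.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ages, days))
se = math.sqrt(sse / (n - 2))

x0 = 40
y_hat = b0 + b1 * x0

# Assumed t-table value: 90% confidence, 8 degrees of freedom.
t_crit_90 = 1.8595

# Shared term: 1/n + (x0 - xbar)^2 / SSx.
leverage = 1 / n + (x0 - x_bar) ** 2 / ss_x

ci_margin = t_crit_90 * se * math.sqrt(leverage)      # mean response
pi_margin = t_crit_90 * se * math.sqrt(1 + leverage)  # individual response

print(f"y_hat = {y_hat:.4f}")
print(f"90% CI: ({y_hat - ci_margin:.3f}, {y_hat + ci_margin:.3f})")
print(f"90% PI: ({y_hat - pi_margin:.3f}, {y_hat + pi_margin:.3f})")
```

The prediction interval is always wider than the confidence interval because it also accounts for the scatter of an individual observation around the line.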
| n | 10 |
| xbar | 29.3 |
| ybar | 4 |
| SSx | 2008.1 |
| SSy | 30 |
| SSxy | 14 |
| b | 0.006972 |
| a | 3.795727 |
| se^2 | 3.737799 |
| se | 1.933339 |



