several variables on total winnings of 100 randomly selected
Solution
a) At alpha = 0.05 the significant variables are the ones with p value < 0.05:
DriveAcc (p value = 0.0081), GreensReg (p value = 8.78 * 10^-8), AvgNumPutts (p value = 0.0053)
Tot Winnings = 18,021.42 - 91.37 DriveAcc + 290.32 GreensReg - 13,745.68 AvgNumPutts
b) Tot Winnings = 18,021.42 - 91.37 * 64.1 + 290.32 * 65.1 - 13,745.68 * 1.749
= 7.023 ($000)
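The substitution in part b) can be checked with a short Python sketch. The coefficients and predictor values are taken from the lines above; the function and variable names are illustrative only.

```python
# Coefficients of the reduced model from part a)
INTERCEPT = 18021.42
COEFFS = {"DriveAcc": -91.37, "GreensReg": 290.32, "AvgNumPutts": -13745.68}

def predict_winnings(values):
    """Return the fitted Tot Winnings for a dict of predictor values."""
    return INTERCEPT + sum(COEFFS[name] * values[name] for name in COEFFS)

# Predictor values used in part b)
golfer = {"DriveAcc": 64.1, "GreensReg": 65.1, "AvgNumPutts": 1.749}
y_hat = predict_winnings(golfer)
print(round(y_hat, 2))
```

The result is about 7,023, which is the 7.023 (000) carried into part c).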
c) alpha = 0.05
y predicted = 7.023
Std Error = 1058.283
n = 100; df = n - (k+1) = 100 - ( 7 + 1 ) = 92
t crit = 1.986
Prediction interval = y predicted ± t crit * std error
= 7.023 (000) ± 1.986 * 1058.283
= 7,023.2 ± 2,101.8
= (4921.5 , 9125.0)
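As a rough check of part c), the interval can be recomputed in Python. This is a sketch that takes the critical value 1.986 and the standard error 1058.283 as given in the solution rather than deriving them; the function name is illustrative.

```python
def prediction_interval(y_hat, t_crit, std_error):
    """Return (lower, upper) for y_hat plus or minus t_crit * std_error."""
    margin = t_crit * std_error
    return y_hat - margin, y_hat + margin

n, k = 100, 7          # sample size and number of predictors in the full model
df = n - (k + 1)       # residual degrees of freedom

# 7023.24 is the unrounded prediction from part b)
lower, upper = prediction_interval(7023.24, 1.986, 1058.283)
print(df, round(lower, 1), round(upper, 1))
```

The interval runs from roughly 4,921 to 9,125, centred on the predicted value.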
d) SavPct
Std Error = 21.63
t crit for alpha = 0.05 and df = n - (k+1) = 100 - ( 7 + 1 ) = 92 is ± 1.986
coeff = 32.03
Confidence interval = 32.03 ± 1.986 * 21.63
= (-10.9 , 75.0)
Since the confidence interval contains 0, SavPct is not significant at alpha = 0.05
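The interval in part d) can be sketched the same way. The critical value is passed in as a parameter; 1.986 is used here on the assumption that the residual degrees of freedom are the 92 found in part c). Function names are illustrative.

```python
def coefficient_ci(coef, std_error, t_crit):
    """Return (lower, upper) confidence limits for a regression coefficient."""
    margin = t_crit * std_error
    return coef - margin, coef + margin

def contains_zero(interval):
    """True when zero lies inside the interval, i.e. the coefficient is not significant."""
    lower, upper = interval
    return lower <= 0.0 <= upper

# SavPct coefficient and standard error from part d); t crit assumed for df = 92
ci = coefficient_ci(32.03, 21.63, 1.986)
print(round(ci[0], 1), round(ci[1], 1), contains_zero(ci))
```

Because the margin (about 43) is larger than the coefficient itself (32.03), the interval spans zero whichever of the nearby critical values is used.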
e) True - Since the p value of the overall F test is less than alpha, we reject the null hypothesis that all the slope coefficients are zero, so at least one independent variable is significant
f) R2 is the % of the variance in the dependent variable that is accounted for by the independent variables.
Adjusted R2 has been adjusted for the number of predictors in the model.
If one variable is added, R2 will not decrease: it stays the same or increases. Adjusted R2 increases only if the new term improves the model more than would be expected by chance.
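To make the adjustment concrete, here is a small sketch of the usual formula, adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1). It uses R2 = 0.435 from part g) with n = 100 and k = 7; the function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

adj = adjusted_r2(0.435, n=100, k=7)
print(round(adj, 3))

# Same R2 with one extra predictor: the penalty grows, so adjusted R2 falls
adj_extra = adjusted_r2(0.435, n=100, k=8)
print(adj_extra < adj)
```

The second call illustrates the point above: unless R2 rises enough, an added predictor lowers adjusted R2.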
g) 43.5% Since R2 = 0.435
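h) Multicollinearity means that the independent variables are correlated with each other. It inflates the standard errors of the affected coefficients, so the individual estimates and their p values become unreliable; it does not by itself inflate R2.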
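No, it is not judged to be a problem here. Note that a significant overall model (p value < 0.05) does not by itself rule out multicollinearity; the correlations among the predictors are the direct check.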
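One common way to quantify multicollinearity, which the solution itself does not use, is the variance inflation factor, VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing predictor j on the other predictors. The sketch below assumes R2_j has already been obtained; the example inputs are hypothetical, not values from this data set.

```python
def vif(r2_j):
    """Variance inflation factor from the R2 of predictor j regressed on the others."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must be in [0, 1)")
    return 1.0 / (1.0 - r2_j)

# Hypothetical values for illustration only
print(vif(0.0))            # uncorrelated predictor: no inflation
print(round(vif(0.9), 1))  # strongly correlated predictor: large inflation
```

A VIF near 1 indicates little overlap with the other predictors; larger values mean the coefficient's variance is inflated by that factor.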
i) Using residual plots, you can assess whether the observed error (residuals) is consistent with stochastic error.
For this model the residuals are consistent with random error


