
The dataset NavyLaborHours.csv reflects information from 17 U.S. Navy hospitals at various sites around the world. The predictors are workload variables, that is, items that result in the need for personnel in a hospital. A brief description of the variables is as follows:

y = monthly labor hours

x.1 = average daily patient load

x.2 = monthly X-ray exposures

x.3 = eligible population in the area per 1000

x.4 = average length of patient's stay in days

a) (3 points) Provide and describe a scatterplot matrix. Are there any relationships between the response variable and the predictors? Are there any visible relationships between the predictors?

b) (2 points) Fit a least squares regression model using all variables. Based on the model utility F test, is there evidence that at least one variable is a significant predictor of monthly labor hours at the U.S. Navy hospitals? State the F test statistic, degrees of freedom, and p-value associated with the test. Assume a significance level of 0.05. Include R summary output.

c) (2 points) How well does the model fit the data? Give the R² and adjusted R² values. Which value makes more sense to report and why? Hint: Check out the R² section on pages 559-560 in the text.

d) (2 points) Give the estimated least squares regression equation. Please round your estimates to two decimal places.

e) (1 point) Provide a labor hours prediction for when: x.1 = average daily patient load = 94.39; x.2 = monthly X-ray exposures = 8461; x.3 = eligible population in the area per 1000 = 78.7; x.4 = average length of patients’ stay in days = 6.18.

f) (1 point) In part e) the observed labor hours for the above predictor values is 1243.90. How far off is the predicted value from the observed value? State the residual.

g) (2 points) Using the output from part b), check the p-values from the individual t tests on the slopes. Which individual variables are significant at the 0.05 significance level?

h) (3 points) Fit a new model after removing the explanatory variable with the largest non-significant p-value (only one at a time). Continue this process until you have all significant predictors. Take notice of how the p-values change as you remove each variable. What is your new model? Provide the R output for the model summary. Hint: Should reduce to only one explanatory variable.

i) (3 points) Interpret the slope of your model, include a 95% confidence interval for the slope, and interpret.

j) (2 points) Plot the residuals from the model. Are conditions satisfied? Briefly describe the plot.

k) (2 points) How well does the model fit the data? Give the R² and adjusted R². How do these compare to part c)?

l) (4 points) Provide a scatterplot with a title and the best fit line.

m) (1 point) Provide a monthly labor hours prediction for when (choose the variable that applies to your simple model from part h): x.1 = average daily patient load = 94.39; x.2 = monthly X-ray exposures = 8461; x.3 = eligible population in the area per 1000 = 78.7; x.4 = average length of patients’ stay in days = 6.18.

n) (1 point) In part m) the observed labor hours for the above predictor values is 1243.90. How far off is the predicted value from the observed value? State the residual.

o) (2 points) Calculate the confidence interval and prediction interval for the predicted monthly labor hours for the same values as in part m). Helpful R code:

predict(mod, data.frame(x.1 = 94.39, x.2 = 8461, x.3 = 78.7, x.4 = 6.18), interval = "confidence", level = 0.95)

predict(mod, data.frame(x.1 = 94.39, x.2 = 8461, x.3 = 78.7, x.4 = 6.18), interval = "prediction", level = 0.95)

# All you need to change in the code above is the name of your simple model. Here the name is "mod".

# This will compute a confidence/prediction interval for the response given the value of x that applies.

p) (4 points) Interpret both of the intervals from part o). How do they differ numerically and conceptually?
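The one-variable-at-a-time removal described in part h) can be sketched in R as follows. This is only an illustrative outline, not the assignment's required code: the helper name `backward_eliminate`, the file location, and the column names `y`, `x.1`, ..., `x.4` (taken from the `predict()` call in the helpful code) are assumptions, and the data-dependent part only runs if the CSV file is actually present.

```r
# Backward elimination sketch: repeatedly drop the predictor with the largest
# p-value above alpha, one variable at a time, refitting after each removal.
# The function name and arguments are hypothetical, chosen for illustration.
backward_eliminate <- function(dat, response = "y", alpha = 0.05) {
  predictors <- setdiff(names(dat), response)
  repeat {
    fit <- lm(reformulate(predictors, response = response), data = dat)
    coefs <- summary(fit)$coefficients
    keep <- rownames(coefs) != "(Intercept)"
    pvals <- coefs[keep, "Pr(>|t|)"]
    names(pvals) <- rownames(coefs)[keep]
    worst <- names(pvals)[which.max(pvals)]
    # Stop when every remaining slope is significant or one predictor is left.
    if (max(pvals) <= alpha || length(predictors) == 1) break
    cat("Dropping", worst, "with p-value", signif(max(pvals), 3), "\n")
    predictors <- setdiff(predictors, worst)
  }
  fit
}

csv_path <- "NavyLaborHours.csv"  # assumed to sit in the working directory
if (file.exists(csv_path)) {
  navy <- read.csv(csv_path)
  navy <- navy[, c("y", "x.1", "x.2", "x.3", "x.4")]  # assumed column names

  pairs(navy, main = "Navy labor hours: scatterplot matrix")  # part a)

  full <- lm(y ~ x.1 + x.2 + x.3 + x.4, data = navy)          # part b)
  print(summary(full))

  mod <- backward_eliminate(navy)                              # part h)
  print(summary(mod))
  print(confint(mod, level = 0.95))                            # part i)
}
```

The fitted object returned by the helper can be passed directly to `predict()` with `interval = "confidence"` or `interval = "prediction"` for the interval questions.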


b) The regression model is

y = a + b x1 + c x2 + d x3 + e x4

where

y = monthly labor hours

x1 = average daily patient load

x2 = monthly X-ray exposures

x3 = eligible population in the area per 1000

x4 = average length of patient's stay in days

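From the Excel regression output (see SUMMARY OUTPUT), the fitted regression is

y = 79.09 + 12.52 x1 + 0.038 x2 - 4.86 x3 + 18.44 x4

The model utility test gives F = 73.15 on 4 and 12 degrees of freedom, with p-value = 2.53E-08.

Since p < alpha = 0.05, the model is significant: at least one of the variables is a useful predictor of monthly labor hours.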
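As a cross-check, the F statistic and its p-value can be recomputed in R from the sums of squares and degrees of freedom in the ANOVA part of the SUMMARY OUTPUT table. This is a minimal sketch; the variable names are illustrative, and the numbers are copied from that table rather than refit from the data.

```r
# Values copied from the ANOVA block of the Excel SUMMARY OUTPUT.
ss_regression <- 59083839.44
ss_residual   <- 2423282.068
df_regression <- 4    # number of predictors
df_residual   <- 12   # n - number of predictors - 1 = 17 - 4 - 1

# Mean squares and the model utility F statistic.
ms_regression <- ss_regression / df_regression
ms_residual   <- ss_residual / df_residual
f_stat        <- ms_regression / ms_residual

# Upper-tail probability of the F distribution (compare with "Significance F").
p_value <- pf(f_stat, df_regression, df_residual, lower.tail = FALSE)

cat("F =", round(f_stat, 3), "on", df_regression, "and", df_residual, "df\n")
cat("p-value =", signif(p_value, 3), "\n")
cat("Reject H0 at alpha = 0.05:", p_value < 0.05, "\n")
```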

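c) Coefficient of determination: R² = 0.96

Adjusted R² = 0.95

Because the model has four predictors and only 17 observations, the adjusted R² is the more sensible value to report: it penalizes the number of predictors, whereas R² never decreases when a variable is added.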
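Both fit measures follow directly from the ANOVA sums of squares, which makes the penalty in the adjusted version easy to see. The sketch below recomputes them in R; the variable names are illustrative and the inputs are copied from the SUMMARY OUTPUT table.

```r
# Values copied from the ANOVA block of the Excel SUMMARY OUTPUT.
ss_regression <- 59083839.44
ss_total      <- 61507121.5
n             <- 17   # hospitals
k             <- 4    # predictors in the full model

# R^2: share of the total variation in y explained by the model.
r_squared <- ss_regression / ss_total

# Adjusted R^2: rescales the unexplained share by (n - 1) / (n - k - 1),
# so adding a predictor only helps if it explains enough extra variation.
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

cat("R-squared          =", round(r_squared, 4), "\n")
cat("Adjusted R-squared =", round(adj_r_squared, 4), "\n")
```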

d) The fitted regression is

y = 79.09 + 12.52 x1 + 0.038 x2 - 4.86 x3 + 18.44 x4

e) When x.1 = average daily patient load = 94.39, x.2 = monthly X-ray exposures = 8461, x.3 = eligible population in the area per 1000 = 78.7, and x.4 = average length of patients’ stay in days = 6.18:

y = 79.09 + 12.52*94.39 + 0.038*8461 - 4.86*78.7 + 18.44*6.18

y = 1313.85

(With the unrounded coefficients from the output, the prediction is 1313.79.)
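The prediction is sensitive to how the coefficients are rounded: x2 is in the thousands, so the third decimal of its coefficient moves the result by several hours. A minimal R sketch, using the unrounded coefficients copied from the SUMMARY OUTPUT table (the vector names are illustrative), gives the prediction and the residual asked for in part f):

```r
# Unrounded coefficients copied from the Excel SUMMARY OUTPUT.
coefs <- c(intercept = 79.09251,
           x1 = 12.52297,
           x2 = 0.037996,
           x3 = -4.86374,
           x4 = 18.43857)

# Predictor values from part e); the leading 1 multiplies the intercept.
x_new <- c(1, 94.39, 8461, 78.7, 6.18)

# Predicted monthly labor hours.
y_hat <- sum(coefs * x_new)

# Residual = observed - predicted, with the observed value from part f).
y_obs    <- 1243.90
residual <- y_obs - y_hat

cat("Predicted labor hours:", round(y_hat, 2), "\n")
cat("Residual:", round(residual, 2), "\n")
```

A negative residual means the model over-predicts labor hours for this hospital.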

SUMMARY OUTPUT

Regression Statistics
Multiple R	0.980103
R Square	0.960602
Adjusted R Square	0.947469
Standard Error	449.3775
Observations	17

ANOVA
	df	SS	MS	F	Significance F
Regression	4	59083839.44	14770960	73.14523	2.53E-08
Residual	12	2423282.068	201940.2
Total	16	61507121.5

	Coefficients	Standard Error	t Stat	P-value	Lower 95%
Intercept	79.09251	484.2098757	0.163343	0.872967	-975.91
x1	12.52297	1.306324519	9.586415	5.64E-07	9.676733
x2	0.037996	0.042753294	0.888726	0.391617	-0.05516
x3	-4.86374	3.158630287	-1.53983	0.149547	-11.7458
x4	18.43857	88.01621573	0.209491	0.83758	-173.332
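For part g), the individual t tests can be reproduced from the coefficient and standard error columns alone. The sketch below is illustrative (the data frame and column names are assumptions); it recomputes each t statistic, two-sided p-value, and 95% confidence interval on 12 residual degrees of freedom, then lists the slopes that are significant at the 0.05 level.

```r
# Estimates and standard errors copied from the Excel SUMMARY OUTPUT.
coef_table <- data.frame(
  term      = c("Intercept", "x1", "x2", "x3", "x4"),
  estimate  = c(79.09251, 12.52297, 0.037996, -4.86374, 18.43857),
  std_error = c(484.2098757, 1.306324519, 0.042753294, 3.158630287, 88.01621573),
  stringsAsFactors = FALSE
)
df_residual <- 12  # from the ANOVA block

# t statistic and two-sided p-value for H0: coefficient = 0.
coef_table$t_stat  <- coef_table$estimate / coef_table$std_error
coef_table$p_value <- 2 * pt(-abs(coef_table$t_stat), df = df_residual)

# 95% confidence limits: estimate +/- t critical value * standard error.
t_crit <- qt(0.975, df = df_residual)
coef_table$lower_95 <- coef_table$estimate - t_crit * coef_table$std_error
coef_table$upper_95 <- coef_table$estimate + t_crit * coef_table$std_error

print(coef_table)

# Slopes (intercept excluded) with p-value below 0.05.
is_slope    <- coef_table$term != "Intercept"
significant <- coef_table$term[is_slope & coef_table$p_value < 0.05]
cat("Significant at 0.05:", significant, "\n")
```

The lower limits should agree with the "Lower 95%" column of the table, which is a quick way to confirm the values were copied correctly.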
