In R, analyze and interpret the effect of the explanatory variables on milk intake (dl.milk) in the kfm data set (ISwR package) using a multiple regression model. Test at a significance level of alpha = 0.05.

1) Regress dl.milk on all other variables. Is there significant evidence that milk intake can be explained by the other variables?

2) Find regression models that use fewer explanatory variables, i.e., select a subset of variables that gives a better fit.

Solution:

Load the ISwR library, which contains the kfm data set.

The data set has 50 observations and 7 variables:

"no" "dl.milk" "sex" "weight" "ml.suppl" "mat.weight" "mat.height"

Recode sex numerically as the code below does: girls to 0 and boys to 1. Drop the ID column no, since it is not an explanatory variable.

dl.milk is the dependent variable.

R Code:

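library(ISwR)   # provides the kfm data set
library(dplyr)  # provides %>% and mutate()

print(kfm)
dim(kfm)    # 50 rows, 7 columns
names(kfm)

# Recode sex as a 0/1 indicator: girl = 0, boy = 1
kfm1 <- kfm %>%
  mutate(sex = ifelse(sex == "girl", 0, 1))
head(kfm1)

# Drop the first column (no), an ID that should not enter the model
kfm1 <- kfm1[, -1]

# Regress dl.milk on all remaining variables
rgmod1 <- lm(dl.milk ~ ., data = kfm1)
summary(rgmod1)
coefficients(rgmod1)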

Output:

> summary(rgmod1)

Call:
lm(formula = dl.milk ~ ., data = kfm1)

Residuals:
     Min       1Q   Median       3Q      Max
-1.74201 -0.81173 -0.00926  0.78326  2.52646

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -12.181372   4.322605  -2.818 0.007212 **
sex           0.499532   0.312672   1.598 0.117284
weight        1.349124   0.322450   4.184 0.000135 ***
ml.suppl     -0.002233   0.001241  -1.799 0.078829 .
mat.weight    0.006212   0.023708   0.262 0.794535
mat.height    0.072278   0.030169   2.396 0.020906 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.075 on 44 degrees of freedom
Multiple R-squared:  0.5459,  Adjusted R-squared:  0.4943
F-statistic: 10.58 on 5 and 44 DF,  p-value: 1.03e-06

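Interpretation

The summary shows that weight (p = 0.000135) and mat.height (p = 0.020906) are significant at the 0.05 level.

The other variables, sex (p = 0.117284), ml.suppl (p = 0.078829), and mat.weight (p = 0.794535), are not significant because p > 0.05.

Overall F-test: F(5, 44) = 10.58, p = 1.03e-06.

Since p < 0.05, the model as a whole is significant: milk intake can be explained by the other variables, and the model can be used to predict dl.milk.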
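
The overall p-value can be checked directly from the F statistic. A minimal sketch, assuming only the rounded values printed in the summary output (F = 10.58 on 5 and 44 degrees of freedom); the variable names are illustrative:

```r
# The upper-tail probability of the F distribution gives the overall p-value.
f_stat <- 10.58  # F statistic from summary(rgmod1), as printed (rounded)
df1 <- 5         # numerator df: number of predictors
df2 <- 44        # denominator df: 50 observations - 5 predictors - 1

p_full <- pf(f_stat, df1, df2, lower.tail = FALSE)
p_full

# Decision at alpha = 0.05: TRUE means the model is significant
alpha <- 0.05
p_full < alpha
```

With the fitted model in hand, the same three numbers are stored in summary(rgmod1)$fstatistic.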

Regression model:

dl.milk = -12.181371613 + 0.4995321988*sex + 1.349124010*weight - 0.002232952*ml.suppl + 0.006211857*mat.weight + 0.072278226*mat.height
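
To show how the fitted equation is used, here is a small sketch that evaluates it for a single infant. The infant's values are made up for illustration and are not taken from kfm. The coefficients are the ones listed in the equation, and sex = 1 means a boy under the ifelse() coding in the R code. Units follow the ISwR documentation for kfm (weight in kg, ml.suppl in ml/24h, mat.weight in kg, mat.height in cm, dl.milk in dl/24h).

```r
# Coefficients of the full model, as reported by coefficients(rgmod1)
b <- c(intercept  = -12.181371613,
       sex        =   0.4995321988,
       weight     =   1.349124010,
       ml.suppl   =  -0.002232952,
       mat.weight =   0.006211857,
       mat.height =   0.072278226)

# Hypothetical infant (illustrative values only): a boy weighing 5 kg who
# receives 100 ml of supplement, with a mother of 60 kg and 165 cm.
# The leading 1 multiplies the intercept.
x <- c(intercept = 1, sex = 1, weight = 5, ml.suppl = 100,
       mat.weight = 60, mat.height = 165)

# Predicted milk intake is the sum of coefficient * value over all terms
pred <- sum(b * x)
pred
```

With the fitted object available, predict(rgmod1, newdata = data.frame(sex = 1, weight = 5, ml.suppl = 100, mat.weight = 60, mat.height = 165)) should give the same value.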

Solution 2:

Now exclude the non-significant variables and refit the model using only the significant ones:

R Code:


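# Refit using only the predictors that were significant in the full model
rgmod2 <- lm(dl.milk ~ weight + mat.height, data = kfm1)
summary(rgmod2)
coefficients(rgmod2)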

Output:

Residuals:
     Min       1Q   Median       3Q      Max
-2.19598 -0.82149  0.01822  0.75582  2.83375

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -11.92014    4.07325  -2.926  0.00527 **
weight        1.42862    0.31338   4.559 3.67e-05 ***
mat.height    0.07063    0.02636   2.680  0.01013 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.109 on 47 degrees of freedom
Multiple R-squared:  0.4835,  Adjusted R-squared:  0.4615
F-statistic:    22 on 2 and 47 DF,  p-value: 1.811e-0

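Interpretation:

R-squared = 0.4835, so the model explains 48.35% of the variation in dl.milk; the remaining 51.65% is unexplained.

Overall F-test: F(2, 47) = 22, p < 0.0001.

Since p < 0.05, the model is significant.

Note that the adjusted R-squared (0.4615) is slightly lower than that of the full model (0.4943), so the reduced model is simpler but does not fit better by that measure.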
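
The adjusted R-squared values reported by summary() can be recomputed from R-squared, the sample size, and the number of predictors. A small sketch using the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1); the function name adj_r2 is an illustrative helper, not part of R:

```r
# Adjusted R-squared penalizes R-squared for the number of predictors p
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

n <- 50  # observations in kfm

adj_full    <- adj_r2(0.5459, n, p = 5)  # all five predictors
adj_reduced <- adj_r2(0.4835, n, p = 2)  # weight and mat.height only

round(c(full = adj_full, reduced = adj_reduced), 4)
```

Up to rounding, these reproduce the 0.4943 and 0.4615 printed in the two summaries.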

The final regression model has these coefficients:

 (Intercept)       weight   mat.height
-11.92014253   1.42862096   0.07062876

dl.milk = -11.92014253 + 1.42862096*weight + 0.07062876*mat.height
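
Dropping every predictor with p > 0.05 at once is one way to pick a subset. As a cross-check, the sketch below uses two standard base R tools instead: backward elimination by AIC with step(), and a partial F-test with anova() comparing the reduced and full models. This is an alternative technique, not the method used in the solution. The object names are illustrative, and sex is left as a factor, which lm() handles directly.

```r
library(ISwR)  # provides the kfm data set

# Drop the ID column; keep sex as a factor (lm() builds the dummy variable)
dat <- kfm[, names(kfm) != "no"]

full    <- lm(dl.milk ~ ., data = dat)
reduced <- lm(dl.milk ~ weight + mat.height, data = dat)

# Backward elimination: repeatedly drop the term whose removal lowers AIC
# the most, stopping when no removal lowers it further
step_mod <- step(full, direction = "backward", trace = 0)
formula(step_mod)

# Partial F-test: do the three dropped predictors jointly improve the fit?
anova(reduced, full)

# Compare the candidate models on AIC (lower is better)
AIC(full, reduced, step_mod)
```

If step() keeps a different subset than weight and mat.height, that is worth reporting alongside the manual choice, since the two criteria (individual p-values versus AIC) need not agree.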


Tutor: Dr Jack