Let X be the height of a randomly chosen adult man, and Y be his father’s height, where X and Y have been standardized to have mean 0 and standard deviation 1. Suppose that (X, Y) is Bivariate Normal, with X, Y ~ N(0, 1) and Corr(X, Y) = ρ.
(a) Let y = ax + b be the equation of the best line for predicting Y from X (in the sense of minimizing the mean squared error), e.g., if we were to observe X = 1.3 then we would predict that Y is 1.3a + b. Now suppose that we want to use Y to predict X, rather than using X to predict Y. Give and explain an intuitive guess for what the slope is of the best line for predicting X from Y .
(b) Find a constant c (in terms of ρ) and an r.v. V such that Y = cX + V, with V independent of X. Hint: Start by finding c such that Cov(X, Y - cX) = 0.
(c) Find a constant d (in terms of ρ) and an r.v. W such that X = dY + W, with W independent of Y.
(d) Find E(Y|X) and E(X|Y ).
(e) Reconcile (a) and (d), giving a clear and correct intuitive explanation.
Please help me! Thanks in advance.
Solution
Write p for the correlation ρ, so Corr(X,Y)=p, Var(X)=Var(Y)=1 and E[X]=E[Y]=0.
Hence Cov(X,Y)=Corr(X,Y)*sqrt(Var(X)*Var(Y))=p
a) The correlation coefficient measures how strongly X and Y are linearly associated.
If X is predicted from Y, the prediction line should reflect that same association. Since X and Y are both standardized, the problem looks the same with the roles of X and Y swapped.
Hence an intuitive guess is that the slope is the correlation coefficient, p.
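b) Take V=Y-cX and choose c so that Cov(X,V)=0:
Cov(X,Y-cX)=0
or, Cov(X,Y)-c*Var(X)=0
or, p-c*1=0
or c=p, so V=Y-pX [answer]
Since (X,V) is a linear transformation of (X,Y), it is also Bivariate Normal, so Cov(X,V)=0 implies that V is independent of X.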
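The result in (b) can be sanity-checked by simulation. The sketch below is only an illustration under assumptions that are not in the problem: p = 0.6, the sample size and the seed are arbitrary choices, and the correlated normals are built from two independent standard normals by a rotation with sin(2*theta) = p (one of several possible constructions).

```python
import math
import random

def sample_cov(us, vs):
    """Sample covariance of two equal-length sequences."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / (n - 1)

p = 0.6                          # hypothetical correlation, chosen for illustration
theta = math.asin(p) / 2         # sin(2*theta) = p
a, b = math.cos(theta), math.sin(theta)

rng = random.Random(0)           # fixed seed so the run is repeatable
xs, vs = [], []
for _ in range(100_000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x = a * z1 + b * z2          # Var(X) = 1
    y = b * z1 + a * z2          # Var(Y) = 1, Cov(X, Y) = 2ab = p
    xs.append(x)
    vs.append(y - p * x)         # V = Y - cX with c = p

cov_xv = sample_cov(xs, vs)      # should be close to 0
var_v = sample_cov(vs, vs)       # should be close to 1 - p**2
print(cov_xv, var_v)
```

Zero sample covariance alone does not show independence. That step still relies on (X, V) being Bivariate Normal.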
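c) Take W=X-dY.
Since (W,Y) is a linear transformation of (X,Y), it is Bivariate Normal, so W and Y are independent exactly when
Cov(W,Y)=0
or, Cov(X-dY,Y)=0
or, Cov(X,Y)-d*Var(Y)=0
or, p-d*1=0
or d=p, so W=X-pY [answer]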
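The variance of the leftover term, which part (d) uses, can be worked out directly. This is a short derivation for V = Y - pX, and the same steps apply to W = X - pY by symmetry:

```latex
\begin{aligned}
\operatorname{Var}(V) &= \operatorname{Var}(Y - pX) \\
  &= \operatorname{Var}(Y) - 2p\,\operatorname{Cov}(X, Y) + p^{2}\operatorname{Var}(X) \\
  &= 1 - 2p^{2} + p^{2} \\
  &= 1 - p^{2}.
\end{aligned}
```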
d) To find these expectations we first need the conditional distributions.
By (b), Y=pX+V with V independent of X, E[V]=0 and Var(V)=1-p^2, so given X=x, Y follows a normal distribution with mean px and variance 1-p^2.
By the same argument using (c), given Y=y, X follows a normal distribution with mean py and variance 1-p^2.
Hence E[Y|X]=pX and E[X|Y]=pY
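As a rough empirical check of the conditional mean and variance, one can simulate many pairs and keep only those with X close to a fixed value. Everything specific here is an assumption made for illustration: p = 0.5, the conditioning point x0 = 1.3 (borrowed from the example in part (a)), the window width, the sample size and the seed.

```python
import math
import random

p = 0.5                          # hypothetical correlation
theta = math.asin(p) / 2         # rotation angle with sin(2*theta) = p
a, b = math.cos(theta), math.sin(theta)

x0, half_width = 1.3, 0.05       # keep samples with |X - x0| < half_width
rng = random.Random(1)           # fixed seed so the run is repeatable
ys_near_x0 = []
for _ in range(400_000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x = a * z1 + b * z2
    y = b * z1 + a * z2          # Corr(X, Y) = 2ab = p
    if abs(x - x0) < half_width:
        ys_near_x0.append(y)

n = len(ys_near_x0)
cond_mean = sum(ys_near_x0) / n
cond_var = sum((y - cond_mean) ** 2 for y in ys_near_x0) / (n - 1)
print(cond_mean, p * x0)         # empirical vs. theoretical conditional mean
print(cond_var, 1 - p * p)       # empirical vs. theoretical conditional variance
```

The window only approximates conditioning on X = x0 exactly, so small discrepancies are expected.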
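e) From (d), E[X|Y]=pY, i.e., the best prediction of X when Y=y is given is py. Hence the slope of the line for predicting X from Y is the correlation coefficient p, the same as the slope a=p of the line for predicting Y from X (since E[Y|X]=pX). It is not 1/a, which is what solving y=ax+b for x would give. Intuitively, because |p|<=1, each prediction is pulled toward the mean 0 whichever variable is used as the predictor (regression toward the mean).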
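The point in (e) can also be seen numerically by fitting least-squares lines in both directions on simulated data. This is a minimal sketch with assumed values (p = 0.5, sample size, seed); `ls_slope` is a helper written for this example, not part of the problem.

```python
import math
import random

def ls_slope(us, vs):
    """Least-squares slope of the line predicting vs from us."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    return suv / suu

p = 0.5                          # hypothetical correlation
theta = math.asin(p) / 2         # rotation angle with sin(2*theta) = p
a, b = math.cos(theta), math.sin(theta)

rng = random.Random(2)           # fixed seed so the run is repeatable
xs, ys = [], []
for _ in range(100_000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    xs.append(a * z1 + b * z2)
    ys.append(b * z1 + a * z2)   # Corr(X, Y) = 2ab = p

slope_y_on_x = ls_slope(xs, ys)  # line for predicting Y from X
slope_x_on_y = ls_slope(ys, xs)  # line for predicting X from Y
print(slope_y_on_x, slope_x_on_y, 1 / slope_y_on_x)
```

Both fitted slopes should come out near p, while the reciprocal of the first slope is near 1/p, far from the slope actually obtained for predicting X from Y.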
