Euclidean, Correlation, Jaccard, Cosine. For each of the following pairs of vectors, x and y, calculate the indicated similarity or distance measures. Show your work.
(a) x = (0, 1, 0, 1, 1), y = (1, 0, 1, 0, 0): Jaccard, Cosine, Euclidean, Correlation
(b) x = (0, -1, 0, 1), y = (1, 0, -1, 0): Cosine, Euclidean, Correlation
Solution
(a) x = (0, 1, 0, 1, 1), y = (1, 0, 1, 0, 0)
Jaccard:
Jaccard coefficient=(f11)/(f01 + f10 + f11)
f01 = 2, the number of attributes where x was 0 and y was 1
f10 = 3, the number of attributes where x was 1 and y was 0
f00 = 0, the number of attributes where x was 0 and y was 0
f11 = 0, the number of attributes where x was 1 and y was 1
Jaccard coefficient=0/(2+3+0)=0
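The counting above can be checked with a short Python sketch. The function name `jaccard` and the choice to return 0 when both vectors are all zeros are assumptions made for illustration, not part of the original solution.

```python
def jaccard(x, y):
    """Jaccard coefficient f11 / (f01 + f10 + f11) for binary vectors."""
    pairs = list(zip(x, y))
    f11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    f10 = sum(1 for a, b in pairs if a == 1 and b == 0)
    f01 = sum(1 for a, b in pairs if a == 0 and b == 1)
    denom = f01 + f10 + f11
    # Assumed convention: return 0.0 if neither vector has any 1s.
    return f11 / denom if denom else 0.0


x = (0, 1, 0, 1, 1)
y = (1, 0, 1, 0, 0)
print(jaccard(x, y))
```

Note that f00 never enters the formula: positions where both vectors are 0 are ignored by the Jaccard coefficient.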
Cosine:
cosine(x,y)=(x.y)/(||x||*||y||)
x.y=0*1+1*0+0*1+1*0+1*0=0
||x||=sqrt(3) and ||y||=sqrt(2) are both nonzero, hence cosine(x,y)=0/(sqrt(3)*sqrt(2))=0
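As a quick check, here is a minimal Python sketch of the cosine formula. The function name `cosine` is a hypothetical helper, and the sketch assumes neither vector is all zeros (otherwise the denominator would be zero).

```python
import math


def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Assumes both norms are nonzero.
    return dot / (norm_x * norm_y)


x = (0, 1, 0, 1, 1)
y = (1, 0, 1, 0, 0)
print(cosine(x, y))
```

Because the dot product is 0, the result is 0 regardless of the norms: the two vectors are orthogonal.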
Euclidean:
Euclidean(x,y)=sqrt[(0-1)^2+(1-0)^2+(0-1)^2+(1-0)^2+(1-0)^2]
=sqrt[1+1+1+1+1]
=sqrt[5]
≈2.236
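The same distance can be computed with a few lines of Python. This is a sketch only, and `euclidean` is a name chosen here for illustration.

```python
import math


def euclidean(x, y):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


x = (0, 1, 0, 1, 1)
y = (1, 0, 1, 0, 0)
print(round(euclidean(x, y), 3))
```

The two vectors differ in all five positions, and each difference contributes 1 to the sum, which is why the distance is sqrt(5).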
Correlation:
correlation(x,y)=[covariance(x,y)]/[standard deviation(x)*standard deviation(y)]
mean(x)=(0+1+0+1+1)/5=3/5
mean(y)=(1+0+1+0+0)/5=2/5
covariance(x,y)=1/(5-1)*[(0-3/5)*(1-2/5)+(1-3/5)*(0-2/5)+(0-3/5)*(1-2/5)+(1-3/5)*(0-2/5)+(1-3/5)*(0-2/5)]
=1/4*[-9/25-4/25-9/25-4/25-4/25]
=1/4*(-30/25)
=1/4*(-6/5)
=-6/20
standard_deviation(x)=sqrt[(1/(5-1))*[(0-3/5)^2+(1-3/5)^2+(0-3/5)^2+(1-3/5)^2+(1-3/5)^2]]
=sqrt[(1/4)*(9/25+4/25+9/25+4/25+4/25)]
=sqrt[(1/4)*(30/25)]
=sqrt[(1/4)*(6/5)]
=sqrt(6/20)
standard_deviation(y)=sqrt[(1/(5-1))*[(1-2/5)^2+(0-2/5)^2+(1-2/5)^2+(0-2/5)^2+(0-2/5)^2]]
=sqrt[(1/4)*(9/25+4/25+9/25+4/25+4/25)]
=sqrt[(1/4)*(30/25)]
=sqrt(6/20)
correlation(x,y)=(-6/20)/[sqrt(6/20)*sqrt(6/20)]
=(-6/20)/(6/20)
=-1
Check: y = 1 - x in every position, which is a perfect negative linear relationship, so a correlation of -1 is expected. A correlation can never fall outside the range [-1, 1].
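Because this calculation has many terms, a numerical cross-check is useful. The sketch below follows the same definition (sample covariance and sample standard deviations, both with divisor n-1); the function name `correlation` is an assumption made for illustration, and the code assumes neither vector is constant.

```python
import math


def correlation(x, y):
    """Pearson correlation: covariance(x, y) / (std(x) * std(y))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and standard deviations, all with divisor n - 1.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    # Assumes std_x and std_y are nonzero (neither vector is constant).
    return cov / (std_x * std_y)


x = (0, 1, 0, 1, 1)
y = (1, 0, 1, 0, 0)
print(correlation(x, y))
```

For these vectors the function returns approximately -1 (up to floating-point rounding), which matches the fact that y = 1 - x element-wise. Any result outside [-1, 1] would signal an arithmetic slip, since a correlation cannot leave that range.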
(b) x = (0, -1, 0, 1), y = (1, 0, -1, 0)
Cosine:
cosine(x,y)=(x.y)/(||x||*||y||)
x.y=0*1+(-1)*0+0*(-1)+1*0=0
hence cosine(x,y)=0
Euclidean:
Euclidean(x,y)=sqrt((0-1)^2+(-1-0)^2+(0-(-1))^2+(1-0)^2)
=sqrt(1+1+1+1)
=sqrt(4)=2
Correlation:
correlation(x,y)=[covariance(x,y)]/[standard deviation(x)*standard deviation(y)]
mean(x)=(0-1+0+1)/4=0
mean(y)=(1+0-1+0)/4=0
covariance(x,y)=1/(4-1)*[(0-0)*(1-0)+(-1-0)*(0-0)+(0-0)*(-1-0)+(1-0)*(0-0)]
=0
standard_deviation(x)=standard_deviation(y)=sqrt[(1/3)*(0+1+0+1)]=sqrt(2/3), which is nonzero
therefore, correlation(x,y)=0/(2/3)=0
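In part (b) both means are 0, which is why the cosine and the correlation coincide: the Pearson correlation equals the cosine similarity of the mean-centered vectors. The sketch below illustrates this relationship; the helper names `cosine`, `centered`, and `correlation_via_cosine` are hypothetical, and the code assumes no vector is constant or all zeros.

```python
import math


def cosine(x, y):
    """Cosine similarity of two vectors with nonzero norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def centered(v):
    """Subtract the mean from every component."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def correlation_via_cosine(x, y):
    """Pearson correlation as the cosine of the mean-centered vectors."""
    return cosine(centered(x), centered(y))


x = (0, -1, 0, 1)
y = (1, 0, -1, 0)
print(cosine(x, y), correlation_via_cosine(x, y))
```

The 1/(n-1) factors in the covariance and the two standard deviations cancel, so they do not appear in this form. For vectors whose means are not 0, such as those in part (a), the cosine and the correlation generally differ.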

