Spam is of concern to anyone with an e-mail address. Several companies offer protection by eliminating spam e-mails as soon as they hit an inbox. To examine one such product, a manager randomly sampled his daily e-mails for 50 days after installing spam software. A total of 374 e-mails were received, of which 3 were spam. Use the Wilson estimator to estimate with 90% confidence the proportion of spam e-mails that get through.
Solution
The equation for the Wilson point estimate is:
[phat + (1 / (2*n))*z(1 – alpha/2)^2]
/ [1 + (1 / n)*z(1 – alpha/2)^2]
where phat is the proportion estimated from the data, n is the total sample size, and z(1 – alpha/2) is the 1 – alpha/2 point of the standard normal (Z) distribution.
phat = 3 / 374 = .00802
For 90% confidence, alpha = .10, so z(1 – alpha/2) = z(.95) = 1.645
The point estimate is
[.00802 + (1 / (2*374))*1.645^2] / [1 + (1 / 374)*1.645^2]
= [.00802 + .00362] / [1.00724]
= .01164 / 1.00724
= .0116
The confidence interval is:
Numerator = .00802 + (1 / (2*374))*1.645^2 ± 1.645*SQRT{.00802*.99198/374 + 1.645^2 / (4*374*374)}
= .00802 + .00362 ± 1.645*SQRT{.0000213 + .0000048}
= .01164 ± 1.645*SQRT{.0000261}
= .01164 ± 1.645*.00511
= .01164 ± .00841
Denominator = 1 + (1 / 374)*1.645^2 = 1 + .00724 = 1.00724
Wilson estimator upper limit = (.01164 + .00841) / 1.00724 = .02005 / 1.00724 = .01991
Wilson estimator lower limit = (.01164 – .00841) / 1.00724 = .00323 / 1.00724 = .00321
The 90% confidence interval is .00321 < true p < .01991
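
The hand calculation can be cross-checked with a short script. The following is a minimal Python sketch of the Wilson formula as written in this solution, not a reference implementation. The function name wilson_interval and its parameters are illustrative choices, not part of the original problem. When no z value is supplied, it is taken from the standard library's NormalDist, which gives a slightly more precise value than the table value 1.645.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.90, z=None):
    """Return (point_estimate, lower, upper) using the Wilson formula.

    successes  -- number of observed events (here: spam e-mails)
    n          -- total sample size
    confidence -- confidence level, used only when z is not given
    z          -- optional z(1 - alpha/2) value, e.g. a rounded table value
    """
    if z is None:
        # z(1 - alpha/2): for 90% confidence this is the 0.95 quantile
        alpha = 1 - confidence
        z = NormalDist().inv_cdf(1 - alpha / 2)

    phat = successes / n
    center = phat + z * z / (2 * n)          # numerator without the +/- term
    half_width = z * sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n                    # shared denominator

    point = center / denom
    lower = (center - half_width) / denom
    upper = (center + half_width) / denom
    return point, lower, upper


# Same inputs as the worked solution: 3 spam e-mails out of 374, z = 1.645
point, lower, upper = wilson_interval(3, 374, z=1.645)
print(f"point estimate = {point:.5f}")
print(f"90% interval   = ({lower:.5f}, {upper:.5f})")
```

Passing z=1.645 reproduces the table value used in the solution. Small differences in the last decimal place are expected, because the hand calculation rounds each intermediate result.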
