Suppose we have a “normally distributed” dataset with 50,000 scores. We compute the mean and SD and find mean = 31, and SD = 2.5.

The following general statements are made:

1) Close to 6000 scores are in the low to mid 40’s

2) Close to 25000 scores are higher than 30

3) Close to 8000 scores are lower than 27

4) Close to 15000 scores are in the mid to upper 30’s

Is there a way to determine if the statements are true, false, likely true, likely false? Please explain!

HERE MEAN = 31

STD DEV = 2.5

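1) “LOW TO MID 40’S” MEANS SCORES ABOVE 40, SO WE CHECK P(X > 40) =

For x = 40, the z-value z = (40 - 31) / 2.5 = 3.6

Hence P(x < 40) = P(z < 3.6) = [area to the left of 3.6] = 0.99984, so P(x > 40) = 1 - 0.99984 = 0.00016

WHICH MEANS ABOUT 0.016% OF 50000, OR ROUGHLY 8 SCORES, ARE ABOVE 40

THAT IS NOWHERE NEAR 6000, HENCE FALSE.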
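The tail count for statement 1 can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the 40 to 46 range is only an assumed reading of “low to mid 40’s”, and the helper name `expected_count_between` is made up for illustration.

```python
from statistics import NormalDist

N = 50_000                               # number of scores (from the question)
scores = NormalDist(mu=31, sigma=2.5)    # mean and SD from the question

def expected_count_between(lo, hi):
    """Expected number of scores with lo < X < hi under the normal model."""
    return N * (scores.cdf(hi) - scores.cdf(lo))

z_40 = (40 - scores.mean) / scores.stdev          # z-value for x = 40
above_40 = N * (1 - scores.cdf(40))               # expected scores above 40
low_to_mid_40s = expected_count_between(40, 46)   # assumed range for "low to mid 40's"

print(f"z = {z_40:.1f}")
print(f"expected scores above 40: {above_40:.1f}")
print(f"expected scores in 40-46: {low_to_mid_40s:.1f}")
```

Both counts come out in the single digits, so a claim of about 6000 scores in that range is off by roughly three orders of magnitude.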

2) P(X > 30) =

For x = 30, z = (30 - 31) / 2.5 = -0.4

Hence P(x > 30) = P(z > -0.4) = [total area] - [area to the left of -0.4]

= 1 - 0.3446 = 0.6554 = 65.54% OF 50000

WHICH IS ABOUT 32770, WELL ABOVE 25000 (ABOUT 25000 SCORES WOULD BE EXPECTED ABOVE THE MEAN OF 31, NOT ABOVE 30)

HENCE FALSE
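The complement step in statement 2 is easy to get backwards, so here is a short sketch of it on the standard normal curve, again assuming `statistics.NormalDist` from the Python standard library. The variable names are illustrative only.

```python
from statistics import NormalDist

N = 50_000
std_normal = NormalDist()            # standard normal: mean 0, SD 1

z = (30 - 31) / 2.5                  # z-value for x = 30, i.e. -0.4
left_area = std_normal.cdf(z)        # area to the LEFT of z (the smaller piece)
right_area = 1 - left_area           # P(X > 30), the larger piece
count_above_30 = N * right_area      # expected number of scores above 30

print(f"left area  = {left_area:.4f}")
print(f"right area = {right_area:.4f}")
print(f"expected scores above 30: {count_above_30:.0f}")
```

Because 30 lies below the mean, more than half of the scores must be above it, which is a quick sanity check on the direction of the subtraction.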

3) P(X < 27) =

For x = 27, the z-value z = (27 - 31) / 2.5 = -1.6

Hence P(x < 27) = P(z < -1.6) = [area to the left of -1.6] = 0.0548

= 5.48% OF 50000, WHICH IS ABOUT 2740, FAR FEWER THAN 8000

HENCE FALSE

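4) TAKING “MID TO UPPER 30’S” TO MEAN ROUGHLY 34 TO 39 (AN ASSUMPTION ABOUT THE WORDING), P(34 < X < 39) =

For x = 34, the z-value z = (34 - 31) / 2.5 = 1.2, and for x = 39, z = (39 - 31) / 2.5 = 3.2

Hence P(34 < x < 39) = P(1.2 < z < 3.2) = [area to the left of 3.2] - [area to the left of 1.2] = 0.9993 - 0.8849 = 0.1144 = 11.44%

= ABOUT 5720 SCORES, FAR FEWER THAN 15000

HENCE FALSE.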
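Another way to judge statements 3 and 4 is to work backwards: ask which cutoff score would give the claimed count, then compare it with the cutoff in the statement. The sketch below assumes the same normal model and uses `NormalDist.inv_cdf` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

N = 50_000
scores = NormalDist(mu=31, sigma=2.5)    # mean and SD from the question

# Score below which about 8,000 of the 50,000 scores would fall
cutoff_8000_below = scores.inv_cdf(8_000 / N)

# Score above which about 15,000 of the 50,000 scores would fall
cutoff_15000_above = scores.inv_cdf(1 - 15_000 / N)

print(f"8,000 scores lie below about {cutoff_8000_below:.1f}")
print(f"15,000 scores lie above about {cutoff_15000_above:.1f}")
```

Under this model, about 8000 scores fall below roughly 28.5 rather than 27, and about 15000 fall above roughly 32.3, which is well short of the mid 30’s. Both results agree with the "false" verdicts for statements 3 and 4.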