Can you describe a chi-square distribution and its 4 components? What is the Goodness of Fit test? Describe the steps used to do the Goodness of Fit test. When could the Goodness of Fit test be used in analyzing a business?
Solution
The Chi-square test is intended to test how likely it is that an observed distribution is due to chance. It is also called a "goodness of fit" statistic, because it measures how well the observed distribution of data fits with the distribution that is expected if the variables are independent.
 
A Chi-square test is designed to analyze categorical data, meaning data that has been counted and divided into categories. It is not suited to continuous measurements (such as height in inches). For example, if you want to test whether attending class influences how students perform on an exam, using test scores (from 0-100) as data would not be appropriate for a Chi-square test. However, arranging students into the categories "Pass" and "Fail" would. Additionally, the data in a Chi-square grid should be frequency (count) data, not percentages or any other derived figure. Thus, by dividing a class of 54 into groups according to whether they attended class and whether they passed the exam, you might construct a data set like the attendance table at the end of this section.
 
IMPORTANT: Be very careful when constructing your categories! A Chi-square test can tell you information based on how you divide up the data. However, it cannot tell you whether the categories you constructed are meaningful. For example, if you are working with data on groups of people, you can divide them into age groups (18-25, 26-40, 41-60...) or income level, but the Chi-square test will treat the divisions between those categories exactly the same as the divisions between male and female, or alive and dead! It's up to you to assess whether your categories make sense, and whether the difference (for example) between age 25 and age 26 is enough to make the categories 18-25 and 26-40 meaningful. This does not mean that categories based on age are a bad idea, but only that you need to be aware of the control you have over organizing data of that sort.
The following table would represent a possible input to the Chi-square test, using 2 variables to divide the data: gender and party affiliation. 2x2 grids like this one are often the basic example for the Chi-square test, but in actuality any size grid would work as well: 3x3, 4x2, etc.
 
This shows the basic 2x2 grid. However, this is actually incomplete, in a sense; generally, the data table should include "marginal" information giving the total counts for each column and row, as well as for the whole data set:
 
 We now have a complete data set on the distribution of 100 individuals into categories of gender (Male/Female) and party affiliation (Democrat/Republican). A Chi-square test would allow you to test how likely it is that gender and party affiliation are completely independent; or in other words, how likely it is that the distribution of males and females in each party is due to chance.
 
So, as implied, the null hypothesis in this case would be that gender and party affiliation are independent of one another. To test this hypothesis, we need to construct a model which estimates how the data should be distributed if our hypothesis of independence is correct. This is where the totals we put in the margins will become handy: later on, I'll show how you can calculate your estimated data using the marginals. Meanwhile, however, I've constructed an example which will allow very easy calculations. Assuming that there's a 50/50 chance of males or females being in either party, we get the very simple distribution shown below.
 
This is the information we would need to calculate the likelihood that gender and party affiliation are independent. I will discuss the next steps in calculating a Chi-square value later, but for now I'll focus on the background information.
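The idea of estimating the expected data from the marginals can be sketched in a few lines of Python. This is a minimal illustration of the usual rule (expected count = row total x column total / grand total), not necessarily the way the author would present it. The function name `expected_counts` is made up for this example, and the input is the class-attendance data from this solution (25, 6, 8, 15).

```python
def expected_counts(observed):
    """Expected cell counts if the row and column variables are independent.

    Each expected count is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: Attended, Skipped. Columns: Pass, Fail.
attendance = [[25, 6], [8, 15]]
expected = expected_counts(attendance)

for label, row in zip(["Attended", "Skipped"], expected):
    print(label, [round(value, 2) for value in row])
```

Under independence the model expects about 18.9 attendees to pass, compared with the 25 actually observed. The expected counts keep the same row and column totals as the observed data.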
 
Note: you can assume a different null hypothesis for a Chi-square test. Using the scenario suggested above, you could test the hypothesis that women are twice as likely as men to register as Democrats, and a Chi-square test would tell you how likely it is that the observed data reflects that relationship between your variables. In this case, you would simply run the test using a model of expected data built under the assumption that this hypothesis is true, and the formula will (as before) test how well that distribution fits the observed data. I will not discuss this in more detail, but it is important to know that the null hypothesis is not some abstract "fact" about the test, but rather a choice you make when calculating your model.
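To make the point about choosing your own model concrete, here is a small sketch of a goodness-of-fit calculation in which the expected distribution is supplied by the analyst. The counts and proportions are invented purely for illustration and do not come from this solution. The function name `goodness_of_fit` is likewise hypothetical. The statistic is the standard Pearson sum of (observed - expected)^2 / expected.

```python
import math


def goodness_of_fit(observed, expected_proportions):
    """Pearson chi-square statistic against a model chosen by the analyst.

    Returns the statistic and the degrees of freedom (categories - 1).
    """
    total = sum(observed)
    expected = [p * total for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Illustrative (made-up) counts for three categories, and the
# proportions that the chosen null hypothesis predicts for them.
observed = [50, 30, 20]
model = [0.4, 0.4, 0.2]

statistic, dof = goodness_of_fit(observed, model)

# With exactly 2 degrees of freedom the upper-tail probability has the
# closed form exp(-x / 2); this shortcut does not apply to other dof.
p_value = math.exp(-statistic / 2)

print(round(statistic, 3), dof, round(p_value, 4))
```

Changing `model` changes the null hypothesis being tested, while the observed counts stay the same. That is the sense in which the null hypothesis is a choice rather than a fixed property of the test.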
Class attendance and exam results for the class of 54:

| | Pass | Fail |
| Attended | 25 | 6 |
| Skipped | 8 | 15 |
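As a rough sketch of where this table leads, the Python below computes the standard Pearson chi-square statistic for these attendance counts under the null hypothesis that attendance and exam result are independent. The formula is the textbook one rather than anything spelled out in this solution, the function names are invented for the example, and no continuity correction is applied. The p-value shortcut used here is valid only for 1 degree of freedom, which is what a 2x2 table has.

```python
import math


def chi_square_independence(observed):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count for this cell if the variables are independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail probability of a chi-square variable with exactly 1 dof."""
    return math.erfc(math.sqrt(statistic / 2))


# Rows: Attended, Skipped. Columns: Pass, Fail.
attendance = [[25, 6], [8, 15]]
statistic, dof = chi_square_independence(attendance)
p_value = p_value_one_dof(statistic)

print(round(statistic, 2), dof, p_value)
```

The statistic comes out at roughly 11.69 on 1 degree of freedom, which is well above the usual 5% critical value of 3.84. Counts this lopsided would therefore be unlikely if attendance and passing were unrelated.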


