Please Help! Naive Bayes Classifier; statistics and Biology!
III. Naive Bayes Classifier: There are two well-known marker genes for breast cancer: BRCA2 and ERBB2. The Canadian Disease Control Center carried out a clinical trial to check the expression levels of these two genes in subjects diagnosed with breast cancer: one group of 50 randomly recruited subjects has no metastasis, and the other group of 50 subjects has metastasis. After normalization, BRCA2 expression levels follow a normal distribution N(0, 1) for non-metastatic subjects and N(1, 1) for metastatic subjects. For ERBB2, the corresponding expression levels in non-metastatic and metastatic subjects follow normal distributions N(0, 1) and N(-1, 1), respectively. What is the basic assumption of the naive Bayes classifier? In what situations may it be problematic?

Solution
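The basic assumption of the naive Bayes classifier is that the predictor variables are conditionally independent given the class. Here that means that, within each group (metastatic or non-metastatic), the BRCA2 and ERBB2 expression levels are independent, so P(BRCA2, ERBB2 | class) = P(BRCA2 | class) P(ERBB2 | class). This assumption is often unrealistic, because in real data the predictors are frequently correlated within a class; two genes that are co-regulated or lie in the same pathway are one example. When the predictors are strongly correlated, the classifier treats overlapping evidence as if it were separate evidence, so the posterior probabilities tend to be too close to 0 or 1 and the decision boundary can be shifted away from the optimal one.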
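
To make the assumption concrete, here is a minimal sketch in Python of how a naive Bayes posterior could be computed for the two genes in the question. The means and unit variances are taken from the problem statement; the equal priors are an assumption based on the two groups of 50 subjects, and the names MEANS, PRIORS, normal_logpdf and posterior are hypothetical, chosen only for this illustration. The independence assumption appears as the sum of the two per-gene log densities.

```python
import math

# Class-conditional means from the problem statement; every variance is 1.
MEANS = {
    "non_metastatic": {"BRCA2": 0.0, "ERBB2": 0.0},
    "metastatic": {"BRCA2": 1.0, "ERBB2": -1.0},
}

# Assumption: equal priors, based on the two groups of 50 subjects.
PRIORS = {"non_metastatic": 0.5, "metastatic": 0.5}


def normal_logpdf(x, mean, sd=1.0):
    """Log density of a normal distribution N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def posterior(brca2, erbb2):
    """Posterior class probabilities under the naive Bayes assumption."""
    log_joint = {}
    for label, means in MEANS.items():
        # Conditional independence: per-gene log likelihoods simply add up.
        log_joint[label] = (
            math.log(PRIORS[label])
            + normal_logpdf(brca2, means["BRCA2"])
            + normal_logpdf(erbb2, means["ERBB2"])
        )
    # Normalize in a numerically stable way.
    top = max(log_joint.values())
    unnormalized = {k: math.exp(v - top) for k, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}


if __name__ == "__main__":
    print(posterior(1.0, -1.0))
    print(posterior(0.5, -0.5))
```

Under these assumptions the log odds in favor of metastasis simplify to BRCA2 - ERBB2 - 1, so a subject is classified as metastatic when BRCA2 - ERBB2 > 1. If the two genes were correlated within a class, the sum of the two log densities would no longer equal the true joint log density, which is exactly where the naive assumption breaks down.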
