Discuss one or two examples of how you think data mining techniques can be used in healthcare.
How do you think data mining differs from statistical analysis? How are they similar?
Solution
USES OF DATA MINING IN INDUSTRY
Many industries successfully use data mining. It helps the retail industry model customer response. It helps banks predict customer profitability. It serves similar use cases in telecom, manufacturing, the automotive industry, higher education, life sciences, and more.
Data mining in healthcare today remains, for the most part, an academic exercise with only a few pragmatic success stories. Academics are using data-mining approaches such as decision trees, clustering, neural networks, and time-series analysis to publish research. Healthcare, however, has always been slow to incorporate the latest research into everyday practice.
The healthcare industry today generates large amounts of complex data about patients, hospital resources, disease diagnoses, electronic patient records, medical devices, and so on. This data is a key resource: once processed and analyzed, it yields knowledge that supports cost savings and decision making.
Data mining
• brings a set of tools and techniques that can be applied to this processed data to discover hidden patterns
• gives healthcare professionals an additional source of knowledge for making decisions
• leaves the decisions themselves with healthcare professionals.
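As a rough illustration of "discovering hidden patterns", the sketch below runs a plain k-means clustering over a handful of made-up patient records (age, clinic visits per year). The data, the feature choice, and the function names `sq_dist` and `kmeans` are all assumptions for illustration; nothing here comes from a real health system.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Plain k-means with a deterministic farthest-point start.

    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )

    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Hypothetical records: (age in years, clinic visits per year).
patients = [
    (28, 1), (34, 2), (31, 3), (25, 2),
    (71, 11), (68, 14), (75, 12), (66, 10),
]

centroids, labels = kmeans(patients, k=2)
for cluster, centre in enumerate(centroids):
    size = labels.count(cluster)
    print(f"cluster {cluster}: {size} patients, centre (age, visits) = {centre}")
```

The algorithm is never told which patients are "high utilisers"; it only groups similar records. Whether a group matters clinically is still a judgement for healthcare professionals. In practice the features would usually be standardised first so that age does not dominate the distance.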
EXAMPLE
One client is a health system trying to succeed in risk-based contracts while still performing well under the fee-for-service reimbursement model. The transition to value-based purchasing is a slow one. Until the switch is flipped all the way, health systems have to design processes that enable them to straddle both models. This client is using data mining to lower its census for patients under risk contracts, while at the same time keeping its patient volume steady for patients not included in these contracts. We are mining the data to predict what the volumes will be for each category of patient. The health system then develops processes to make sure these patients receive the appropriate care at the right place and at the right time. This would include care management outreach for high-risk patients.
Data mining is used successfully and extensively in healthcare today. For example, I was part of a project that mined healthcare claims to determine the best providers and procedures for given conditions, diagnostic aids for certain procedures, and protein analysis for drug development.
How do you think data mining differs from statistical analysis?
Data mining was largely a commercial concern, driven by business needs (coupled with the "need" for vendors to sell software and hardware systems to businesses). One thing Friedman noted was that all the "features" being hyped originated outside of statistics -- from algorithms and methods like neural nets to GUI-driven data analysis -- and none of the traditional statistical offerings (regression, hypothesis testing, etc.) seemed to be a part of any of these systems. "Our core methodology has largely been ignored." It was also sold as user driven, along the lines of what you noted: here's my data, here's my "business question", give me an answer.
I think Friedman was trying to provoke. He didn't think data mining had serious intellectual underpinnings where methodology was concerned, but he thought that this would change and that statisticians ought to play a part rather than ignore it.
My own impression is that this has more or less happened. The lines have been blurred. Statisticians now publish in data mining journals. Data miners these days seem to have some sort of statistical training. While data mining packages still don't hype generalized linear models, logistic regression is well known among the analysts, in addition to clustering and neural nets. Optimal experimental design may not be part of the data mining core, but the software can be coaxed to spit out p-values.
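One way to picture the difference in emphasis is to run both styles on the same toy data. The sketch below is only an illustration under assumed, made-up numbers (`stay_days`, `readmitted`). The first half states a hypothesis in advance and tests it with an exact permutation test. The second half searches the data for the best-performing rule, in the spirit of a one-split decision tree. Neither half is a method attributed to Friedman.

```python
from itertools import combinations

# Hypothetical records: length of hospital stay and whether the patient
# was later readmitted (1) or not (0).
stay_days = [2, 3, 3, 4, 5, 6, 7, 8, 9, 10]
readmitted = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# --- Statistical analysis: a question fixed in advance --------------------
# Hypothesis: readmitted patients had longer stays. With fixed group sizes,
# comparing group sums is equivalent to comparing group means.
n_pos = sum(readmitted)
observed = sum(d for d, r in zip(stay_days, readmitted) if r == 1)

# Exact one-sided permutation test: try every way of choosing which
# patients carry the "readmitted" label.
splits = list(combinations(range(len(stay_days)), n_pos))
extreme = sum(
    1 for idx in splits if sum(stay_days[i] for i in idx) >= observed
)
p_value = extreme / len(splits)
print(f"permutation p-value: {p_value:.4f}")


# --- Data mining: let a search over the data propose the rule -------------
def accuracy(threshold):
    """Share of patients classified correctly by 'stay >= threshold'."""
    hits = sum(
        (d >= threshold) == bool(r) for d, r in zip(stay_days, readmitted)
    )
    return hits / len(stay_days)


# Try every observed stay length as a cut-off; ties go to the lowest one.
best_threshold = max(sorted(set(stay_days)), key=accuracy)
print(f"best rule: stay >= {best_threshold} days "
      f"(accuracy {accuracy(best_threshold):.0%})")
```

The first half answers a question posed before looking at the data and reports how surprising the result would be under chance. The second half reports whichever rule fits best and says nothing about chance, which is why such rules are normally checked on held-out data.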
How are they similar?
Both are used to support decisions based on the analysis of data.

