Please note: I want the answer to be unique and not handwritten.
Q1. Give two examples, apart from those given in the slides, for each of the following: a) Data mining from the commercial viewpoint b) Data mining from the scientific viewpoint
Q2. Differentiate between classification of data and clustering of data with the help of suitable examples.
Solution
q1.
a) Data mining from the commercial viewpoint
Lots of data is being collected and warehoused
– Web data, e-commerce
– purchases at department/grocery stores
– bank/credit card transactions
Computers have become cheaper and more powerful
Competitive pressure is strong
– provide better, customized services for an edge (e.g. in Customer Relationship Management)
b) Data mining from the scientific viewpoint
Data is collected and stored at enormous speeds (GB/hour)
– remote sensors on a satellite
– telescopes scanning the skies
– microarrays generating gene expression data
– scientific simulations generating terabytes of data
Traditional techniques are infeasible for raw data. Data mining may help scientists
– in classifying and segmenting data
– in hypothesis formation
q2.
Classification is the result of supervised learning, which means that there is a known label that you want the system to predict. For example, a fruit classifier that has been shown labeled examples of apples and oranges would look at a new fruit and say “this is an orange” or “this is an apple”.
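To make the supervised idea concrete, here is a minimal sketch of a fruit classifier. It uses a nearest-centroid rule, which is just one simple way to classify and not the only one. The feature names (skin_dimples, redness), the 0-10 scale, and the data values are all made up for illustration.

```python
from math import dist

# Hypothetical labeled training data: (skin_dimples, redness) on a 0-10 scale.
labeled_fruit = [
    ((1.0, 9.0), "apple"),
    ((2.0, 8.0), "apple"),
    ((9.0, 2.0), "orange"),
    ((8.0, 1.0), "orange"),
]

def train_centroids(examples):
    """Average the feature vectors of each label (the labels drive the training)."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(centroids, features):
    """Return the label whose centroid is closest to the new fruit."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = train_centroids(labeled_fruit)
print(classify(centroids, (2.0, 7.0)))  # prints "apple"
print(classify(centroids, (7.5, 2.5)))  # prints "orange"
```

The key point is that every training example carries a name, and the output for a new fruit is one of those known names.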
Clustering is the result of unsupervised learning, which means that you’ve seen lots of examples but don’t have labels. In this case, the clustering might return groups such as “fruits with soft skin and lots of dimples”, “fruits with shiny hard skin” and “elongated yellow fruits”, based simply on showing lots of fruit to the system without identifying the names of the different types of fruit.
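The unsupervised case can be sketched with a small k-means routine. This is an illustrative choice of clustering algorithm, not the only one; the features (skin_dimples, redness on a 0-10 scale) and the data are hypothetical, and the starting centers are simply the first k points so the run is repeatable.

```python
from math import dist

# Hypothetical unlabeled data: (skin_dimples, redness) on a 0-10 scale.
# No fruit names are given to the algorithm.
unlabeled_fruit = [(1.0, 9.0), (9.0, 2.0), (2.0, 8.0), (8.0, 1.0)]

def kmeans(points, k, iterations=10):
    """Plain k-means; returns the cluster index assigned to each point."""
    centers = [points[i] for i in range(k)]  # assumption: seed with first k points
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest center.
        assignment = [
            min(range(k), key=lambda c: dist(centers[c], p)) for p in points
        ]
        # Move each center to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return assignment

clusters = kmeans(unlabeled_fruit, k=2)
print(clusters)  # prints [0, 1, 0, 1]
```

The output is only group numbers. The algorithm finds that the first and third fruits belong together, but it is up to a person to look at each group and decide what to call it.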
