Data Mining
Complete the table.
If you don't know the answer, please don't waste my paid question.
| Algorithm | shapes of clusters that can be determined | input parameters that must be specified | limitations |
| BIRCH | | | |
| DBSCAN | | | |
| CHAMELEON | | | |
| k-means | | | |
| k-medoids | | | |
| CLARA | | | |
| Algorithm | shapes of clusters that can be determined | input parameters that must be specified | limitations |
| BIRCH | Best suited to spherical clusters | Branching factor B and threshold T of the CF tree (the input data are N d-dimensional numeric points) | A CF tree node can hold only a limited number of entries, so its clusters do not always correspond to what a user would consider natural; sensitive to the order of the data records; handles only numeric data; cannot deal with non-spherical clusters of varying size, because it uses the notion of diameter to control the boundary of a cluster |
| DBSCAN | Clusters of arbitrary shape | Eps, the maximum distance for a point to be considered density-reachable, and MinPts, the minimum number of points required to form a dense region | Quadratic time in the worst case; fails to identify clusters when the density varies or the data set is too sparse; has difficulties in high-dimensional spaces |
| CHAMELEON | Arbitrary-shaped clusters of varying density | k for the k-nearest-neighbor graph, plus the minimum sub-cluster size used when partitioning (the input data are N d-dimensional categorical points) | Quadratic time in the worst case |
| k-means | Spherical (convex) clusters | The number of clusters, k | Sensitive to noise and outliers, and does not identify outliers itself; cannot handle non-globular clusters or clusters of different sizes and densities; the result depends on the initial centroids and may be only a local optimum |
| k-medoids | Spherical (convex) clusters | The number of clusters, k (tolerates noise and outliers better than k-means) | Not scalable: processing is more costly than k-means, so it suits small data sets and cannot handle large ones |
| CLARA | Spherical clusters | The number of clusters, k | Sensitive to the selection of the initial samples; the sample is fixed at each stage; may not find the best clustering, because medoids are chosen only from the sample |
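
To make the DBSCAN row concrete, here is a minimal pure-Python sketch of the algorithm. It is only an illustration, not a reference implementation: the names `region_query`, `dbscan` and `ring`, and the toy data (two concentric rings plus one far-away point), are made up for this example. The brute-force neighbor search is what makes this version quadratic in the number of points.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may still become a border point later
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point too
                queue.extend(j_neighbors)
    return labels


def ring(radius, n):
    """n evenly spaced points on a circle of the given radius (toy data)."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]


# Two concentric rings (non-convex clusters) and one isolated point.
points = ring(1.0, 20) + ring(3.0, 60) + [(10.0, 10.0)]
labels = dbscan(points, eps=0.5, min_pts=3)
print("inner ring:", set(labels[:20]))
print("outer ring:", set(labels[20:80]))
print("isolated point:", labels[80])
```

Each ring comes out as its own cluster and the isolated point is labelled as noise. A centroid-based method such as k-means could not separate these rings, because both rings share the same center.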