A simple example of the k-means algorithm: lecture notes and slides (.ppt)
A Simple Example of the k-means Algorithm

Algorithm Procedure

1. Randomly select K points from the complete sample set as the initial cluster centers. (This is what the K in K-means refers to.)
2. Assign each point in the dataset to the closest cluster, based on the Euclidean distance between the point and each cluster center.
3. Recompute each cluster's center as the average of the points in that cluster.
4. Repeat steps 2 and 3 until each new cluster center equals the previous one, or moves by less than a specified threshold. Clustering is then finished.

Example

[Figure: scatter plot of ten points labeled A to J]

How do we cluster the ten points A, B, ..., J into two clusters?

Example: Steps 1 and 2

Randomly choose A(1,4) and B(2,4) as the centers, with K=2.

Distance from each center (row) to each point (column). For example, the entry in row A, column B is the distance AB.

        A     B     C     D     E     F     G     H     I     J
A       0     1     1     1.41  2.24  3.61  4.47  5.39  4.24  5
B       1     0     1.41  1     2     2.83  3.61  4.47  3.61  4.24

So we assign A and C to one cluster, and B, D, E, F, G, H, I, and J to the other.

Example: Step 3

The new centers of the two clusters are (1, 4.5) and (3.75, 2.875).

Example: Step 2 again

Use (1, 4.5) and (3.75, 2.875) as the centers, with K=2.

Distance from each center (row) to each point (column):

               A     B     C     D     E     F     G     H     I     J
(1, 4.5)       0.5   1.12  0.5   1.12  1.8   3.91  4.72  5.59  4.61  5.32
(3.75, 2.875)  2.97  2.08  3.48  2.75  3.58  0.91  1.53  2.41  1.89  2.25

So we assign A, B, C, D, and E to one cluster, and F, G, H, I, and J to the other.

Example: Step 3 again

The new centers of the two clusters are P(1.6, 4.8) and Q(4.8, 1.6).

Example: Step 2 again

Use P and Q as the centers, with K=2.

Distance from each center (row) to each point (column):

               A     B     C     D     E     F     G     H     I     J
P(1.6, 4.8)    1     0.89  0.63  0.45  1.26  3.69  4.40  5.22  4.49  5.10
Q(4.8, 1.6)    4.49  3.69  5.10  4.40  5.22  0.89  0.45  1.26  1     0.63

So we again assign A, B, C, D, and E to one cluster, and F, G, H, I, and J to the other.

Example: Step 3 again

The new centers of the two clusters are equal to the previous centers, P(1.6, 4.8) and Q(4.8, 1.6), so the stopping condition is met.

Final

[Figure: the ten points A to J grouped into cluster 1 and cluster 2]

The two final clusters are {A, B, C, D, E} and {F, G, H, I, J}. Clustering finished!

Disadvantages

One of the main disadvantages of k-means is that you must specify the number of clusters (K) as an input to the algorithm. As designed, the algorithm cannot determine the appropriate number of clusters and depends on the user to identify it in advance.

[Figure: clustering results with K=2 and with K=3]

Thank you
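Appendix: a code sketch of the worked example

The procedure walked through in the slides can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. The slides give coordinates only for A(1,4) and B(2,4); the coordinates for C through J below are an assumption, inferred from the distance tables and the reported centers. The function names (assign, recompute, kmeans) and the tol and max_iter parameters are hypothetical and chosen for this sketch only.

```python
import math

# A and B are given on the slides. C through J are ASSUMED: they were
# inferred from the distance tables and are not stated in the original.
POINTS = {
    "A": (1, 4), "B": (2, 4), "C": (1, 5), "D": (2, 5), "E": (2, 6),
    "F": (4, 2), "G": (5, 2), "H": (6, 2), "I": (4, 1), "J": (5, 1),
}


def assign(points, centers):
    """Step 2: map each point name to the index of its nearest center."""
    return {
        name: min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for name, p in points.items()
    }


def recompute(points, labels, k):
    """Step 3: each new center is the mean of the points assigned to it."""
    centers = []
    for c in range(k):
        members = [points[n] for n, lab in labels.items() if lab == c]
        # Assumes no cluster becomes empty, which holds for this example.
        centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return centers


def kmeans(points, centers, tol=1e-9, max_iter=100):
    """Repeat steps 2 and 3 until no center moves by more than tol."""
    history = [list(centers)]  # centers after each pass, starting with the initial ones
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = recompute(points, labels, len(centers))
        history.append(new_centers)
        moved = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved <= tol:
            break
    return labels, centers, history


# Step 1 in the slides picks A and B as the initial centers (K=2).
labels, centers, history = kmeans(POINTS, [POINTS["A"], POINTS["B"]])
for step, cs in enumerate(history):
    print(step, [tuple(round(v, 3) for v in c) for c in cs])
print(sorted(name for name, lab in labels.items() if lab == 0))
```

With the assumed coordinates, the run passes through the same centers as the slides, first (1, 4.5) and (3.75, 2.875), then P(1.6, 4.8) and Q(4.8, 1.6), and stops once a further pass leaves the centers unchanged. Note that K is still fixed by the length of the initial center list, which is the limitation described on the Disadvantages slide.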