An Initialization Method for Clustering High-Dimensional Data
- School of Software - Conference Papers
In iterative refinement clustering algorithms, such as the various K-Means variants, the clustering results are very sensitive to the initial cluster centers. Conventional initialization methods tend to lose effectiveness on high-dimensional data because of the so-called "curse of dimensionality". In this paper, a local-density-based method is proposed to search for initial cluster centers in high-dimensional data. We define the probability density of a point as the number of its highly similar neighbors, weighted by a coefficient. Points that have high-density neighborhoods and low similarity to one another are chosen as the initial cluster centers. Experimental results on real-world datasets show the effectiveness of the proposed method.
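The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the similarity measure, the neighborhood threshold, the weight coefficient, or the selection rule, so everything below is an assumption. Cosine similarity, the `sim_threshold` parameter, the use of the similarity value itself as the weight, the greedy scoring rule, and the function name `density_based_init` are all hypothetical choices made only for illustration.

```python
import numpy as np


def density_based_init(X, k, sim_threshold=0.8):
    """Return the row indices of k initial centers chosen from X.

    Illustrative sketch only; every design choice here is an assumption.
    """
    # Cosine similarity between all pairs of points
    # (assumption: the abstract does not name a similarity measure).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)  # a point is not counted as its own neighbor

    # Density of a point: sum of similarities to its "highly similar"
    # neighbors, i.e. those at or above the threshold
    # (assumption: the similarity value acts as the weight coefficient).
    density = np.where(sim >= sim_threshold, sim, 0.0).sum(axis=1)

    # Greedy selection (assumption): start from the densest point, then
    # repeatedly add the point with high density and low similarity to
    # the centers chosen so far.
    centers = [int(np.argmax(density))]
    while len(centers) < k:
        max_sim = sim[:, centers].max(axis=1)
        score = density * (1.0 - max_sim)
        score[centers] = -np.inf  # never pick the same point twice
        centers.append(int(np.argmax(score)))
    return centers
```

The returned indices could then seed a K-Means run, for example by passing `X[centers]` as the initial centers. The sketch builds a full pairwise similarity matrix, so its memory use grows quadratically with the number of points.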