Big Data and Information Analytics (BDIA)

A soft subspace clustering algorithm with log-transformed distances
Pages: 93 - 109, Issue 1, January 2016

doi:10.3934/bdia.2016.1.93

Guojun Gan - Department of Mathematics, University of Connecticut, 196 Auditorium Rd, Storrs, CT 06269-3009, United States
Kun Chen - Department of Statistics, University of Connecticut, 215 Glenbrook Road, Storrs, CT 06269, United States

1 C. C. Aggarwal and C. K. Reddy (eds.), Data Clustering: Algorithms and Applications, CRC Press, Boca Raton, FL, USA, 2014.       
2 R. Agrawal, J. Gehrke, D. Gunopulos and P. Raghavan, Automatic subspace clustering of high dimensional data for data mining applications, in SIGMOD Record ACM Special Interest Group on Management of Data, ACM Press, New York, NY, USA, 27 (1998), 94-105.
3 S. Boutemedjet, D. Ziou and N. Bouguila, Model-based subspace clustering of non-gaussian data, Neurocomputing, 73 (2010), 1730-1739.
4 A. Broder, L. Garcia-Pueyo, V. Josifovski, S. Vassilvitskii and S. Venkatesan, Scalable k-means by ranked retrieval, in Proceedings of the 7th ACM International Conference on Web Search and Data Mining, WSDM '14, ACM, 2014, 233-242.
5 F. Cao, J. Liang and G. Jiang, An initialization method for the $k$-means algorithm using neighborhood model, Computers & Mathematics with Applications, 58 (2009), 474-483.       
6 M. E. Celebi, H. A. Kingravi and P. A. Vela, A comparative study of efficient initialization methods for the k-means clustering algorithm, Expert Systems with Applications, 40 (2013), 200-210.
7 X. Chen, Y. Ye, X. Xu and J. Z. Huang, A feature group weighting method for subspace clustering of high-dimensional data, Pattern Recognition, 45 (2012), 434-446.
8 M. de Souto, I. Costa, D. de Araujo, T. Ludermir and A. Schliep, Clustering cancer gene expression data: A comparative study, BMC Bioinformatics, 9 (2008), 497-510.
9 C. Domeniconi, D. Gunopulos, S. Ma, B. Yan, M. Al-Razgan and D. Papadopoulos, Locally adaptive metrics for clustering high dimensional data, Data Mining and Knowledge Discovery, 14 (2007), 63-97.       
10 B. Duran and P. Odell, Cluster Analysis - A survey, vol. 100 of Lecture Notes in Economics and Mathematical Systems, Springer-Verlag, Berlin, Heidelberg, New York, 1974.
11 E. Elhamifar and R. Vidal, Sparse subspace clustering: Algorithm, theory, and applications, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2013), 2765-2781.
12 G. Gan, Data Clustering in C++: An Object-Oriented Approach, Data Mining and Knowledge Discovery Series, Chapman & Hall/CRC Press, Boca Raton, FL, USA, 2011.
13 G. Gan and M. K.-P. Ng, Subspace clustering using affinity propagation, Pattern Recognition, 48 (2015), 1455-1464.
14 G. Gan and M. K.-P. Ng, Subspace clustering with automatic feature grouping, Pattern Recognition, 48 (2015), 3703-3713.
15 G. Gan and J. Wu, Subspace clustering for high dimensional categorical data, ACM SIGKDD Explorations Newsletter, 6 (2004), 87-94.
16 G. Gan and J. Wu, A convergence theorem for the fuzzy subspace clustering (FSC) algorithm, Pattern Recognition, 41 (2008), 1939-1947.
17 G. Gan, J. Wu and Z. Yang, A fuzzy subspace algorithm for clustering high dimensional data, in Lecture Notes in Artificial Intelligence (eds. X. Li, S. Wang and Z. Dong), vol. 4093, Springer-Verlag, 2006, 271-278.
18 J. A. Hartigan, Clustering Algorithms, Wiley, New York, NY, 1975.       
19 J. Huang, M. Ng, H. Rong and Z. Li, Automated variable weighting in $k$-means type clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27 (2005), 657-668.
20 A. K. Jain and R. C. Dubes, Algorithms for Clustering Data, Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1988.       
21 L. Jing, M. Ng and J. Huang, An entropy weighting $k$-means algorithm for subspace clustering of high-dimensional sparse data, IEEE Transactions on Knowledge and Data Engineering, 19 (2007), 1026-1041.
22 H.-P. Kriegel, P. Kröger and A. Zimek, Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering, ACM Transactions on Knowledge Discovery from Data, 3 (2009), 1-58.
23 J. MacQueen, Some methods for classification and analysis of multivariate observations, in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability (eds. L. LeCam and J. Neyman), vol. 1, University of California Press, Berkeley, CA, USA, 1967, 281-297.
24 J. Peña, J. Lozano and P. Larrañaga, An empirical comparison of four initialization methods for the $k$-means algorithm, Pattern Recognition Letters, 20 (1999), 1027-1040.
25 L. Peng and J. Zhang, An entropy weighting mixture model for subspace clustering of high-dimensional data, Pattern Recognition Letters, 32 (2011), 1154-1161.
26 R. Xu and D. Wunsch, Clustering, Wiley-IEEE Press, Hoboken, NJ, 2008.
