American Institute of Mathematical Sciences

doi: 10.3934/jimo.2019029

Supervised distance preserving projection using alternating direction method of multipliers

 School of Mathematics, University of Dhaka, Bangladesh, School of Mathematics, University of Southampton, UK

Received  February 2018 Revised  November 2018 Published  May 2019

Fund Project: The research of the author was supported by the Commonwealth Scholarship Commission, UK BDCS-2012-44

Supervised Distance Preserving Projection (SDPP) is a dimension reduction method in supervised setting proposed recently by Zhu et. al in [43]. The method learns a linear mapping from the input space to the reduced feature space. While the method showed very promising result in regression task, for classification problems the performance is not satisfactory. The preservation of distance relation with neighborhood points forces data to project very close to one another in the projected space irrespective of their classes which ends up with low classification rate. To avoid the crowdedness of SDPP approach we have proposed a modification of SDPP which deals both regression and classification problems and significantly improves the performance of SDPP. We have incorporated the total variance of the projected co-variates to the SDPP problem which is maximized to preserve the global structure. This approach not only facilitates efficient regression like SDPP but also successfully classifies data into different classes. We have formulated the proposed optimization problem as a Semidefinite Least Square (SLS) SDPP problem. A two block Alternating Direction Method of Multipliers have been developed to learn the transformation matrix solving the SLS-SDPP which can easily handle out of sample data.

Citation: Sohana Jahan. Supervised distance preserving projection using alternating direction method of multipliers. Journal of Industrial & Management Optimization, doi: 10.3934/jimo.2019029
References:
 [1] E. Barshan, A. Ghodsi, Z. Azimifar and M. Z. Jahromi, Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds, Pattern Recognit, 44 (2010), 1357-1371. doi: 10.1016/j.patcog.2010.12.015. Google Scholar [2] J. Borwein and A. S. Lewis, Convex Analysis and Non Linear Optimization: Theory and Examples, Springer, New York, 2006. doi: 10.1007/978-0-387-31256-9. Google Scholar [3] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Machine Learning, 3 (2010), 1-122. Google Scholar [4] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004. doi: 10.1017/CBO9780511804441. Google Scholar [5] M. R. Brito, E. L. Chávez, A. J. Quiroz and J. E. Yukich, Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection, Stat. Probabil. Lett., 35 (1997), 33-42. doi: 10.1016/S0167-7152(96)00213-1. Google Scholar [6] I. Cheng Yeh, Modeling of strength of high performance concrete using artificial neural networks, Cement and Concrete Research, 28 (1998), 1797-1808. doi: 10.1016/S0008-8846(98)00165-3. Google Scholar [7] F. Coronaa, Z. Zhu, A. H. d. Souza Jr, M. Mulasd, E. Muruf, L. Sassuf, G. Barretob and R. Baratti, Supervised Distance Preserving Projections: Applications in the quantitative analysis of diesel fuels and light cycle oils from NIR spectra, Journal of Process Control, 30 (2015), 10-21. doi: 10.1016/j.jprocont.2014.11.005. Google Scholar [8] J. Eckstein and D. P. Bertsekas, On the douglas-rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming, 55 (1992), 293-318. doi: 10.1007/BF01581204. Google Scholar [9] J. Eckstein and M. Fukushima, Some reformulations and applications of the alternating direction method of multipliers, Large Scale Optimization: State of the Art, (1993), 115–134. Google Scholar [10] J. Eckstein and W. Yao, Understanding the convergence of Alternating Direction Method of Multipliers, Theoritical and Computational Perspectives, RUTCOR Research Report, 2014.Google Scholar [11] R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7 (1936), 179-188. doi: 10.1111/j.1469-1809.1936.tb02137.x. Google Scholar [12] M. Fortin and R. Glowinski, Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, North-Holland Publishing Co., Amsterdam, 1983. Google Scholar [13] M. Fortin and R. Glowinski, On Decomposition-Coordination Methods Using an Augmented Lagrangian, Augmented Lagrangian Methods: Applications to the Solution of Boundary-Value Problems, North-Holland: Amsterdam, 1983. Google Scholar [14] K. Fukumizu, F. R. Bach and M. Jordan, Kernel dimension reduction in regression, Annals of Statistics, 37 (2009), 1871-1905. doi: 10.1214/08-AOS637. Google Scholar [15] D. Gabay, Applications of the method of multipliers to variational inequalities, in Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, studies in Mathematics, 15 (1983), 299-331. Google Scholar [16] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinearvariational problems via Finite element approximation, Computers and Mathematics with Applications, 2 (1976), 17-40. Google Scholar [17] R. Glowinski, Lectures on Numerical Methods for Nonlinear Variational Problem, Tata Institute of Fundamental Research Lectures on Mathematics and Physics, Tata Institute of Fundamental Research, Bombay, Notes by M. G. Vijayasundaram and M. Adimurthi, 1980. Google Scholar [18] R. Glowinski and A. Marrocco, Sur l'approximation, par $\acute{e}l\acute{e}ments$ finis d'ordre un, et la $r\acute{e}$solution, par $p\acute{e}$nalisation-dualit$\acute{e}$, d'une classe de probl$\grave{e}$mes de dirichlet non lin$\acute{e}$ares, Revue Francaise d'Automatique, Informatique et Recherche Op$\acute{e}$rationelle, 9 (1975), 41–76. Google Scholar [19] R. Glowinski and P. L. Tallec, Augmented Lagrangian Methods for the Solution of Variational Problems, Studies in Applied and Numerical Mathematics, 1989. doi: 10.1137/1.9781611970838.ch3. Google Scholar [20] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Springer-Verlag, Berlin, 2011. doi: 10.1007/978-3-642-19721-5. Google Scholar [21] B. He, H. Yang and S. Wang, Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities, Journal of Optimization Theory and Applications, 106 (2000), 337–356. doi: 10.1023/A:1004603514434. Google Scholar [22] S. Jahan and H. D. Qi, Regularized Multidimensional Scaling with Radial Basis Functions, Journal of Industrial and Management Optimization, 12 (2016), 543-563. doi: 10.3934/jimo.2016.12.543. Google Scholar [23] K. Jiang, D. Sun and K.-C. Toh, Solving nuclear norm regularized and semidefinite matrix least square problems with linear equality constraints, Discrete Geometry and Optimization, 69 (2013), 133-162. doi: 10.1007/978-3-319-00200-2_9. Google Scholar [24] J. Lee and M. Verleysen, Nonlinear Dimensionality Reduction, Springer, New York, 2007. doi: 10.1007/978-0-387-39351-3. Google Scholar [25] K. Li, Sliced inverse regression for dimension reduction, J Am Stat Assoc, 86 (1991), 316-342. doi: 10.1080/01621459.1991.10475035. Google Scholar [26] X. Li, D. Sun and K.-C. Toh, A Schur complement based semi-proximal ADMM for convex quadratic conic programming and extensions,, Math. Program., 155 (2016), Ser. A, 333–373. doi: 10.1007/s10107-014-0850-5. Google Scholar [27] L. J. P. Maaten, E. Postma and H. V. D. Herik, Dimensionality Reduction: A Comparative Review, Technical Report TiCC-TR 2009–005, Tilburg University Technical, Tilburg, 2009.Google Scholar [28] S. Mika, G. Ratsch, J. Weston, B. Scholkopf and K. Mullers, Fisher discriminant analysis with kernels, Neural Networks for Signal Processing IX, Proceedings of the IEEE Signal Processing Society Workshop, IEEE, Piscataway, (2002), 41–48. doi: 10.1109/NNSP.1999.788121. Google Scholar [29] H.-D. Qi and D. Sun, An augmented Lagrangian dual approach for the H-weighted nearest correlation matrix problem, IMA Journal of Numerical Analysis, 31 (2011), 491-511. doi: 10.1093/imanum/drp031. Google Scholar [30] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826. doi: 10.1109/TSP.2013.2264814. Google Scholar [31] R. T. Rockafellar, Augmented lagrangians and applications of the proximal point algorithm in convex programming, Mathematics of Operations Research, 1 (1976), 97-116. doi: 10.1287/moor.1.2.97. Google Scholar [32] S. Roweis and L. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science, 290 (2000), 2323-2326. Google Scholar [33] B. Schölkopf, A. Smola and K. Müller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10 (1998), 1299-1319. Google Scholar [34] D. Sun, K.-C. Toh and L. Yang, A convergent 3-block semi-proximal alternating direction method of multipliers for conic programming with 4-type of constraints, SIAM Journal on Optimization, 25 (2015), 882-915. doi: 10.1137/140964357. Google Scholar [35] J. Tenenbaum, V. Silva and J. Langford, A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), 2319-2323. Google Scholar [36] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB approach, Elsevier Inc., 2010.Google Scholar [37] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009.Google Scholar [38] A. Tsanas, M. A. Little, P. E. McSharry and L. O. Ramig, Accurate telemonitoring of Parkinson.s disease progression by non-invasive speech tests, IEEE Transactions on Biomedical Engineering, 57 (2010), 884-893. doi: 10.1109/TBME.2009.2036000. Google Scholar [39] J. Venna and S. Kaski, Comparison of visualization methods for an atlas of gene expression data sets, Inf Vis, 6 (2007), 139-154. doi: 10.1057/palgrave.ivs.9500153. Google Scholar [40] H. Wold, Soft modeling by latent variables: The nonlinear iterative partial least squares approach, Perspectives in Probability and Statistics, Papers in Honour of MS Bartlett, 1975,117–142. doi: 10.1017/s0021900200047604. Google Scholar [41] H. Wold, Partial Least Squares, Encyclopedia of Statistical Sciences, 2009.Google Scholar [42] Y. Yeh Y, S. Huang and Y. Lee, Nonlinear dimension reduction with kernel sliced inverse regression, IEEE Trans Knowl Data Eng, 21 (2009), 1590-1603. Google Scholar [43] Z. Zhu, T. Simil$\ddot{a}$ and F. Corona, Supervised distance preserving projection, Neural Processing Letters, 38 (2013), 445-463. doi: 10.1007/s11063-013-9285-x. Google Scholar

show all references

References:
 [1] E. Barshan, A. Ghodsi, Z. Azimifar and M. Z. Jahromi, Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds, Pattern Recognit, 44 (2010), 1357-1371. doi: 10.1016/j.patcog.2010.12.015. Google Scholar [2] J. Borwein and A. S. Lewis, Convex Analysis and Non Linear Optimization: Theory and Examples, Springer, New York, 2006. doi: 10.1007/978-0-387-31256-9. Google Scholar [3] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Machine Learning, 3 (2010), 1-122. Google Scholar [4] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004. doi: 10.1017/CBO9780511804441. Google Scholar [5] M. R. Brito, E. L. Chávez, A. J. Quiroz and J. E. Yukich, Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection, Stat. Probabil. Lett., 35 (1997), 33-42. doi: 10.1016/S0167-7152(96)00213-1. Google Scholar [6] I. Cheng Yeh, Modeling of strength of high performance concrete using artificial neural networks, Cement and Concrete Research, 28 (1998), 1797-1808. doi: 10.1016/S0008-8846(98)00165-3. Google Scholar [7] F. Coronaa, Z. Zhu, A. H. d. Souza Jr, M. Mulasd, E. Muruf, L. Sassuf, G. Barretob and R. Baratti, Supervised Distance Preserving Projections: Applications in the quantitative analysis of diesel fuels and light cycle oils from NIR spectra, Journal of Process Control, 30 (2015), 10-21. doi: 10.1016/j.jprocont.2014.11.005. Google Scholar [8] J. Eckstein and D. P. Bertsekas, On the douglas-rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming, 55 (1992), 293-318. doi: 10.1007/BF01581204. Google Scholar [9] J. Eckstein and M. Fukushima, Some reformulations and applications of the alternating direction method of multipliers, Large Scale Optimization: State of the Art, (1993), 115–134. Google Scholar [10] J. Eckstein and W. Yao, Understanding the convergence of Alternating Direction Method of Multipliers, Theoritical and Computational Perspectives, RUTCOR Research Report, 2014.Google Scholar [11] R. A. Fisher, The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7 (1936), 179-188. doi: 10.1111/j.1469-1809.1936.tb02137.x. Google Scholar [12] M. Fortin and R. Glowinski, Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, North-Holland Publishing Co., Amsterdam, 1983. Google Scholar [13] M. Fortin and R. Glowinski, On Decomposition-Coordination Methods Using an Augmented Lagrangian, Augmented Lagrangian Methods: Applications to the Solution of Boundary-Value Problems, North-Holland: Amsterdam, 1983. Google Scholar [14] K. Fukumizu, F. R. Bach and M. Jordan, Kernel dimension reduction in regression, Annals of Statistics, 37 (2009), 1871-1905. doi: 10.1214/08-AOS637. Google Scholar [15] D. Gabay, Applications of the method of multipliers to variational inequalities, in Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, studies in Mathematics, 15 (1983), 299-331. Google Scholar [16] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinearvariational problems via Finite element approximation, Computers and Mathematics with Applications, 2 (1976), 17-40. Google Scholar [17] R. Glowinski, Lectures on Numerical Methods for Nonlinear Variational Problem, Tata Institute of Fundamental Research Lectures on Mathematics and Physics, Tata Institute of Fundamental Research, Bombay, Notes by M. G. Vijayasundaram and M. Adimurthi, 1980. Google Scholar [18] R. Glowinski and A. Marrocco, Sur l'approximation, par $\acute{e}l\acute{e}ments$ finis d'ordre un, et la $r\acute{e}$solution, par $p\acute{e}$nalisation-dualit$\acute{e}$, d'une classe de probl$\grave{e}$mes de dirichlet non lin$\acute{e}$ares, Revue Francaise d'Automatique, Informatique et Recherche Op$\acute{e}$rationelle, 9 (1975), 41–76. Google Scholar [19] R. Glowinski and P. L. Tallec, Augmented Lagrangian Methods for the Solution of Variational Problems, Studies in Applied and Numerical Mathematics, 1989. doi: 10.1137/1.9781611970838.ch3. Google Scholar [20] J. Han, M. Kamber and J. Pei, Data Mining: Concepts and Techniques, Springer-Verlag, Berlin, 2011. doi: 10.1007/978-3-642-19721-5. Google Scholar [21] B. He, H. Yang and S. Wang, Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities, Journal of Optimization Theory and Applications, 106 (2000), 337–356. doi: 10.1023/A:1004603514434. Google Scholar [22] S. Jahan and H. D. Qi, Regularized Multidimensional Scaling with Radial Basis Functions, Journal of Industrial and Management Optimization, 12 (2016), 543-563. doi: 10.3934/jimo.2016.12.543. Google Scholar [23] K. Jiang, D. Sun and K.-C. Toh, Solving nuclear norm regularized and semidefinite matrix least square problems with linear equality constraints, Discrete Geometry and Optimization, 69 (2013), 133-162. doi: 10.1007/978-3-319-00200-2_9. Google Scholar [24] J. Lee and M. Verleysen, Nonlinear Dimensionality Reduction, Springer, New York, 2007. doi: 10.1007/978-0-387-39351-3. Google Scholar [25] K. Li, Sliced inverse regression for dimension reduction, J Am Stat Assoc, 86 (1991), 316-342. doi: 10.1080/01621459.1991.10475035. Google Scholar [26] X. Li, D. Sun and K.-C. Toh, A Schur complement based semi-proximal ADMM for convex quadratic conic programming and extensions,, Math. Program., 155 (2016), Ser. A, 333–373. doi: 10.1007/s10107-014-0850-5. Google Scholar [27] L. J. P. Maaten, E. Postma and H. V. D. Herik, Dimensionality Reduction: A Comparative Review, Technical Report TiCC-TR 2009–005, Tilburg University Technical, Tilburg, 2009.Google Scholar [28] S. Mika, G. Ratsch, J. Weston, B. Scholkopf and K. Mullers, Fisher discriminant analysis with kernels, Neural Networks for Signal Processing IX, Proceedings of the IEEE Signal Processing Society Workshop, IEEE, Piscataway, (2002), 41–48. doi: 10.1109/NNSP.1999.788121. Google Scholar [29] H.-D. Qi and D. Sun, An augmented Lagrangian dual approach for the H-weighted nearest correlation matrix problem, IMA Journal of Numerical Analysis, 31 (2011), 491-511. doi: 10.1093/imanum/drp031. Google Scholar [30] H.-D. Qi, N. H. Xiu and X. M. Yuan, A Lagrangian dual approach to the single source localization problem, IEEE Transactions on Signal Processing, 61 (2013), 3815-3826. doi: 10.1109/TSP.2013.2264814. Google Scholar [31] R. T. Rockafellar, Augmented lagrangians and applications of the proximal point algorithm in convex programming, Mathematics of Operations Research, 1 (1976), 97-116. doi: 10.1287/moor.1.2.97. Google Scholar [32] S. Roweis and L. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science, 290 (2000), 2323-2326. Google Scholar [33] B. Schölkopf, A. Smola and K. Müller, Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10 (1998), 1299-1319. Google Scholar [34] D. Sun, K.-C. Toh and L. Yang, A convergent 3-block semi-proximal alternating direction method of multipliers for conic programming with 4-type of constraints, SIAM Journal on Optimization, 25 (2015), 882-915. doi: 10.1137/140964357. Google Scholar [35] J. Tenenbaum, V. Silva and J. Langford, A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), 2319-2323. Google Scholar [36] S. Theodoridis and K. Koutroumbas, An Introduction to Pattern Recognition, A MATLAB approach, Elsevier Inc., 2010.Google Scholar [37] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Elsevier Inc., 2009.Google Scholar [38] A. Tsanas, M. A. Little, P. E. McSharry and L. O. Ramig, Accurate telemonitoring of Parkinson.s disease progression by non-invasive speech tests, IEEE Transactions on Biomedical Engineering, 57 (2010), 884-893. doi: 10.1109/TBME.2009.2036000. Google Scholar [39] J. Venna and S. Kaski, Comparison of visualization methods for an atlas of gene expression data sets, Inf Vis, 6 (2007), 139-154. doi: 10.1057/palgrave.ivs.9500153. Google Scholar [40] H. Wold, Soft modeling by latent variables: The nonlinear iterative partial least squares approach, Perspectives in Probability and Statistics, Papers in Honour of MS Bartlett, 1975,117–142. doi: 10.1017/s0021900200047604. Google Scholar [41] H. Wold, Partial Least Squares, Encyclopedia of Statistical Sciences, 2009.Google Scholar [42] Y. Yeh Y, S. Huang and Y. Lee, Nonlinear dimension reduction with kernel sliced inverse regression, IEEE Trans Knowl Data Eng, 21 (2009), 1590-1603. Google Scholar [43] Z. Zhu, T. Simil$\ddot{a}$ and F. Corona, Supervised distance preserving projection, Neural Processing Letters, 38 (2013), 445-463. doi: 10.1007/s11063-013-9285-x. Google Scholar
(a) SDPP: Solid lines indicate connection between neighbors. (b-c) Preservation scheme of the local geometry by SDPP
Average Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) with error bars for prediction of test set of Parkinsons Telemonitoring Data Set obtained by SLS-SDPP, SDPP, PLS, SPCA and KDR. The bar diagram represents almost same performance for all the methods in terms of RMSE. In terms of MAE, SLS-SDPP outperforms all other methods
Continuity measure with respect to different $k$ and $k_r$ for (a) Parkinsons Telemonitoring Data: Figure suggests to choose the neighborhood size $k = 8$ since highest continuity measure is obtained at $k = 8$ (b) Concrete Compressive Strength Data: Highest continuity measure is obtained at $k = 10$ therefore $k = 10$ is chosen as the neighborhood size
Average RMSE and MAE for test data prediction of Concrete Compressive Strength Data set along different dimension obtained by SLS-SDPP, SDPP, PLS, SPCA and KDR. Best performance achieved by SLS-SDPP at D = 5. The small error bar implies the stability of SLS-SDPP method regardless of training data
Classification error rates for different projection dimension computed by algorithm SLS-SDPP, SDPP, SPCA, KDR and FDA. (a)CTG data (b) Seismic bump data (c)Diabetic Retinopathy data (d) Mushroom data. Figures illustrate that best performance is achieved by SLS-SDPP. The error rate for this method remained consistently lower then other methods
List of datasets used in this article and their sources :
 Dataset Dim Class no. of ins. Source Classification Seismic bump 19 2 2584 UCI Repository Cardiotocography 21 3 2126 UCI Repository Diabetic Retinopathy 19 2 1115 UCI Repository Mushroom 22 2 8124 UCI Repository Regression Parkinson's Telemonitoring 16 - 5875 UCI Repository Concrete Compressive Strength 8 - 1030 UCI Repository
 Dataset Dim Class no. of ins. Source Classification Seismic bump 19 2 2584 UCI Repository Cardiotocography 21 3 2126 UCI Repository Diabetic Retinopathy 19 2 1115 UCI Repository Mushroom 22 2 8124 UCI Repository Regression Parkinson's Telemonitoring 16 - 5875 UCI Repository Concrete Compressive Strength 8 - 1030 UCI Repository
Average RMSE and MAE for test set prediction of Parkinson Telemonitoring dataset
 Method RMSE (mean$\pm$std) MAE (mean$\pm$std) SLS-SDPP 10.6781$\pm$1.1481 8.3503$\pm$0.8525 SDPP 10.7934$\pm$1.2371 8.7459$\pm$0.8378 PLS 10.8133$\pm$1.2806 8.7822$\pm$0.8883 SPCA 10.8006$\pm$ 1.2449 8.7714 $\pm$0.8555 KDR 10.8478$\pm$ 1.3032 8.8008$\pm$0.9139
 Method RMSE (mean$\pm$std) MAE (mean$\pm$std) SLS-SDPP 10.6781$\pm$1.1481 8.3503$\pm$0.8525 SDPP 10.7934$\pm$1.2371 8.7459$\pm$0.8378 PLS 10.8133$\pm$1.2806 8.7822$\pm$0.8883 SPCA 10.8006$\pm$ 1.2449 8.7714 $\pm$0.8555 KDR 10.8478$\pm$ 1.3032 8.8008$\pm$0.9139
Average RMSE and MAE (mean$\pm$std) for the test set prediction on Concrete Compressive Strength Data Set
 Error Dim SLS-SDPP SDPP PLS SPCA KDR RMSE 1 10.4649$\pm$1.2072 10.5241$\pm$1.4220 12.7666$\pm$2.1539 12.8090$\pm$2.0429 13.7423$\pm$2.7046 2 10.4540$\pm$1.1356 10.4075$\pm$1.5903 11.8629$\pm$1.1837 12.8712$\pm$1.9074 11.6399$\pm$2.6641 3 10.4629$\pm$1.1648 10.4079$\pm$1.5915 10.9379$\pm$1.1096 12.8370$\pm$1.9316 11.5163$\pm$2.2178 4 10.4450$\pm$1.2848 10.5890$\pm$1.3303 10.5108$\pm$1.1943 12.8674$\pm$1.8353 11.0320$\pm$1.7599 5 10.2932$\pm$0.7866 10.7247$\pm$1.0038 10.4432$\pm$1.1883 12.9647$\pm$1.5969 10.2933$\pm$1.6597 6 10.9910$\pm$0.7200 10.4893$\pm$0.7228 10.4359$\pm$1.1480 10.4770$\pm$1.1142 10.4473$\pm$1.1485 7 10.4495$\pm$0.7052 10.5313$\pm$0.7228 10.4480$\pm$1.0746 10.4520$\pm$1.0604 10.4514$\pm$1.0952 8 10.4349$\pm$1.1134 10.4342$\pm$1.0034 10.4342$\pm$1.0034 10.4342$\pm$1.0034 16.3019$\pm$1.8078 MAE 1 8.3879$\pm$1.0882 8.3687$\pm$1.0994 10.3995$\pm$1.7948 10.4451$\pm$1.6868 10.8884$\pm$2.3565 2 8.2339$\pm$1.2642 8.2338$\pm$1.2918 9.3977$\pm$0.8294 10.4484$\pm$1.5463 9.2285$\pm$2.3954 3 8.2288$\pm$1.3385 8.2342$\pm$1.3380 8.3771$\pm$0.6452 10.4125$\pm$1.5656 8.9818$\pm$1.6882 4 8.5898$\pm$1.1262 8.3935$\pm$1.1461 8.2199$\pm$0.7943 10.4553$\pm$1.4677 8.6622$\pm$1.3565 5 8.0177$\pm$0.7413 8.5133$\pm$0.8507 8.1563$\pm$0.8538 10.5221$\pm$1.3204 8.0167$\pm$1.2904 6 8.3434$\pm$0.7236 8.2713$\pm$0.5728 8.1612$\pm$0.8389 8.1860$\pm$0.7925 8.1659$\pm$0.8225 7 8.2747$\pm$0.5815 8.2820$\pm$0.5787 8.1869$\pm$0.7809 8.1872$\pm$0.7735 8.1868$\pm$0.7903 8 8.1852$\pm$0.7413 8.1842$\pm$0.7477 8.1842$\pm$0.7477 8.1842$\pm$0.7477 13.1814$\pm$1.5359
 Error Dim SLS-SDPP SDPP PLS SPCA KDR RMSE 1 10.4649$\pm$1.2072 10.5241$\pm$1.4220 12.7666$\pm$2.1539 12.8090$\pm$2.0429 13.7423$\pm$2.7046 2 10.4540$\pm$1.1356 10.4075$\pm$1.5903 11.8629$\pm$1.1837 12.8712$\pm$1.9074 11.6399$\pm$2.6641 3 10.4629$\pm$1.1648 10.4079$\pm$1.5915 10.9379$\pm$1.1096 12.8370$\pm$1.9316 11.5163$\pm$2.2178 4 10.4450$\pm$1.2848 10.5890$\pm$1.3303 10.5108$\pm$1.1943 12.8674$\pm$1.8353 11.0320$\pm$1.7599 5 10.2932$\pm$0.7866 10.7247$\pm$1.0038 10.4432$\pm$1.1883 12.9647$\pm$1.5969 10.2933$\pm$1.6597 6 10.9910$\pm$0.7200 10.4893$\pm$0.7228 10.4359$\pm$1.1480 10.4770$\pm$1.1142 10.4473$\pm$1.1485 7 10.4495$\pm$0.7052 10.5313$\pm$0.7228 10.4480$\pm$1.0746 10.4520$\pm$1.0604 10.4514$\pm$1.0952 8 10.4349$\pm$1.1134 10.4342$\pm$1.0034 10.4342$\pm$1.0034 10.4342$\pm$1.0034 16.3019$\pm$1.8078 MAE 1 8.3879$\pm$1.0882 8.3687$\pm$1.0994 10.3995$\pm$1.7948 10.4451$\pm$1.6868 10.8884$\pm$2.3565 2 8.2339$\pm$1.2642 8.2338$\pm$1.2918 9.3977$\pm$0.8294 10.4484$\pm$1.5463 9.2285$\pm$2.3954 3 8.2288$\pm$1.3385 8.2342$\pm$1.3380 8.3771$\pm$0.6452 10.4125$\pm$1.5656 8.9818$\pm$1.6882 4 8.5898$\pm$1.1262 8.3935$\pm$1.1461 8.2199$\pm$0.7943 10.4553$\pm$1.4677 8.6622$\pm$1.3565 5 8.0177$\pm$0.7413 8.5133$\pm$0.8507 8.1563$\pm$0.8538 10.5221$\pm$1.3204 8.0167$\pm$1.2904 6 8.3434$\pm$0.7236 8.2713$\pm$0.5728 8.1612$\pm$0.8389 8.1860$\pm$0.7925 8.1659$\pm$0.8225 7 8.2747$\pm$0.5815 8.2820$\pm$0.5787 8.1869$\pm$0.7809 8.1872$\pm$0.7735 8.1868$\pm$0.7903 8 8.1852$\pm$0.7413 8.1842$\pm$0.7477 8.1842$\pm$0.7477 8.1842$\pm$0.7477 13.1814$\pm$1.5359
Average error rate of class prediction of test set for CTG data
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 2 0.2251 0.2865 0.2739 0.2593 0.208 3 0.1940 0.2827 0.2661 0.2992 0.208 4 0.1949 0.2943 0.3314 0.2427 0.208 5 0.1969 0.2661 0.3372 0.2437 0.208 6 0.1969 0.2749 0.2710 0.2115 0.208 7 0.1988 0.2827 0.2768 0.2193 0.208 8 0.1979 0.2768 0.2817 0.2300 0.208 9 0.1988 0.2700 0.2612 0.2315 0.208 10 0.1949 0.2690 0.2515 0.2412 0.208
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 2 0.2251 0.2865 0.2739 0.2593 0.208 3 0.1940 0.2827 0.2661 0.2992 0.208 4 0.1949 0.2943 0.3314 0.2427 0.208 5 0.1969 0.2661 0.3372 0.2437 0.208 6 0.1969 0.2749 0.2710 0.2115 0.208 7 0.1988 0.2827 0.2768 0.2193 0.208 8 0.1979 0.2768 0.2817 0.2300 0.208 9 0.1988 0.2700 0.2612 0.2315 0.208 10 0.1949 0.2690 0.2515 0.2412 0.208
Average classification error rate of test set for Seismic bump data
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 1 0.0328 0.0328 0.0328 0.0328 0.0328 2 0.0701 0.0707 0.1004 0.0688 0.0328 3 0.0701 0.0669 0.0694 0.0669 0.0328 4 0.0701 0.0720 0.0676 0.0720 0.032 5 0.0701 0.0720 0.0732 0.0726 0.0328 6 0.0701 0.0713 0.0789 0.0728 0.0328 7 0.0701 0.0713 0.0789 0.0727 0.0328 8 0.0701 0.0713 0.0795 0.0729 0.0328 9 0.0701 0.0713 0.0795 0.0730 0.0328 10 0.0701 0.0713 0.0795 0.0730 0.0328
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 1 0.0328 0.0328 0.0328 0.0328 0.0328 2 0.0701 0.0707 0.1004 0.0688 0.0328 3 0.0701 0.0669 0.0694 0.0669 0.0328 4 0.0701 0.0720 0.0676 0.0720 0.032 5 0.0701 0.0720 0.0732 0.0726 0.0328 6 0.0701 0.0713 0.0789 0.0728 0.0328 7 0.0701 0.0713 0.0789 0.0727 0.0328 8 0.0701 0.0713 0.0795 0.0729 0.0328 9 0.0701 0.0713 0.0795 0.0730 0.0328 10 0.0701 0.0713 0.0795 0.0730 0.0328
Average error rate of class prediction of test set for Diabetic Retinopathy data
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 1 0.6382 0.6382 0.6382 0.6382 0.6382 2 0.2678 0.3390 0.2963 0.2934 0.6382 3 0.2678 0.3105 0.3618 0.2963 0.6382 4 0.2678 0.3191 0.2877 0.3048 0.6382 5 0.2678 0.2934 0.2906 0.3134 0.6382 6 0.2678 0.2849 0.2877 0.3048 0.6382 7 0.2735 0.3191 0.2963 0.3048 0.6382 8 0.2707 0.3191 0.2934 0.3134 0.6382 9 0.2707 0.2906 0.2906 0.3134 0.6382 10 0.2678 0.3020 0.3077 0.3048 0.6382
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 1 0.6382 0.6382 0.6382 0.6382 0.6382 2 0.2678 0.3390 0.2963 0.2934 0.6382 3 0.2678 0.3105 0.3618 0.2963 0.6382 4 0.2678 0.3191 0.2877 0.3048 0.6382 5 0.2678 0.2934 0.2906 0.3134 0.6382 6 0.2678 0.2849 0.2877 0.3048 0.6382 7 0.2735 0.3191 0.2963 0.3048 0.6382 8 0.2707 0.3191 0.2934 0.3134 0.6382 9 0.2707 0.2906 0.2906 0.3134 0.6382 10 0.2678 0.3020 0.3077 0.3048 0.6382
Average error rate of class prediction of test set for Mushroom data
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 1 0.2486 0.2486 0.2486 0.2486 0.2486 2 0.3555 0.1318 0.2735 0.3037 0.2486 3 0.2891 0.1266 0.1836 0.2741 0.2486 4 0.1516 0.2076 0.2020 0.2018 0.2486 5 0.1648 0.2524 0.1911 0.2311 0.2486 6 0.1667 0.2524 0.3785 0.3815 0.2486 7 0.1723 0.2693 0.3653 0.3544 0.2486 8 0.1728 0.2255 0.3630 0.2587 0.2486 9 0.1186 0.2655 0.1756 0.1615 0.2486 10 0.1427 0.2665 0.3545 0.2812 0.2486
 Error Dim SLS-SDPP SDPP SPCA KDR FDA Error Rate 1 0.2486 0.2486 0.2486 0.2486 0.2486 2 0.3555 0.1318 0.2735 0.3037 0.2486 3 0.2891 0.1266 0.1836 0.2741 0.2486 4 0.1516 0.2076 0.2020 0.2018 0.2486 5 0.1648 0.2524 0.1911 0.2311 0.2486 6 0.1667 0.2524 0.3785 0.3815 0.2486 7 0.1723 0.2693 0.3653 0.3544 0.2486 8 0.1728 0.2255 0.3630 0.2587 0.2486 9 0.1186 0.2655 0.1756 0.1615 0.2486 10 0.1427 0.2665 0.3545 0.2812 0.2486
 [1] Russell E. Warren, Stanley J. Osher. Hyperspectral unmixing by the alternating direction method of multipliers. Inverse Problems & Imaging, 2015, 9 (3) : 917-933. doi: 10.3934/ipi.2015.9.917 [2] Bingsheng He, Xiaoming Yuan. Linearized alternating direction method of multipliers with Gaussian back substitution for separable convex programming. Numerical Algebra, Control & Optimization, 2013, 3 (2) : 247-260. doi: 10.3934/naco.2013.3.247 [3] Zhongming Wu, Xingju Cai, Deren Han. Linearized block-wise alternating direction method of multipliers for multiple-block convex programming. Journal of Industrial & Management Optimization, 2018, 14 (3) : 833-855. doi: 10.3934/jimo.2017078 [4] Yuan Shen, Lei Ji. Partial convolution for total variation deblurring and denoising by new linearized alternating direction method of multipliers with extension step. Journal of Industrial & Management Optimization, 2019, 15 (1) : 159-175. doi: 10.3934/jimo.2018037 [5] Ning Zhang, Qiang Wu. Online learning for supervised dimension reduction. Mathematical Foundations of Computing, 2019, 2 (2) : 95-106. doi: 10.3934/mfc.2019008 [6] Sobhan Seyfaddini. Unboundedness of the Lagrangian Hofer distance in the Euclidean ball. Electronic Research Announcements, 2014, 21: 1-7. doi: 10.3934/era.2014.21.1 [7] Yue Lu, Ying-En Ge, Li-Wei Zhang. An alternating direction method for solving a class of inverse semi-definite quadratic programming problems. Journal of Industrial & Management Optimization, 2016, 12 (1) : 317-336. doi: 10.3934/jimo.2016.12.317 [8] Peter Frolkovič, Karol Mikula, Jozef Urbán. Distance function and extension in normal direction for implicitly defined interfaces. Discrete & Continuous Dynamical Systems - S, 2015, 8 (5) : 871-880. doi: 10.3934/dcdss.2015.8.871 [9] Foxiang Liu, Lingling Xu, Yuehong Sun, Deren Han. A proximal alternating direction method for multi-block coupled convex optimization. Journal of Industrial & Management Optimization, 2019, 15 (2) : 723-737. doi: 10.3934/jimo.2018067 [10] Yi Xu, Wenyu Sun. A filter successive linear programming method for nonlinear semidefinite programming problems. Numerical Algebra, Control & Optimization, 2012, 2 (1) : 193-206. doi: 10.3934/naco.2012.2.193 [11] Rhudaina Z. Mohammad, Karel Švadlenka. Multiphase volume-preserving interface motions via localized signed distance vector scheme. Discrete & Continuous Dynamical Systems - S, 2015, 8 (5) : 969-988. doi: 10.3934/dcdss.2015.8.969 [12] José M. Arrieta, Esperanza Santamaría. Estimates on the distance of inertial manifolds. Discrete & Continuous Dynamical Systems - A, 2014, 34 (10) : 3921-3944. doi: 10.3934/dcds.2014.34.3921 [13] Liliana Trejo-Valencia, Edgardo Ugalde. Projective distance and $g$-measures. Discrete & Continuous Dynamical Systems - B, 2015, 20 (10) : 3565-3579. doi: 10.3934/dcdsb.2015.20.3565 [14] Hao Chen, Kaitai Li, Yuchuan Chu, Zhiqiang Chen, Yiren Yang. A dimension splitting and characteristic projection method for three-dimensional incompressible flow. Discrete & Continuous Dynamical Systems - B, 2019, 24 (1) : 127-147. doi: 10.3934/dcdsb.2018111 [15] Y. Goto, K. Ishii, T. Ogawa. Method of the distance function to the Bence-Merriman-Osher algorithm for motion by mean curvature. Communications on Pure & Applied Analysis, 2005, 4 (2) : 311-339. doi: 10.3934/cpaa.2005.4.311 [16] Qinghong Zhang, Gang Chen, Ting Zhang. Duality formulations in semidefinite programming. Journal of Industrial & Management Optimization, 2010, 6 (4) : 881-893. doi: 10.3934/jimo.2010.6.881 [17] Yunhai Xiao, Soon-Yi Wu, Bing-Sheng He. A proximal alternating direction method for $\ell_{2,1}$-norm least squares problem in multi-task feature learning. Journal of Industrial & Management Optimization, 2012, 8 (4) : 1057-1069. doi: 10.3934/jimo.2012.8.1057 [18] Yang Li, Yonghong Ren, Yun Wang, Jian Gu. Convergence analysis of a nonlinear Lagrangian method for nonconvex semidefinite programming with subproblem inexactly solved. Journal of Industrial & Management Optimization, 2015, 11 (1) : 65-81. doi: 10.3934/jimo.2015.11.65 [19] Yang Li, Liwei Zhang. A nonlinear Lagrangian method based on Log-Sigmoid function for nonconvex semidefinite programming. Journal of Industrial & Management Optimization, 2009, 5 (3) : 651-669. doi: 10.3934/jimo.2009.5.651 [20] Konstantinos Drakakis, Roderick Gow, Scott Rickard. Common distance vectors between Costas arrays. Advances in Mathematics of Communications, 2009, 3 (1) : 35-52. doi: 10.3934/amc.2009.3.35

2018 Impact Factor: 1.025