# American Institute of Mathematical Sciences

June  2014, 7(3): 503-523. doi: 10.3934/dcdss.2014.7.503

## Learning and estimation applications of an online homotopy algorithm for a generalization of the LASSO

 1 Electrical Engineering and Computer Science, UC Berkeley, United States, United States 2 Mechanical Engineering, UC Berkeley, United States, United States 3 Electrical Engineering and Computer Science, Civil and Environmental Engineering, UC Berkeley, United States

Received  May 2013 Revised  August 2013 Published  January 2013

The LASSO is a widely used shrinkage and selection method for linear regression. We propose a generalization of the LASSO in which the $l_1$ penalty is applied on a linear transformation of the regression parameters, allowing to input prior information on the structure of the problem and to improve interpretability of the results. We also study time varying system with an $l_1$-penalty on the variations of the state, leading to estimates that exhibit few jumps''. We propose a homotopy algorithm that updates the solution as additional measurements are available. The algorithm takes advantage of the sparsity of the solution for computational efficiency and is promising for mining large datasets. The algorithm is implemented on three experimental data sets representing applications to traffic estimation from sparsely sampled probe vehicles, flow estimation in tidal channels and text analysis of on-line news.
Citation: Aude Hofleitner, Tarek Rabbani, Mohammad Rafiee, Laurent El Ghaoui, Alex Bayen. Learning and estimation applications of an online homotopy algorithm for a generalization of the LASSO. Discrete & Continuous Dynamical Systems - S, 2014, 7 (3) : 503-523. doi: 10.3934/dcdss.2014.7.503
##### References:
 [1] J. Anderson and M. Mierzwa, An introduction to the Delta Simulation Model II (DSM2) for simulation of hydrodynamics and water quality of the Sacramento-San Joaquin Delta,, in Office of State Water Project Planning, (2002). Google Scholar [2] F. Bach, R. Jenatton, J. Mairal and G. Obozinski, Convex optimization with sparsity-inducing norms,, in Optimization for Machine Learning, (2011), 19. Google Scholar [3] R. Baraniuk, Compressive sensing,, IEEE Signal. Proc. Mag., 24 (2007). doi: 10.1109/CISS.2008.4558479. Google Scholar [4] A. Bayen, J. Butler and A. Patire, et al., Mobile Millennium,, Final Report, (2011), 2011. Google Scholar [5] D. Bertsekas, Incremental gradient, subgradient, and proximal methods for convex optimization: A survey,, Optimization for Machine Learning, (2011). Google Scholar [6] E. Candès, Compressive sampling,, Congress of Mathematicians, 3 (2006), 1433. Google Scholar [7] V. Chow, Open-channel Hydraulics,, McGraw-Hill Book Company, (1988). Google Scholar [8] J. Cunge, F. Holly and A. Verwey, Practical Aspects of Computational River Hydraulics,, Pitman, (1980). Google Scholar [9] I. Daubechies, M. Defrise and C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,, Communications on Pure and Applied Mathematics, 57 (2004), 1413. doi: 10.1002/cpa.20042. Google Scholar [10] Y. Dodge, Statistical Data Analysis Based on the $l_1$-Norm and Related Methods,, Birkhäuser Verlag, (2002). doi: 10.1007/978-3-0348-8201-9. Google Scholar [11] I. Drori and D. Donoho, Solution of $l_1$ minimization problems by LARS/homotopy methods,, in International Conference on Acoustics, (2006). doi: 10.1109/ICASSP.2006.1660734. Google Scholar [12] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani, Least angle regression,, Ann. Statist., 32 (2004), 407. doi: 10.1214/009053604000000067. Google Scholar [13] M. Figueiredo and R. Nowak, A bound optimization approach to wavelet-based image deconvolution,, in International Conference on Image Processing, (2005). doi: 10.1109/ICIP.2005.1530172. Google Scholar [14] M. Figueiredo, R. Nowak and S. Wright, Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,, IEEE Journal of selected topics in signal processing, 1 (2008), 586. doi: 10.1109/JSTSP.2007.910281. Google Scholar [15] J. Friedman, T. Hastie, H. Höfling and R. Tibshirani, Pathwise coordinate optimization,, Ann. Appl. Stat., 1 (2007), 302. doi: 10.1214/07-AOAS131. Google Scholar [16] J. Fuchs, On sparse representations in arbitrary redundant bases,, IEEE Transactions on Information Theory, 50 (2004), 1341. doi: 10.1109/TIT.2004.828141. Google Scholar [17] P. Garrigues and L. El Ghaoui, An homotopy algorithm for the Lasso with online observations,, in Neural Information Processing Systems, 21 (2008). Google Scholar [18] B. Gawalt, J. Jia, L. Miratrix, L. El Ghaoui, B. Yu and S. Clavier, Discovering word associations in news media via feature selection and sparse classification,, in Proceedings of the International Conference on Multimedia Information Retrieval, (2010), 211. doi: 10.1145/1743384.1743421. Google Scholar [19] G. Golub and C. Van Loan, Matrix Computations,, Third edition, (1996). Google Scholar [20] I. Guyon and A. Elisseeff, An introduction to variable and feature selection,, The Journal of Machine Learning Research, 3 (2003), 1157. Google Scholar [21] A. Hofleitner, R. Herring, P. Abbeel and A. Bayen, Learning the dynamics of arterial traffic from probe data using a Dynamic Bayesian Network,, IEEE Transactions on Intelligent Transportation Systems, 13 (2012), 1679. doi: 10.1109/TITS.2012.2200474. Google Scholar [22] A. Hofleitner, T. Rabbani, L. El Ghaoui and A. Bayen, Online homotopy algorithm for a generalization of the LASSO,, IEEE Transactions on Automatic Control, 58 (2013), 3175. doi: 10.1109/TAC.2013.2259373. Google Scholar [23] T. Hunter, T. Moldovan, M. Zaharia, S. Merzgui, J. Ma, M. J. Franklin, P. Abbeel and A. M. Bayen, Scaling the Mobile Millennium system in the cloud,, in Proceedings of the 2nd ACM Symposium on Cloud Computing, (2011), 1. doi: 10.1145/2038916.2038944. Google Scholar [24] S. Kim, K. Koh, S. Boyd and D. Gorinevsky, $l_1$ trend filtering,, SIAM Rev., 51 (2009), 339. doi: 10.1137/070690274. Google Scholar [25] S. Kim, K. Koh, M. Lustig, S. Boyd and D. Gorinevsky, An Interior-Point Method for Large-Scale $l_1$-Regularized Least Squares,, IEEE Journal of Selected Topics in Signal Processing, 1 (2008), 606. Google Scholar [26] K. Knight and W. Fu, Asymptotics for lasso-type estimators,, Annals of Statistics, 28 (2000), 1356. doi: 10.1214/aos/1015957397. Google Scholar [27] H. Lee, A. Battle, R. Raina and A. Ng, Efficient sparse coding algorithms,, Advances in Neural Information Processing Systems, 19 (2007). Google Scholar [28] X. Litrico and V. Fromion, Boundary control of linearized Saint-Venant equations oscillating modes,, Automatica, 42 (2006), 967. doi: 10.1016/j.automatica.2006.02.002. Google Scholar [29] X. Litrico and V. Fromion, Modeling and Control of Hydrosystems,, Springer, (2009). doi: 10.1007/978-1-84882-624-3. Google Scholar [30] X. Litrico and V. Fromion, Frequency modeling of open-channel flow,, ASCE Journal of Journal of Hydraulic Engineering, 130 (2004), 806. doi: 10.1061/(ASCE)0733-9429(2004)130:8(806). Google Scholar [31] I. Loris, On the performance of algorithms for the minimization of $l_1$-penalized functionals,, Inverse Problems, 25 (2009). doi: 10.1088/0266-5611/25/3/035008. Google Scholar [32] J. Mairal and B. Yu, Complexity analysis of the Lasso regularization path,, in International Conference on Machine Learning, (2012). Google Scholar [33] A. Ng, Feature selection, $l_1$ vs. $l_2$ regularization, and rotational invariance,, in 21st International Conference on Machine learning, (2004). Google Scholar [34] M. Osborne, B. Presnell and B. Turlach, A new approach to variable selection in least squares problems,, IMA journal of numerical analysis, 20 (2000), 389. doi: 10.1093/imanum/20.3.389. Google Scholar [35] S. Rosset and J. Zhu, Piecewise linear regularized solution paths,, The Annals of Statistics, 35 (2007), 1012. doi: 10.1214/009053606000001370. Google Scholar [36] M. Salman Asif and J. Romberg, Dynamic Updating for $l_1$ regularization,, IEEE Journal of Selected Topics in Signal Processing, 4 (2010), 421. Google Scholar [37] R. Tibshirani, Regression shrinkage and selection via the Lasso,, Journal of the Royal Statistical Society. Series B (Methodological), 58 (1996), 267. Google Scholar [38] H. Zou and T. Hastie, Regularization and variable selection via the elastic net,, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67 (2005), 301. doi: 10.1111/j.1467-9868.2005.00503.x. Google Scholar

show all references

##### References:
 [1] J. Anderson and M. Mierzwa, An introduction to the Delta Simulation Model II (DSM2) for simulation of hydrodynamics and water quality of the Sacramento-San Joaquin Delta,, in Office of State Water Project Planning, (2002). Google Scholar [2] F. Bach, R. Jenatton, J. Mairal and G. Obozinski, Convex optimization with sparsity-inducing norms,, in Optimization for Machine Learning, (2011), 19. Google Scholar [3] R. Baraniuk, Compressive sensing,, IEEE Signal. Proc. Mag., 24 (2007). doi: 10.1109/CISS.2008.4558479. Google Scholar [4] A. Bayen, J. Butler and A. Patire, et al., Mobile Millennium,, Final Report, (2011), 2011. Google Scholar [5] D. Bertsekas, Incremental gradient, subgradient, and proximal methods for convex optimization: A survey,, Optimization for Machine Learning, (2011). Google Scholar [6] E. Candès, Compressive sampling,, Congress of Mathematicians, 3 (2006), 1433. Google Scholar [7] V. Chow, Open-channel Hydraulics,, McGraw-Hill Book Company, (1988). Google Scholar [8] J. Cunge, F. Holly and A. Verwey, Practical Aspects of Computational River Hydraulics,, Pitman, (1980). Google Scholar [9] I. Daubechies, M. Defrise and C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,, Communications on Pure and Applied Mathematics, 57 (2004), 1413. doi: 10.1002/cpa.20042. Google Scholar [10] Y. Dodge, Statistical Data Analysis Based on the $l_1$-Norm and Related Methods,, Birkhäuser Verlag, (2002). doi: 10.1007/978-3-0348-8201-9. Google Scholar [11] I. Drori and D. Donoho, Solution of $l_1$ minimization problems by LARS/homotopy methods,, in International Conference on Acoustics, (2006). doi: 10.1109/ICASSP.2006.1660734. Google Scholar [12] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani, Least angle regression,, Ann. Statist., 32 (2004), 407. doi: 10.1214/009053604000000067. Google Scholar [13] M. Figueiredo and R. Nowak, A bound optimization approach to wavelet-based image deconvolution,, in International Conference on Image Processing, (2005). doi: 10.1109/ICIP.2005.1530172. Google Scholar [14] M. Figueiredo, R. Nowak and S. Wright, Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems,, IEEE Journal of selected topics in signal processing, 1 (2008), 586. doi: 10.1109/JSTSP.2007.910281. Google Scholar [15] J. Friedman, T. Hastie, H. Höfling and R. Tibshirani, Pathwise coordinate optimization,, Ann. Appl. Stat., 1 (2007), 302. doi: 10.1214/07-AOAS131. Google Scholar [16] J. Fuchs, On sparse representations in arbitrary redundant bases,, IEEE Transactions on Information Theory, 50 (2004), 1341. doi: 10.1109/TIT.2004.828141. Google Scholar [17] P. Garrigues and L. El Ghaoui, An homotopy algorithm for the Lasso with online observations,, in Neural Information Processing Systems, 21 (2008). Google Scholar [18] B. Gawalt, J. Jia, L. Miratrix, L. El Ghaoui, B. Yu and S. Clavier, Discovering word associations in news media via feature selection and sparse classification,, in Proceedings of the International Conference on Multimedia Information Retrieval, (2010), 211. doi: 10.1145/1743384.1743421. Google Scholar [19] G. Golub and C. Van Loan, Matrix Computations,, Third edition, (1996). Google Scholar [20] I. Guyon and A. Elisseeff, An introduction to variable and feature selection,, The Journal of Machine Learning Research, 3 (2003), 1157. Google Scholar [21] A. Hofleitner, R. Herring, P. Abbeel and A. Bayen, Learning the dynamics of arterial traffic from probe data using a Dynamic Bayesian Network,, IEEE Transactions on Intelligent Transportation Systems, 13 (2012), 1679. doi: 10.1109/TITS.2012.2200474. Google Scholar [22] A. Hofleitner, T. Rabbani, L. El Ghaoui and A. Bayen, Online homotopy algorithm for a generalization of the LASSO,, IEEE Transactions on Automatic Control, 58 (2013), 3175. doi: 10.1109/TAC.2013.2259373. Google Scholar [23] T. Hunter, T. Moldovan, M. Zaharia, S. Merzgui, J. Ma, M. J. Franklin, P. Abbeel and A. M. Bayen, Scaling the Mobile Millennium system in the cloud,, in Proceedings of the 2nd ACM Symposium on Cloud Computing, (2011), 1. doi: 10.1145/2038916.2038944. Google Scholar [24] S. Kim, K. Koh, S. Boyd and D. Gorinevsky, $l_1$ trend filtering,, SIAM Rev., 51 (2009), 339. doi: 10.1137/070690274. Google Scholar [25] S. Kim, K. Koh, M. Lustig, S. Boyd and D. Gorinevsky, An Interior-Point Method for Large-Scale $l_1$-Regularized Least Squares,, IEEE Journal of Selected Topics in Signal Processing, 1 (2008), 606. Google Scholar [26] K. Knight and W. Fu, Asymptotics for lasso-type estimators,, Annals of Statistics, 28 (2000), 1356. doi: 10.1214/aos/1015957397. Google Scholar [27] H. Lee, A. Battle, R. Raina and A. Ng, Efficient sparse coding algorithms,, Advances in Neural Information Processing Systems, 19 (2007). Google Scholar [28] X. Litrico and V. Fromion, Boundary control of linearized Saint-Venant equations oscillating modes,, Automatica, 42 (2006), 967. doi: 10.1016/j.automatica.2006.02.002. Google Scholar [29] X. Litrico and V. Fromion, Modeling and Control of Hydrosystems,, Springer, (2009). doi: 10.1007/978-1-84882-624-3. Google Scholar [30] X. Litrico and V. Fromion, Frequency modeling of open-channel flow,, ASCE Journal of Journal of Hydraulic Engineering, 130 (2004), 806. doi: 10.1061/(ASCE)0733-9429(2004)130:8(806). Google Scholar [31] I. Loris, On the performance of algorithms for the minimization of $l_1$-penalized functionals,, Inverse Problems, 25 (2009). doi: 10.1088/0266-5611/25/3/035008. Google Scholar [32] J. Mairal and B. Yu, Complexity analysis of the Lasso regularization path,, in International Conference on Machine Learning, (2012). Google Scholar [33] A. Ng, Feature selection, $l_1$ vs. $l_2$ regularization, and rotational invariance,, in 21st International Conference on Machine learning, (2004). Google Scholar [34] M. Osborne, B. Presnell and B. Turlach, A new approach to variable selection in least squares problems,, IMA journal of numerical analysis, 20 (2000), 389. doi: 10.1093/imanum/20.3.389. Google Scholar [35] S. Rosset and J. Zhu, Piecewise linear regularized solution paths,, The Annals of Statistics, 35 (2007), 1012. doi: 10.1214/009053606000001370. Google Scholar [36] M. Salman Asif and J. Romberg, Dynamic Updating for $l_1$ regularization,, IEEE Journal of Selected Topics in Signal Processing, 4 (2010), 421. Google Scholar [37] R. Tibshirani, Regression shrinkage and selection via the Lasso,, Journal of the Royal Statistical Society. Series B (Methodological), 58 (1996), 267. Google Scholar [38] H. Zou and T. Hastie, Regularization and variable selection via the elastic net,, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67 (2005), 301. doi: 10.1111/j.1467-9868.2005.00503.x. Google Scholar
 [1] Adil Bagirov, Sona Taheri, Soodabeh Asadi. A difference of convex optimization algorithm for piecewise linear regression. Journal of Industrial & Management Optimization, 2019, 15 (2) : 909-932. doi: 10.3934/jimo.2018077 [2] Yong Wang, Wanquan Liu, Guanglu Zhou. An efficient algorithm for non-convex sparse optimization. Journal of Industrial & Management Optimization, 2019, 15 (4) : 2009-2021. doi: 10.3934/jimo.2018134 [3] Lipu Zhang, Yinghong Xu, Zhengjing Jin. An efficient algorithm for convex quadratic semi-definite optimization. Numerical Algebra, Control & Optimization, 2012, 2 (1) : 129-144. doi: 10.3934/naco.2012.2.129 [4] Yigui Ou, Xin Zhou. A modified scaled memoryless BFGS preconditioned conjugate gradient algorithm for nonsmooth convex optimization. Journal of Industrial & Management Optimization, 2018, 14 (2) : 785-801. doi: 10.3934/jimo.2017075 [5] Yanqin Bai, Lipu Zhang. A full-Newton step interior-point algorithm for symmetric cone convex quadratic optimization. Journal of Industrial & Management Optimization, 2011, 7 (4) : 891-906. doi: 10.3934/jimo.2011.7.891 [6] Jiping Tao, Ronghuan Huang, Tundong Liu. A $2.28$-competitive algorithm for online scheduling on identical machines. Journal of Industrial & Management Optimization, 2015, 11 (1) : 185-198. doi: 10.3934/jimo.2015.11.185 [7] Joon Kwon, Panayotis Mertikopoulos. A continuous-time approach to online optimization. Journal of Dynamics & Games, 2017, 4 (2) : 125-148. doi: 10.3934/jdg.2017008 [8] Murat Adivar, Shu-Cherng Fang. Convex optimization on mixed domains. Journal of Industrial & Management Optimization, 2012, 8 (1) : 189-227. doi: 10.3934/jimo.2012.8.189 [9] Ruiqi Yang, Dachuan Xu, Yicheng Xu, Dongmei Zhang. An adaptive probabilistic algorithm for online k-center clustering. Journal of Industrial & Management Optimization, 2019, 15 (2) : 565-576. doi: 10.3934/jimo.2018057 [10] Zhenhuan Yang, Yiming Ying, Qilong Min. Online optimization for residential PV-ESS energy system scheduling. Mathematical Foundations of Computing, 2019, 2 (1) : 55-71. doi: 10.3934/mfc.2019005 [11] Anulekha Dhara, Aparna Mehra. Conjugate duality for generalized convex optimization problems. Journal of Industrial & Management Optimization, 2007, 3 (3) : 415-427. doi: 10.3934/jimo.2007.3.415 [12] Igor Griva, Roman A. Polyak. Proximal point nonlinear rescaling method for convex optimization. Numerical Algebra, Control & Optimization, 2011, 1 (2) : 283-299. doi: 10.3934/naco.2011.1.283 [13] Nobuko Sagara, Masao Fukushima. trust region method for nonsmooth convex optimization. Journal of Industrial & Management Optimization, 2005, 1 (2) : 171-180. doi: 10.3934/jimo.2005.1.171 [14] Zhongliang Deng, Enwen Hu. Error minimization with global optimization for difference of convex functions. Discrete & Continuous Dynamical Systems - S, 2019, 12 (4&5) : 1027-1033. doi: 10.3934/dcdss.2019070 [15] Jiping Tao, Zhijun Chao, Yugeng Xi. A semi-online algorithm and its competitive analysis for a single machine scheduling problem with bounded processing times. Journal of Industrial & Management Optimization, 2010, 6 (2) : 269-282. doi: 10.3934/jimo.2010.6.269 [16] Ran Ma, Jiping Tao. An improved 2.11-competitive algorithm for online scheduling on parallel machines to minimize total weighted completion time. Journal of Industrial & Management Optimization, 2018, 14 (2) : 497-510. doi: 10.3934/jimo.2017057 [17] Xiangyu Gao, Yong Sun. A new heuristic algorithm for laser antimissile strategy optimization. Journal of Industrial & Management Optimization, 2012, 8 (2) : 457-468. doi: 10.3934/jimo.2012.8.457 [18] Jianjun Liu, Min Zeng, Yifan Ge, Changzhi Wu, Xiangyu Wang. Improved Cuckoo Search algorithm for numerical function optimization. Journal of Industrial & Management Optimization, 2017, 13 (5) : 1-13. doi: 10.3934/jimo.2018142 [19] Sanming Liu, Zhijie Wang, Chongyang Liu. Proximal iterative Gaussian smoothing algorithm for a class of nonsmooth convex minimization problems. Numerical Algebra, Control & Optimization, 2015, 5 (1) : 79-89. doi: 10.3934/naco.2015.5.79 [20] Yazheng Dang, Fanwen Meng, Jie Sun. Convergence analysis of a parallel projection algorithm for solving convex feasibility problems. Numerical Algebra, Control & Optimization, 2016, 6 (4) : 505-519. doi: 10.3934/naco.2016023

2018 Impact Factor: 0.545