Potential-Function Proofs for Gradient Methods
by Nikhil Bansal and Anupam Gupta
Theory of Computing, Volume 15(4), pp. 1–32, 2019
Bibliography with links to cited articles
[1] Zeyuan Allen-Zhu and Lorenzo Orecchia: Linear coupling: an ultimate unification of gradient and mirror descent. In Proc. 8th Innovations in Theoret. Comp. Sci. Conf. (ITCS’17), pp. 3:1–3:22. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2017. [doi:10.4230/LIPIcs.ITCS.2017.3, arXiv:1407.1537]
[2] Amir Beck and Marc Teboulle: Mirror descent and nonlinear projected subgradient methods for convex optimization. Oper. Res. Lett., 31(3):167–175, 2003. [doi:10.1016/S0167-6377(02)00231-6]
[3] Aharon Ben-Tal and Arkadi Nemirovski: Lectures on Modern Convex Optimization. Society for Industrial and Applied Mathematics, 2001. [doi:10.1137/1.9780898718829]
[4] Dimitri P. Bertsekas, Angelia Nedić, and Asuman E. Ozdaglar: Convex Analysis and Optimization. Athena Scientific, 2003.
[5] Stephen Boyd and Lieven Vandenberghe: Convex Optimization. Cambridge University Press, 2004. [doi:10.1017/CBO9780511804441]
[6] Sébastien Bubeck: Convex optimization: algorithms and complexity. Foundations and Trends in Machine Learning, 8(3-4):231–357, 2015. [doi:10.1561/2200000050, arXiv:1405.4980]
[7] Nicolò Cesa-Bianchi and Gábor Lugosi: Prediction, Learning, and Games. Cambridge University Press, 2006.
[8] Jelena Diakonikolas and Lorenzo Orecchia: The approximate duality gap technique: a unified theory of first-order methods. SIAM J. Optim., 29(1):660–689, 2019. SIAM. [arXiv:1712.02485]
[9] Yoel Drori and Marc Teboulle: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program., 145(1-2):451–482, 2014. [doi:10.1007/s10107-013-0653-0, arXiv:1206.3209]
[10] John C. Duchi: Introductory Lectures on Stochastic Optimization, 2016. Available at author’s webpage.
[11] Marguerite Frank and Philip Wolfe: An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95–110, 1956. [doi:10.1002/nav.3800030109]
[12] Elad Hazan: Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157–325, 2016. [doi:10.1561/2400000013]
[13] Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal: Fundamentals of Convex Analysis. Springer, 2001. [doi:10.1007/978-3-642-56468-0]
[14] Martin Jaggi: Revisiting Frank–Wolfe: projection-free sparse convex optimization. In Proc. 30th Internat. Conf. on Machine Learning (ICML’13), pp. 427–435, 2013. [PMLR].
[15] Sahar Karimi and Stephen A. Vavasis: A single potential governing convergence of conjugate gradient, accelerated gradient and geometric descent, 2017. [arXiv:1712.09498]
[16] Donghwan Kim and Jeffrey A. Fessler: Optimized first-order methods for smooth convex minimization. Math. Program., 159(1-2):81–107, 2016. [doi:10.1007/s10107-015-0949-3, arXiv:1406.5468]
[17] Walid Krichene, Alexandre M. Bayen, and Peter L. Bartlett: Accelerated mirror descent in continuous and discrete time. In Adv. in Neural Inform. Processing Systems 28 (NIPS’15), pp. 2845–2853, 2015. NIPS.
[18] Arkadi Semënovich Nemirovski and David Berkovich Yudin: Problem Complexity and Method Efficiency in Optimization. John Wiley & Sons, Inc., 1983.
[19] Yurii Nesterov: A method of solving a convex programming problem with convergence rate O(1/k²). Soviet Mathematics Doklady, 27:372–376, 1983. LINK at Math-Net.ru.
[20] Yurii Nesterov: Introductory Lectures on Convex Optimization – A Basic Course. Kluwer Academic Publishers, 2004. [doi:10.1007/978-1-4419-8853-9]
[21] Javier Peña: Convergence of first-order methods via the convex conjugate. Oper. Res. Lett., 45(6):561–564, 2017. [doi:10.1016/j.orl.2017.08.013, arXiv:1707.09084]
[22] Shai Shalev-Shwartz: Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2):107–194, 2012. [doi:10.1561/2200000018]
[23] Naum Zuselevich Shor: Minimization Methods for Non-differentiable Functions. Springer, 1985. [doi:10.1007/978-3-642-82118-9]
[24] Weijie Su, Stephen Boyd, and Emmanuel J. Candès: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. (JMLR), 17(153):1–43, 2016.
[25] Eiji Takimoto and Manfred K. Warmuth: The minimax strategy for Gaussian density estimation. In Proc. 13th Ann. Conf. on Computational Learning Theory (COLT’00), pp. 100–106. Morgan Kaufmann Publ., 2000. ACM DL.
[26] Adrien Taylor and Francis Bach: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In Proc. 32nd Ann. Conf. on Computational Learning Theory (COLT’19), pp. 2934–2992. MLR Press, 2019. [PMLR]. [arXiv:1902.00947]
[27] Paul Tseng: On accelerated proximal gradient methods for convex-concave optimization, 2008. Available at author’s webpage.
[28] Nisheeth K. Vishnoi: A mini-course on convex optimization, 2018. Available at author’s webpage.
[29] Andre Wibisono, Ashia C. Wilson, and Michael I. Jordan: A variational perspective on accelerated methods in optimization. Proc. Natl. Acad. Sci. USA, 113(47):E7351–E7358, 2016. [doi:10.1073/pnas.1614734113, arXiv:1603.04245]
[30] Ashia C. Wilson, Benjamin Recht, and Michael I. Jordan: A Lyapunov analysis of momentum methods in optimization, 2016. [arXiv:1611.02635]