References

We provide here various references regarding the solvers implemented in Cyanure.

Accelerators

Cyanure uses two types of accelerators. The QNing approach builds upon Quasi-Newton principles and was introduced in

QNING
  1. H. Lin, J. Mairal and Z. Harchaoui. An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration. SIAM Journal on Optimization. 29(2), pages 1408–1443, 2019.

Catalyst uses Nesterov’s acceleration, and was introduced in

CATALYST
  1. H. Lin, J. Mairal and Z. Harchaoui. Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice. Journal of Machine Learning Research (JMLR). 18(212), pages 1–54, 2018.
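To recall the mechanism Catalyst builds upon, here is a minimal sketch of Nesterov's accelerated gradient method on a toy one-dimensional quadratic. The function and all names (`nesterov`, `grad`) are purely illustrative and not part of Cyanure's API.

```python
# Illustrative sketch: Nesterov's accelerated gradient method on the
# quadratic f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# The extrapolation step y = x + beta * (x - x_prev) is the ingredient
# that Catalyst generalizes to generic first-order inner solvers.

def grad(x):
    return 2.0 * (x - 3.0)

def nesterov(x0, step=0.25, iters=1000):
    x_prev, x = x0, x0
    for k in range(1, iters + 1):
        beta = (k - 1) / (k + 2)           # standard momentum schedule
        y = x + beta * (x - x_prev)        # extrapolation (look-ahead) point
        x_prev, x = x, y - step * grad(y)  # gradient step from the look-ahead
    return x

print(nesterov(0.0))  # converges to the minimizer x* = 3
```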

Variance-reduced stochastic optimization algorithms

The miso algorithm was introduced in

MISO
  1. J. Mairal. Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning. SIAM Journal on Optimization. volume 25, number 2, pages 829–855, 2015.

It may be seen as a primal variant of the stochastic dual coordinate ascent method [SDCA].

SDCA
  1. S. Shalev-Shwartz and T. Zhang. Stochastic dual coordinate ascent methods for regularized loss minimization. Journal of Machine Learning Research (JMLR), 14, 567–599. 2013.

The svrg algorithm was introduced in

SVRG
  1. R. Johnson and T. Zhang. Accelerating stochastic gradient descent using predictive variance reduction. In Advances in Neural Information Processing Systems (NIPS). 2013.

but the variant Cyanure uses (and its accelerated variant acc-svrg) were introduced in

ACC_SVRG
  1. A. Kulunchakov and J. Mairal. Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise. preprint arXiv:1901.08788. 2019.
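To make the variance-reduction idea concrete, here is a minimal sketch of the basic SVRG update on a toy one-dimensional least-squares problem. The data, names (`svrg`, `grad_i`), and step size are illustrative assumptions, not Cyanure's implementation.

```python
# Illustrative sketch of SVRG on f(w) = (1/n) * sum_i (a_i * w - b_i)^2.
# At each snapshot, a full gradient is computed once; inner stochastic
# steps then use the variance-reduced estimate g_i(w) - g_i(w_anchor) + mu.
import random

data = [(1.0, 2.0), (2.0, 4.2), (3.0, 5.8), (4.0, 8.1)]  # (a_i, b_i) pairs

def grad_i(w, i):
    a, b = data[i]
    return 2.0 * a * (a * w - b)

def full_grad(w):
    return sum(grad_i(w, i) for i in range(len(data))) / len(data)

def svrg(w0, step=0.02, epochs=50, inner=20, seed=0):
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        w_anchor = w                # snapshot point
        mu = full_grad(w_anchor)    # full gradient at the snapshot
        for _ in range(inner):
            i = rng.randrange(len(data))
            # variance-reduced stochastic gradient estimate (unbiased,
            # with variance vanishing as w and w_anchor approach w*)
            g = grad_i(w, i) - grad_i(w_anchor, i) + mu
            w -= step * g
    return w
```

For this least-squares problem the minimizer has the closed form w* = sum(a_i * b_i) / sum(a_i**2), which makes the sketch easy to sanity-check.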

Sadly, Cyanure does not yet implement saga, which nevertheless deserves a mention here

SAGA
  1. A. Defazio, F. Bach and S. Lacoste-Julien. SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In Advances in Neural Information Processing Systems (NIPS). 2014.

Batch algorithms

Cyanure also implements ISTA and FISTA with line-search, as described in

FISTA
  1. A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. 2009.
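To illustrate what a single ISTA iteration looks like (without the line-search that Cyanure adds on top), here is a toy sketch on a one-dimensional l1-regularized problem. All names (`ista`, `soft_threshold`) and the objective are illustrative assumptions.

```python
# Illustrative sketch of ISTA: a gradient step on the smooth part followed
# by the proximal operator of the l1 penalty (soft-thresholding).
# The toy objective 0.5 * (x - 4)^2 + lam * |x| has the closed-form
# minimizer x* = 4 - lam for 0 < lam < 4, which lets us check the iterates.

def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def ista(x0, lam=1.0, step=0.5, iters=100):
    x = x0
    for _ in range(iters):
        g = x - 4.0                                   # gradient of 0.5 * (x - 4)^2
        x = soft_threshold(x - step * g, step * lam)  # proximal (shrinkage) step
    return x
```

With lam = 1.0 the iterates converge to x* = 3; FISTA adds a Nesterov-style extrapolation between such proximal steps.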

It is perhaps worth noting that qning-ista seems to perform consistently better than fista in all our experiments (see the benchmark section).

Other frameworks

Even though Cyanure does not depend on it, our goal is to make it easy to use within scikit-learn.

SKLEARN
  1. F. Pedregosa, G. Varoquaux, A. Gramfort and others. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research (JMLR), 12(Oct), 2825–2830. 2011.

Our comparisons also include the following solvers

LIBLINEAR
  1. R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang and C.-J. Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research (JMLR), 9(Aug), 1871–1874. 2008.

LBFGS
  1. J. Nocedal. Updating Quasi-Newton Matrices with Limited Storage. Mathematics of Computation, 35(151), 773–782. 1980.