# jemdoc: menu{MENU}{index.html}, showsource
= OPTML\+\+
[http://suvrit.de/ Suvrit Sra] ([suvrit@mit.edu])
== Introduction
~~~
Welcome to OPTML\+\+! This is the homepage for the research seminar and reading group on *O*ptimization for *M*achine *L*earning (the \+\+ stands for related areas: *statistics*, *signal processing*, *computer vision*, *robotics*, *information theory*, *engineering*, among others).
OPTML\+\+ covers topics from convex, nonconvex, continuous and combinatorial optimization. This reading group is a mixture of a research seminar, a journal club, and a problem solving group.
This is our second semester. While the aim of the first semester was to help participants gain a substantial foothold in OPTML\+\+ or to enrich their existing expertise, this semester we will focus more on research papers, open questions, and the like, while covering important classical and contemporary material as needed.
~~~
== Organization
~~~
We resume our weekly meetings beginning *March 8th, 2016*.\n
Next meeting: *Apr 13, 2016*, 5:15PM; Location: 32-D707
The admin, announcements, and discussion related to this group are on [http://mailman.mit.edu/mailman/listinfo/optiml this mailing list]. Please subscribe.
~~~
== Schedule
~~~
{} {table}{lect}
Date | Topic | Links ||
08 Mar 2016 | /Suvrit/: Introduction, Plans, Metric Nearness | [optml2016_meet_1.pdf Intro]; [optml16_mn.pdf Metric Nearness] ||
13 Apr 2016 | /Suvrit/: Nonconvex variance-reduced gradient methods | [http://arxiv.org/abs/1603.06160 1]; [http://arxiv.org/abs/1603.06159 2] ||
~~~
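The Apr 13 topic concerns variance-reduced stochastic gradient methods. As a rough illustration of the general idea (not the specific algorithms or analyses of the linked papers), here is a minimal SVRG-style sketch; the function names, step size, and loop lengths are illustrative assumptions:

```python
import numpy as np

def svrg(grad_fn, w0, n, step=0.01, epochs=5, m=None, seed=0):
    """Minimal SVRG sketch.

    grad_fn(w, i) returns the gradient of the i-th component
    function at w; n is the number of components.
    """
    rng = np.random.default_rng(seed)
    w = w0.copy()
    m = m or n  # inner-loop length (a common default choice)
    for _ in range(epochs):
        # Take a snapshot and compute the full gradient there.
        w_snap = w.copy()
        full_grad = np.mean([grad_fn(w_snap, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased, and its variance
            # shrinks as w and w_snap approach the optimum.
            g = grad_fn(w, i) - grad_fn(w_snap, i) + full_grad
            w -= step * g
    return w
```

The correction term `grad_fn(w_snap, i) - full_grad` is what distinguishes this from plain SGD: near a solution the stochastic estimate's variance vanishes, which is what enables the faster convergence rates studied in the papers above.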
== Suggested Papers
. Rodomanov, Kropotov (ICML 2016). Superlinearly-Convergent Proximal Newton-type Method for the Optimization of Finite Sums
. De Sa, Re, Olukotun (ICML 2016). Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling
. F. Curtis (ICML 2016). A Self-Correcting Variable-Metric Algorithm for Stochastic Optimization
. Reddi et al (Feb 2016). [http://arxiv.org/abs/1603.06159 Fast Incremental Method for Nonconvex Optimization]
. Reddi et al (Feb 2016). [http://arxiv.org/abs/1603.06160 Stochastic Variance Reduction for Nonconvex Optimization]
. Reddi et al (Jun 2015). [http://arxiv.org/abs/1506.06840 Asynchronous variance reduced stochastic gradient]
. Mania et al (2015). [http://arxiv.org/abs/1507.06970 Perturbed Iterate Analysis for Asynchronous Stochastic Optimization]
. De Sa et al (Oct 2015). [http://arxiv.org/abs/1506.06438 Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms]
. Wibisono et al (Mar 2016). [http://arxiv.org/abs/1603.04245v1 A Variational Perspective on Accelerated Methods in Optimization]
. Montanari (Mar 2016). [http://arxiv.org/pdf/1603.04064v1.pdf A Grothendieck-type inequality for local maxima]
. Agarwal, Bullins, Hazan (Mar 2016). [http://arxiv.org/abs/1602.03943 Second order stochastic optimization in linear time]
. Luo, Agarwal, Cesa-Bianchi, Langford (Feb 2016). [http://arxiv.org/abs/1602.02202 Efficient Second Order Online Learning via Sketching]
. H. Zhang, S. Sra (Feb 2016). First-order algorithms for geodesically convex optimization
. Y. Zhang, Lee, Jordan. (Oct 2015) [http://arxiv.org/abs/1510.03528v1 ℓ1-regularized Neural Networks are Improperly Learnable in Polynomial Time]
. Allen-Zhu, Orecchia. [http://arxiv.org/abs/1407.1537 Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent]
. Su, Boyd, Candes. [http://arxiv.org/abs/1503.01243 A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights]
. Taylor, Hendrickx, Glineur (2015). Exact Worst-case Performance of First-order Algorithms for Composite Convex Optimization
. Kim, Fessler (2014-16). [http://arxiv.org/abs/1406.5468 Optimized first-order methods for smooth convex minimization]
== Potential Topics
~~~
{}{raw}
- Basic convex analysis and optimization
  - Basics of convex sets and functions
  - Key convex functions
  - Constructing and recognizing convex functions
  - Some convex geometric problems

- Scalable convex optimization
  - First-order methods
  - Parallel
  - Distributed
  - Online learning -- maybe Sasha Rakhlin can help!
  - Sublinear methods
  - Incremental gradient methods
  - Stochastic gradient methods
  - Large-scale semidefinite programming

- Scalable nonconvex optimization
  - Online optimization
  - Dictionary learning models and algorithms
  - Deep learning algorithms
  - Nonconvex SGD, incremental gradient, and other algorithms

- Non-Euclidean optimization
  - Hadamard manifolds
  - Riemannian manifolds
  - Metric spaces of nonpositive curvature
  - Wasserstein-metric-based problems (probability, geometry)
  - Explicit examples, such as large-scale Gaussian mixtures

- Interplay of convex and combinatorial optimization
  - Submodular problems -- potentially led by Stefanie Jegelka
  - Max-flow, graph algorithms -- potentially led by Aleksander Mądry
  - Convex relaxations

- Applications
  - Large-scale regression problems
  - Optimization in deep learning
  - Dictionary learning
  - Applications from machine learning, statistics
  - Applications from finance, signal processing, control, information theory, ...

- Software, design and implementation
  - Data science platforms: AWS, Azure, Spark, MLBase, Parameter Server, Dato
  - Lessons in implementing asynchronous optimization algorithms
  - Multicore CPU considerations
  - GPU and GPGPU considerations
  - Delay-sensitive algorithms and design
  - Other useful software and hardware ideas

- Sublinear algorithms, randomized methods
  - Randomized linear algebra
  - Property testing
  - Hashing, SimHash, dimensionality reduction, etc.
  - Approximation algorithms

- Exploiting symmetry in optimization
  - Optimizing over permutations
  - Schur-convexity, majorization
  - Optimizing over groups
  - Scattering transforms, applications to deep learning

- Large-scale sampling, MCMC
  - Parallel Gibbs sampling procedures
  - Large-scale multivariate Gaussians
  - MCMC via the optimization lens
  - Scalable determinantal sampling

- Open problems in optimization, ML, statistics, information theory
  - Entropy inequalities
  - Convexity conjectures
  - Global optimization problems

~~~