I am a computer scientist who works in machine learning and optimization. I work on several theoretical, algorithmic, and applied questions in machine learning and data science. I am interested in all aspects of optimization for ML, especially scalable convex and nonconvex optimization. I am fascinated by geometric optimization, a growing topic with lots of cool math. Beyond OPT & ML, I have a strong interest in matrix theory, differential geometry, metric geometry, probability theory, algebraic combinatorics, fixed-point theory, and several other areas in math.
Currently I am looking at applications in finance, healthcare, and smart cities & infrastructure. I am also interested in energy, materials science, education, and other areas where data-driven thinking holds great promise.
[arXiv] [Google Scholar] [CV]
Jan 24 | Paper:
Combinatorial Topic Models using Small–Variance Asymptotics (with Ke Jiang, Brian Kulis) Artificial Intelligence and Statistics (AISTATS) 2017 |
Jan '17 | Visiting:
Foundations of Machine Learning as a long-term participant (Simons Institute, UC Berkeley, CA) |
Dec 10 | Workshop: OPT2016: Optimization for Machine Learning (the 9th OPTML workshop at NIPS; co-organized with Francis Bach, Niao He, and Sashank Reddi) |
Dec 09 | Talk: At the Nonconvex Optimization for Machine Learning: Theory and Practice Workshop (a NIPS 2016 workshop; speaking on Taming nonconvexity via geometry) |
Dec 05 |
Tutorial: Neural Information Processing Systems (NIPS) Conference, Barcelona, Spain (Large-Scale Optimization: Beyond Stochastic Gradient Descent and Convexity, with Francis Bach) [slides part 1], [slides part 2] |
Nov 04 | Talk: Data Science Seminar, Northeastern University, Boston (on Geometric Optimization) |
Sep 13 | Talk: LIDS Seminar, MIT (on Geometric Optimization) |
Fall '16 |
Teaching:
6.867 Machine Learning (EECS graduate course) (with Leslie Kaelbling, Tomas Lozano-Perez) |
Aug 12 |
Paper:
Markov Chain Sampling in Discrete Probabilistic Models with Constraints (with Chengtao Li, Stefanie Jegelka) Advances in Neural Information Processing Systems (NIPS) 2016 |
Aug 12 |
Paper:
Kronecker Determinantal Point Processes (with Z. Mariet) Advances in Neural Information Processing Systems (NIPS) 2016 |
Aug 12 |
Paper:
Fast stochastic optimization on Riemannian manifolds (with H. Zhang, S. Reddi) Advances in Neural Information Processing Systems (NIPS) 2016 |
Aug 12 |
Paper:
Fast Stochastic Methods for Nonsmooth Nonconvex Optimization (with Sashank Reddi, Barnabas Poczos, Alex Smola) Advances in Neural Information Processing Systems (NIPS) 2016 |
Aug 08 |
Paper:
Stochastic Frank-Wolfe Methods for Nonconvex Optimization (with Sashank Reddi, Barnabas Poczos, Alex Smola) 54th Allerton Conference on Communication, Control, and Computing, 2016. |
Aug 08 | OPT 2016: Optimization for Machine Learning accepted to NIPS. The 9th NIPS Workshop on Optimization for Machine Learning; stay tuned for updates. (co-organized with Francis Bach, Sashank Reddi, Niao He) |
Aug 02 |
Preprint:
Markov Chain Sampling in Discrete Probabilistic Models with Constraints (with Chengtao Li, Stefanie Jegelka) |
Aug 02 |
Paper:
Riemannian Dictionary Learning and Sparse Coding for Positive Definite Matrices (with Anoop Cherian) to appear in IEEE TNNLS |
Jul 24 |
Paper:
Fast incremental method for smooth nonconvex optimization (with Sashank Reddi, Barnabas Poczos, Alex Smola). to appear in IEEE Conference on Decision and Control (CDC) 2016. |
Jul 14 |
Preprint:
Stochastic Frank-Wolfe Methods for Nonconvex Optimization (with Sashank Reddi, Barnabas Poczos, Alex Smola) |
Jul 13 |
Preprint:
Fast sampling for Strongly Rayleigh Measures with Application to Determinantal Point Processes (with Chengtao Li, Stefanie Jegelka) |
Jul 11 | At SIAM Annual Meeting 2016, Boston. Organizing Advances in large-scale optimization. |
Jun 23 |
Speaking at the Nonconvex Optimization Workshop (on Fast Nonconvex Stochastic Optimization) |
Jun 20 | At ICML 2016, NYC. See the papers below! |
May 20 |
Preprint:
Kronecker Determinantal Point Processes (with Z. Mariet) |
May 20 |
Preprint:
Fast stochastic optimization on Riemannian manifolds (with H. Zhang, S. Reddi) |
May 20 |
Preprint:
Fast Stochastic Methods for Nonsmooth Nonconvex Optimization (with Sashank Reddi, Barnabas Poczos, Alex Smola) |
May 05 |
Paper:
On the Matrix Square Root via Geometric Optimization. Accepted to Electronic Journal of Linear Algebra (ELA) |
Apr 26 |
Paper:
First-order methods for geodesically convex optimization (with Hongyi Zhang). Conference on Learning Theory (COLT 2016) |
Apr 24 |
Paper:
Geometric Mean Metric Learning (with Pourya H. Zadeh, Reshad Hosseini) International Conference on Machine Learning (ICML 2016) |
Apr 24 |
Paper:
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms (with Yu-Xiang Wang, Veeranjaneyulu Sadhanala, Wei Dai, Willie Neiswanger, Eric Xing) International Conference on Machine Learning (ICML 2016) |
Apr 24 |
Paper:
Stochastic variance reduction for nonconvex optimization (with Sashank Reddi, Ahmed Hefny, Barnabas Poczos, Alexander Smola). International Conference on Machine Learning (ICML 2016) |
Apr 24 |
Paper:
Gaussian quadrature for matrix inverse forms with applications (with Chengtao Li, Stefanie Jegelka). International Conference on Machine Learning (ICML 2016) |
Apr 24 |
Paper:
Fast DPP sampling for Nyström with application to Kernel Methods (with Chengtao Li, Stefanie Jegelka). International Conference on Machine Learning (ICML 2016) |
Apr 21 |
Paper:
Entropic Metric Alignment for Correspondence Problems (with Justin Solomon, Gabriel Peyré, Vladimir Kim). Accepted to ACM SIGGRAPH 2016 |
Apr 07 |
Preprint:
Combinatorial Topic Models using Small–Variance Asymptotics (with Ke Jiang, Brian Kulis) |
Mar 18 |
Preprint:
Fast DPP sampling for Nyström with application to Kernel Methods (with Chengtao Li, Stefanie Jegelka) |
Mar 13 |
Preprint:
Fast incremental method for smooth nonconvex optimization (with Sashank Reddi, Barnabas Poczos, Alex Smola) [arXiv] |
Feb 21 |
Preprint:
First-order methods for geodesically convex optimization (with Hongyi Zhang) |
Feb 14 |
Paper:
Inference and mixture modeling with the Elliptical Gamma distribution. (with Reshad Hosseini, Lucas Theis, Matthias Bethge). Accepted to Computational Statistics and Data Analysis (CSDA) |
Feb 05 |
Preprint:
Stochastic variance reduction for nonconvex optimization (with Sashank Reddi, Ahmed Hefny, Barnabas Poczos, Alexander Smola). [arXiv] |
Feb 02 |
Paper:
Diversity Networks (with Zelda Mariet). International Conference on Learning Representations (ICLR 2016). |
2016 | Service: PC for DIFF-CVML'16 at CVPR 2016 |
2016 |
Teaching:
6.036 Introduction to Machine Learning (with Tommi Jaakkola, Regina Barzilay) |
Jan 21 |
Lecturing:
Aspects of Convex, Nonconvex, and Geometric Optimization. At Hausdorff Institute for Mathematics, Bonn, Germany. |
Jan 18 |
Paper:
Sum-of-squared logarithms inequality. (with Lev Borisov, Patrizio Neff, and Christian Thiel). Accepted to Linear Algebra and its Applications (LAA) |
2016 | Service: Area Chair for NIPS 2016, ICML 2016 |
2016 | Service: Program Committee for KDD 2016 |
Jan 12 | Book Chapter: Directional Statistics in Machine Learning: a brief review (submitted) |
Jan 04 | Visiting: Hausdorff Institute for Mathematics and participating in Math of Signal Processing |
2015 | Served as Area Chair for AISTATS 2016 |
Dec 22 | Paper: AdaDelay: Delay Adaptive Distributed Stochastic Optimization (with Adams Wei Yu, Mu Li, Alexander Smola) Accepted to Artificial Intelligence and Statistics 2016 (AISTATS'16) |
Dec 22 | Paper: Efficient Sampling for K-Determinantal Point Processes (with Chengtao Li, Stefanie Jegelka) Accepted to Artificial Intelligence and Statistics 2016 (AISTATS'16) |
Dec 22 | Book Chapter: Geometric Optimization in Machine Learning (with Reshad Hosseini) |
Dec 17 | Preprint: (update) Riemannian dictionary learning and sparse coding (with A. Cherian) |
Dec 17 | Preprint: (update) Inference and mixture modeling with the Elliptical Gamma distribution (with Reshad Hosseini, Lucas Theis, Matthias Bethge) |
Dec 16 | Preprint: (update) Matrix square roots via geometric optimization |
Dec 15 | Book Chapter: Positive Definite Matrices: Data Representation and Applications to Computer Vision (with Anoop Cherian) |
Dec 06 | Preprint: Bounds on bilinear inverse forms via Gaussian quadrature with applications (with Chengtao Li, Stefanie Jegelka) |
2015 | Served on Program Committee for SIGMOD 2016 |
2015 | Served on Program Committee for KDD 2015 |
2015 | Served as Area Chair for ICML 2015 |
Nov 17 | Preprint: Diversity Networks (with Zelda Mariet) |
Sep 18 | Preprint: Inequalities via elementary symmetric polynomial monotonicity |
Sep 07 | Preprint: Efficient structured low rank minimization (with Adams Wei Yu, Wanli Ma, Yaoliang Yu) |
Sep 07 | Preprint: Efficient Sampling for K-Determinantal Point Processes (with Chengtao Li, Stefanie Jegelka) |
Sep 07 | Paper: on Positive definite matrices and the S-Divergence, to appear in Proceedings of the American Mathematical Society (PAMS) |
Sept. | OPTML++: Running the OPTML++ research seminar plus reading group |
Sep 04 | Paper: on Manifold optimization for mixture models accepted to NIPS 2015 (with R. Hosseini) |
Sep 04 | Paper: Asynchronous variance reduced stochastic gradient accepted to NIPS 2015 (with Sashank Reddi, Ahmed Hefny, Barnabas Poczos, Alexander Smola) |
Aug 20 | Preprint: on Delay sensitive distributed optimization (with Adams Wei Yu, Mu Li, Alexander Smola) |
Aug 17 | Preprint: Sum-of-squared logarithms inequality (with Lev Borisov, Patrizio Neff, and Christian Thiel) |
Aug 14 | Announcement! OPT2015: Optimization for Machine Learning, NIPS, Montreal is happening! |
Jul 29 | Preprint: Matrix square roots via geometric optimization |
Jul 13 | Talk: Conic geometric optimisation at ISMP, 2015 |
Jul 10 | Preprint: Riemannian dictionary learning and sparse coding (with Anoop Cherian) |
Jun 24 | Preprint: Manifold optimization for mixture models (with Reshad Hosseini) |
Jun 24 | Paper: A proof of Thompson's determinantal inequality (with Minghua Lin) |
Jun 23 | Preprint: Variance reduction in stochastic gradient and asynchronous algorithms (with S. Reddi, A. Hefny, B. Poczos, A. Smola) |
Jun 16 | Lecturing at the MSR Summer School on Machine Learning, Bangalore. Lecture slides are now available. |
May 17 | Paper: Inequalities for normalized Schur functions accepted to European Journal of Combinatorics (my first combinatorics paper!) |
May 12 | Paper: Operator Hlawka-like inequalities on positive definite tensors (with Wolfgang Berndt) to Linear Algebra and its Applications (LAA) |
May 12 | Paper: Efficient randomized coordinate descent algorithms for non separable constrained optimization (with Sashank Reddi, Ahmed Hefny, C. Downey, A. Dubey); Uncertainty in Artificial Intelligence (UAI 2015) |
Apr 26 | Paper: Efficiently learning determinantal point processes (with Z. Mariet). Accepted to ICML'15; written during my first two weeks at MIT! |
Apr 04 | Preprint: (new version) Explicit diagonalization of an anti-triangular Cesàro matrix |
Mar 13 | Talk: Speaking about Schur functions at the MIT Combinatorics Seminar!! |
Mar 01 | Preprint: Proof of a conjecture in combinatorics: On inequalities for normalized Schur functions |
Jan 22 | Paper: Conic geometric optimisation on the manifold of positive definite matrices (with Reshad Hosseini) accepted for publication to SIAM J. Optimization |
Jan 17 | Paper: Data Modeling with the Elliptical Gamma Distribution upcoming in AISTATS 2015 (with Reshad Hosseini) |
Jan 16 | Started at MIT! |
[25.12.2014] | Looking for a candidate interested in working on a (paid) potentially high-impact and novel industrial project on machine learning for healthcare informatics. Please email me if you are interested |
[21.11.2014] | Preprint: Updated version of Conic geometric optimisation on the manifold of positive definite matrices |
[17.11.2014] | Preprint: Updated version of Hlawka-Popoviciu inequalities on positive definite tensors |
[16.11.2014] | Software: Fast total-variation toolbox now on GitHub! |
[29.10.2014] | New preprints:
Efficient structured matrix rank minimization (convex optimization, compressed sensing)
Statistical inference with elliptical distributions (nonconvex optimization, mixture modeling)
Super fast modular total-variation optimization! (see also: TV webpage)
Hlawka inequalities on positive definite tensors (hypergraph cut style operator inequalities)
Completely strong superadditivity of generalized matrix functions (if you like matrix submodularity!)
Explicit diagonalization of a Cesàro/Markov matrix (min / max kernels, operator norms)
Asynchronous Parallel Block-Coordinate Frank-Wolfe (large-scale parallel convex optimization)
Randomized coordinate descent methods with linear constraints (work in progress) |
[20.09.2014] | I'm moving to MIT in January 2015! |
[29.07.2014] | Website moved to AWS |
[17.07.2014] | Preprint: New arXiv version of Conic geometric optimisation on the manifold of positive definite matrices |
[15.06.2014] | Paper: Riemannian sparse coding, ECCV 2014 |
[01.06.2014] | Wow, I've already left CMU! so quickly it went by!!! |
[30.05.2014] | Paper: Fast Newton methods for the group fused lasso, Uncertainty in AI (UAI 2014) |
[25.04.2014] | Paper: Robust sparse hashing, IEEE Transactions on Image Processing |
[13.04.2014] | Paper: Randomized nonlinear component analysis (ICML 2014) |
[15.03.2014] | Serving as an Area Chair for NIPS 2014 |
Jan 2014 | Teaching Advanced Optimization at CMU from Jan 13 onwards! |
[02.01.2014] | Serving as Associate Editor for Optimization Methods and Software |
[27.12.2013] | Preprint: new arXiv version of Positive definite matrices and the S-Divergence |
[11.12.2013] | Paper: Stochastic ADMM (ICML 2014) |
[10.12.2013] | Preprint: new arXiv version of Positive definite matrices and the symmetric Stein divergence |
[04.12.2013] | Preprint: arXiv version of S. Sra, R. Hosseini, "Conic geometric optimisation on the manifold of positive definite matrices" |
[19.11.2013] | Preprint: arXiv version of S. Jegelka, F. Bach, S. Sra, "Reflection methods for submodular optimization" |
[25.10.2013] | Paper in IMA J. Numerical Analysis: Correlation matrix nearness and completion under observation uncertainty |
[05.09.2013] | Serving as an Area Chair for ICML 2014 |
[05.09.2013] | Serving on the Senior PC for AISTATS 2014 |
[04.09.2013] | Paper: (with S. Jegelka, F. Bach) Reflection methods for submodular optimization |
[04.09.2013] | Paper: (with R. Hosseini) Geometric optimisation for positive definite matrices |
[01.09.2013] | Visiting ML Dept., School of CS, Carnegie Mellon University for 2 semesters! |
[21.05.2013] | EE227A is over! Hopefully my lecture notes will be available soon! |
[15.03.2013] | Serving as an Area Chair for NIPS 2013 |
[22.01.2013] | Teaching EE227A: Convex Optimization, EECS, UC Berkeley |
[08.12.2012] | Talk: Presented a short version of my new distance function at the NIPS Workshop on "Algebraic Topology and Machine Learning" |
[01.12.2012] | Paper: Similarity computations on positive definite matrices for fast nearest neighbor search. IEEE TPAMI |
[07.09.2012] | Paper: The multivariate Watson distribution: Maximum likelihood and other aspects, in J. Multivariate Analysis |
[03.09.2012] | Paper: Large-scale nonconvex nonsmooth optimization |
[03.09.2012] | Paper: A new distance metric on the manifold of positive definite matrices |
[24.06.2012] | Talk: Speaking @ the NIMS Hot Topics Workshop on Positive Matrices and Operators at KNU, Daegu, Korea |
Jun 2013 | Award. My work on Metric Nearness was selected to receive the SIAM Outstanding Paper Prize, 2011 |
2012 | Book on Optimization for Machine Learning (co-edited with S. Nowozin and S. J. Wright; Publisher: MIT Press) is available here. Here are links to Amazon and Barnes and Noble |