# Category Archives: Optimization

## Convex Optimization: Algorithms and Complexity

I am thrilled to announce that my short introduction to convex optimization has just come out in the Foundations and Trends in Machine Learning series (free version on arXiv). This project started on this blog in 2013 with the lecture notes “The … Continue reading

## Crash course on learning theory, part 2

It might be useful to refresh your memory on the concepts we saw in part 1 (particularly the notions of VC dimension and Rademacher complexity). In this second and last part we will discuss two of the most successful algorithmic paradigms in … Continue reading

## Crash course on learning theory, part 1

This week and next week I’m giving 90-minute lectures at MSR on the fundamentals of learning theory. Below you will find my notes for the first lecture, where we covered the basic setting of statistical learning theory, Glivenko-Cantelli classes, Rademacher complexity, VC … Continue reading

## A solution to bandit convex optimization

Ronen Eldan and I have just uploaded to the arXiv our newest paper which finally proves that for online learning with bandit feedback, convex functions are not much harder than linear functions. The quest for this result started in 2004 … Continue reading

## Revisiting Nesterov’s Acceleration

Nesterov’s accelerated gradient descent (AGD) is hard to understand. Since Nesterov’s 1983 paper people have tried to explain “why” acceleration is possible, with the hope that the answer would go beyond the mysterious (but beautiful) algebraic manipulations of the original … Continue reading

## Deep stuff about deep learning?

Unless you live a secluded life without internet (in which case you’re not reading these lines), odds are that you have heard and read about deep learning (such as in this 2012 article in the New York Times, or the … Continue reading

## The entropic barrier: a simple and optimal universal self-concordant barrier

Ronen Eldan and I have just uploaded our new paper to the arXiv (it should appear tomorrow; for the moment you can see it here). The abstract reads as follows: We prove that the Fenchel dual of the log-Laplace transform … Continue reading

## Komlos conjecture, Gaussian correlation conjecture, and a bit of machine learning

Today I would like to talk (somewhat indirectly) about a beautiful COLT 2014 paper by Nick Harvey and Samira Samadi. The problem studied in this paper goes as follows: imagine that you have a bunch of data points in with a certain … Continue reading

## Theory of Convex Optimization for Machine Learning

I am extremely happy to release the first draft of my monograph based on the lecture notes published last year on this blog. (Comments on the draft are welcome!) The abstract reads as follows: This monograph presents the main mathematical … Continue reading

## Nesterov’s Accelerated Gradient Descent for Smooth and Strongly Convex Optimization

About a year ago I described Nesterov’s Accelerated Gradient Descent in the context of smooth optimization. As I mentioned previously, this has been by far the most popular post on this blog. Today I have decided to revisit this post to give a … Continue reading