# Category Archives: Optimization

## Notes on least-squares, part II

As promised at the end of Part I, I will give some intuition for the Bach and Moulines analysis of constant step size SGD for least-squares. Let us recall the setting. Let be a pair of random variables with where … Continue reading

## Bandit theory, part II

These are the lecture notes for the second part of my minicourse on bandit theory (see here for Part 1). The linear bandit problem, Auer [2002] We will mostly study the following far-reaching extension of the -armed bandit problem. Known … Continue reading

## Bandit theory, part I

This week I’m giving two 90 minutes lectures on bandit theory at MLSS Cadiz. Despite my 2012 survey with Nicolo I thought it would be a good idea to post my lectures notes here. Indeed while much of the material … Continue reading

## Notes on least-squares, part I

These are mainly notes for myself, but I figured that they might be of interest to some of the blog readers too. Comments on what is written below are most welcome! Let be a pair of random variables, and let … Continue reading

## Convex Optimization: Algorithms and Complexity

I am thrilled to announce that my short introduction to convex optimization has just came out in the Foundations and Trends in Machine Learning series (free version on arxiv). This project started on this blog in 2013 with the lecture notes “The … Continue reading

## Crash course on learning theory, part 2

It might be useful to refresh your memory on the concepts we saw in part 1 (particularly the notions of VC dimension and Rademacher complexity). In this second and last part we will discuss two of the most successful algorithm paradigms in … Continue reading

## Crash course on learning theory, part 1

This week and next week I’m giving 90 minutes lectures at MSR on the fundamentals of learning theory. Below you will find my notes for the first course, where we covered the basic setting of statistical learning theory, Glivenko-Cantelli classes, Rademacher complexity, VC … Continue reading

## A solution to bandit convex optimization

Ronen Eldan and I have just uploaded to the arXiv our newest paper which finally proves that for online learning with bandit feedback, convex functions are not much harder than linear functions. The quest for this result started in 2004 … Continue reading

## Revisiting Nesterov’s Acceleration

Nesterov’s accelerated gradient descent (AGD) is hard to understand. Since Nesterov’s 1983 paper people have tried to explain “why” acceleration is possible, with the hope that the answer would go beyond the mysterious (but beautiful) algebraic manipulations of the original … Continue reading

## Deep stuff about deep learning?

Unless you live a secluded life without internet (in which case you’re not reading those lines), odds are that you have heard and read about deep learning (such as in this 2012 article in the New York Times, or the … Continue reading