Excellent work! I’ve been reading this for the past few days and have found it to be quite a helpful resource. One comment:

On page 43, in your discussion of Nesterov’s Accelerated Gradient Descent, you say that, intuitively, \Phi_s becomes a finer and finer approximation “from below” to f in the sense of inequality (3.17).

I found the wording here a bit confusing (“from below”, in particular), since for some values of x we have \Phi_s(x) > f(x). For those x where \Phi_s(x) > f(x), inequality (3.17) bounds how good an approximation \Phi_s is to f “from above”.

Erik
