The goal of this second lecture is to explain how to do the variance calculation we talked about at the end of Part 1 in the case where the exponential weights distribution is non-Gaussian. Compared to the optimistic calculations from Part 1 we will lose here a factor polynomial in the dimension $n$ in the regret bound, and thus by the end of this lecture we could still hope to prove a regret bound of order $\mathrm{poly}(n) \sqrt{T}$.
Gaussian core
First let us revisit the introduction of the core. Recall that we are considering a kernel $K$ such that $K \delta_x$ is the distribution of $(1-\lambda) Z + \lambda x$ for some random variable $Z$ to be defined. To control the regret we want to have the following inequality for any convex function $f$ and any point $x$:

\langle K \delta_x, f \rangle \le (1-\lambda) \langle K p, f \rangle + \lambda f(x),

which, given the convexity of $f$, means that the distribution of the random variable $Z$ should satisfy (with $X \sim p$ drawn independently of $Z$)

\mathbb{E}\, f(Z) \le \mathbb{E}\, f\big((1-\lambda) Z + \lambda X\big).   (1)
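To spell out why (1) suffices for the regret-control inequality, here is a one-line derivation (our notation: $X \sim p$ independent of $Z$, first step by convexity of $f$, second step by (1)):

```latex
\langle K \delta_x, f \rangle
  = \mathbb{E}\, f\big((1-\lambda) Z + \lambda x\big)
  \le (1-\lambda)\, \mathbb{E}\, f(Z) + \lambda f(x)
  \le (1-\lambda)\, \langle K p, f \rangle + \lambda f(x).
```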
The core is defined so that (1) is in fact an equality: the core of $p$ is a random variable $Z$ whose distribution is the fixed point, i.e. $Z$ has the same distribution as $(1-\lambda) Z + \lambda X$ with $X \sim p$ independent of $Z$. As we discussed in the previous post the core is a deep mathematical construction, but unfortunately we could not find a way to generalize Part 1's variance calculation when the core is non-Gaussian. In what follows we describe a slightly different construction which will allow us to satisfy (1) with $Z$ being Gaussian. A key observation is that if $q$ is convexly dominated by $p$, i.e. for any convex $f$ one has $\mathbb{E}_{X \sim q} f(X) \le \mathbb{E}_{X \sim p} f(X)$, then (1) is satisfied by taking $Z$ to be the core of $q$: denoting $X' \sim q$,

\mathbb{E}\, f(Z) = \mathbb{E}\, f\big((1-\lambda) Z + \lambda X'\big) \le \mathbb{E}_{X \sim p}\, f\big((1-\lambda) Z + \lambda X\big),

where the equality is the fixed point property of the core of $q$ and the inequality follows from convex domination applied to the convex map $x \mapsto \mathbb{E}\, f((1-\lambda) Z + \lambda x)$.
Thus it suffices to show that for any $p$ that we care about one can find a Gaussian $q$ "inside" of $p$ (in the sense that $q$ is convexly dominated by $p$). Then instead of taking the core of $p$ to define the kernel one can take the core of $q$ (which we will also refer to as the Gaussian core of $p$). Note that the core of a Gaussian measure is itself Gaussian, since Gaussianity is preserved by the fixed point map, which is what allows the variance calculation of Part 1 to go through.
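To make the fixed point property of the core concrete, here is a small Monte Carlo sketch (ours, not from the lectures) under the fixed-point characterization: the core of $p$ has the same distribution as $(1-\lambda) Z + \lambda X$ with $X \sim p$. For $p = \mathcal{N}(0,1)$ in one dimension the fixed point is a centered Gaussian whose variance $v$ solves $v = (1-\lambda)^2 v + \lambda^2$, i.e. $v = \lambda/(2-\lambda)$.

```python
import random
import statistics

def core_samples(lam, n_samples=100_000, n_iter=40, seed=0):
    """Approximate samples from the core of p = N(0, 1) by iterating
    Z <- (1 - lam) * Z + lam * X with fresh X ~ p.
    (Illustrative helper for the fixed-point characterization.)"""
    rng = random.Random(seed)
    zs = [0.0] * n_samples
    for _ in range(n_iter):
        zs = [(1 - lam) * z + lam * rng.gauss(0.0, 1.0) for z in zs]
    return zs

lam = 0.5
zs = core_samples(lam)
emp_var = statistics.pvariance(zs)
pred_var = lam / (2 - lam)  # fixed point of v = (1-lam)^2 v + lam^2
print(emp_var, pred_var)    # the two variances should be close
```

The contraction $(1-\lambda)^2 < 1$ makes the iteration converge geometrically, so a few dozen iterations suffice.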
Next we show that we can essentially restrict our attention to the case where $p$ is log-concave, and then we show how to find a Gaussian inside a log-concave measure.
Approximate log-concavity
Recall that in Part 1 we replaced the variance calculation by simply showing that the loss estimates are bounded. Thus we see by induction and using Hoeffding-Azuma that with probability at least $1 - \delta$, at any fixed point $x$, the cumulative loss estimate $\sum_{s < t} \tilde{\ell}_s(x)$ deviates from its conditional mean $\sum_{s < t} K^* \ell_s(x)$ by at most a term of order $\sqrt{T \log(1/\delta)}$ (times the bound on the estimates). Using a union bound over an $\varepsilon$-net together with Lipschitzness of the loss estimates (which we didn't prove but it follows the same lines as boundedness) one has in fact that with high probability (polynomially close to $1$ in $T$) there is a convex function $F$ (recall that each $K^* \ell_s$ is a convex function, hence so is their sum) such that for all $x$,

\Big| \sum_{s < t} \tilde{\ell}_s(x) - F(x) \Big| \le \varepsilon / \eta.

In particular, provided that the learning rate $\eta$ is small enough for this to hold, the above display shows that $p_t$ (whose density is proportional to $\exp(-\eta \sum_{s<t} \tilde{\ell}_s(x))$) is $\varepsilon$-approximately log-concave (recall that a measure is said to be $\varepsilon$-approximately log-concave if its density is sandwiched between $e^{-\varepsilon} g$ and $e^{\varepsilon} g$ for some log-concave function $g$). To simplify the discussion in these lectures we will in fact assume that $p_t$ is exactly log-concave.
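The sandwiching step can be checked mechanically: if the exponent differs from a convex function by at most $\varepsilon$ pointwise, the resulting weights are within a factor $e^{\pm\varepsilon}$ of a log-concave function, even though they need not be log-concave themselves. A toy numerical sketch (the functions below are our illustrative choices, not from the lectures):

```python
import math

eps = 0.1
F = lambda x: x * x                    # convex, so exp(-F) is log-concave
d = lambda x: eps * math.sin(25 * x)   # bounded perturbation, |d| <= eps

xs = [i / 1000 for i in range(-3000, 3001)]

# Sandwich: exp(-(F + d)) lies between e^(-eps) and e^(+eps) times exp(-F).
sandwiched = all(
    math.exp(-eps) <= math.exp(-(F(x) + d(x))) / math.exp(-F(x)) <= math.exp(eps)
    for x in xs
)

# Yet exp(-(F + d)) itself is not log-concave: its logarithm has a
# positive discrete second difference somewhere on the grid.
logp = [-(F(x) + d(x)) for x in xs]
not_logconcave = any(
    logp[i - 1] - 2 * logp[i] + logp[i + 1] > 0 for i in range(1, len(logp) - 1)
)

print(sandwiched, not_logconcave)
```

This is exactly the situation for $p_t$: the estimated cumulative loss is a small perturbation of the convex function $F$, so $p_t$ is approximately, but not exactly, log-concave.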
We note that the above concentration argument makes us lose a factor in the regret. Indeed in Part 1 we tuned $\eta$ against the (optimistic) variance calculation to informally obtain a regret of order $\mathrm{poly}(n)\sqrt{T}$ (or better with even more optimistic calculations). We see that now we are forced to take a learning rate small enough for $\eta$ times the Hoeffding-Azuma deviations to remain of order $\varepsilon$, which is polynomially smaller than before, and this in turn multiplies the regret bound by the same polynomial factor (recall that the exponential weights regret scales with $1/\eta$).
Finding a Gaussian inside a log-concave
Let $p$ be an isotropic log-concave measure on $\mathbb{R}^n$. We will show that a centered Gaussian $G$ with covariance of order $\frac{1}{n} \mathrm{I}_n$ is (approximately) convexly dominated by $p$. We note in passing that by going back to the calculations at the end of Part 1 it is easy to see that this factor $1/n$ in the covariance will in turn force us to take $\lambda$ polynomially (in $n$) smaller than what we hoped at the end of Part 1, leading to yet again an extra polynomial factor in $n$ in the regret bound.
In fact we will show that any centered measure supported on a ball of radius $R$, for some small enough numerical constant $R$, is (exactly) convexly dominated by $p$. This implies the (approximate) Gaussian convex domination we mentioned above since most of the mass of a Gaussian with covariance $\frac{c}{n} \mathrm{I}_n$ (for $c$ a small enough constant) lies in such a ball, a draw from it having norm of order $\sqrt{c}$ with high probability (it is easy to deal with the remaining small mass but we won't worry about it here).
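To see the scaling, note that for $G \sim \mathcal{N}(0, \frac{c}{n} \mathrm{I}_n)$ one has $\mathbb{E}\,|G|^2 = c$, so $|G|$ concentrates around $\sqrt{c}$ regardless of the dimension. A quick Monte Carlo sketch (the values $R$, $c$, $n$ below are our illustrative choices, not constants from the lectures):

```python
import math
import random

rng = random.Random(1)
n = 400            # ambient dimension (illustrative)
R = 0.25           # ball radius (illustrative)
c = R * R / 4      # covariance (c/n) I_n, so E|G|^2 = c = (R/2)^2

def fraction_inside_ball(num_trials=2000):
    """Estimate P(|G| <= R) for G ~ N(0, (c/n) I_n) by Monte Carlo."""
    sigma = math.sqrt(c / n)
    hits = 0
    for _ in range(num_trials):
        sq_norm = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n))
        hits += sq_norm <= R * R
    return hits / num_trials

frac = fraction_inside_ball()
print(frac)  # essentially all of the mass lies in the ball of radius R
```

Since $|G|^2$ is a sum of $n$ independent terms, its fluctuations around $c$ are of relative order $1/\sqrt{n}$, which is why the fraction above is so close to one in high dimension.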
Let us fix a centered measure $\mu$ supported on the ball of radius $R$ and a convex function $f$. We want to show that $\mathbb{E}_\mu f \le \mathbb{E}_p f$. Since both measures are centered we may add any linear function to $f$ without affecting this inequality, so we may legitimately assume that $f(0) = 0$ and $f(x) \ge 0$ for all $x$ (subtract from $f$ the linear function given by a subgradient at $0$). By scaling we may also assume that the maximum of $f$ on the centered ball of radius $R$ is $1$, so that $\mathbb{E}_\mu f \le 1$ and it is enough to show $\mathbb{E}_p f \ge 1$. By convexity $f$ is above the linear function

x \mapsto 1 + \langle \nabla f(x^*), x - x^* \rangle,

where $x^*$ is the maximizer of $f$ on the centered ball of radius $R$. Note also that $|\nabla f(x^*)| \ge 1/R$ (since $f$ goes from $0$ to $1$ between the center and $x^*$) and $\langle \nabla f(x^*), x^* \rangle \le R\, |\nabla f(x^*)|$, and thus it is enough to show (recall that $f \ge 0$):

\mathbb{E}_p \max\big(0, \ 1 + |\nabla f(x^*)| \, (\langle \theta, X \rangle - R)\big) \ge 1,

where $\theta = \nabla f(x^*) / |\nabla f(x^*)|$, which is implied by

\mathbb{E}_p \, (\langle \theta, X \rangle - R)_+ \ge R.

It only remains to use the fact that an isotropic real log-concave random variable $Y$ (here $Y = \langle \theta, X \rangle$, which is centered with unit variance and log-concave since marginals of log-concave measures are log-concave) verifies $\mathbb{E}\, Y_+ \ge c_0$ for some universal constant $c_0 > 0$, so that $\mathbb{E}\, (Y - R)_+ \ge c_0 - R$, and thus the above display holds true for any $R \le c_0 / 2$.
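As a sanity check on this last step, one can compute $\mathbb{E}\,(Y - R)_+$ in closed form for a few isotropic log-concave distributions and verify that it comfortably exceeds a small $R$ (the value $R = 0.1$ below is an illustrative choice, not the optimal constant):

```python
import math

R = 0.1  # illustrative radius

def gaussian_tail_mean(r):
    """E(Y - r)_+ for Y standard normal: phi(r) - r * (1 - Phi(r))."""
    phi = math.exp(-r * r / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(r / math.sqrt(2)))
    return phi - r * (1 - Phi)

def uniform_tail_mean(r):
    """E(Y - r)_+ for Y uniform on [-sqrt(3), sqrt(3)] (unit variance)."""
    s = math.sqrt(3)
    return (s - r) ** 2 / (4 * s)

def laplace_tail_mean(r):
    """E(Y - r)_+ for Y Laplace with scale b = 1/sqrt(2) (unit variance)."""
    b = 1 / math.sqrt(2)
    return (b / 2) * math.exp(-r / b)

values = {
    "gaussian": gaussian_tail_mean(R),
    "uniform": uniform_tail_mean(R),
    "laplace": laplace_tail_mean(R),
}
print(values)  # each value is well above R = 0.1
```

Of course these three examples do not prove the general log-concave fact, but they illustrate that $\mathbb{E}\,(Y - R)_+$ stays bounded below by a constant for small $R$.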