Local max-cut in smoothed polynomial time

Omer Angel, Yuval Peres, Fan Wei, and myself have just posted to the arXiv our paper showing that local max-cut is in smoothed polynomial time. In this post I briefly explain what is the problem, and I give a short proof of the previous state of the art result on this problem, which was a paper by Etscheid and Roglin showing that local max-cut is in quasi-polynomial time.

Local max-cut and smoothed analysis

Let $G = (V,E)$ be a connected graph with $n$ vertices and $w: E\rightarrow [-1,1]$ be an edge weight function. The local max-cut problem asks to find a partition of the vertices $\sigma: V\rightarrow \{-1,1\}$ whose total cut weight

$\frac12 \sum_{uv \in E} w(uv) \big(1-\sigma(u)\sigma(v)\big) ,$

is locally maximal, in the sense that one cannot increase the cut weight by changing the value of $\sigma$ at a single vertex (recall that actually finding the global maximum is NP-hard). See the papers linked to above for motivation on this problem.

There is a simple local search algorithm for this problem, sometimes referred to as “FLIP”: start from some initial $\sigma_0$ and iteratively flip vertices (i.e. change the sign of $\sigma$ at a vertex) to improve the cut weight until reaching a local maximum. It is easy to build instances $(\sigma_0, w)$ where FLIP takes exponential time, however in “practice” it seems that FLIP always converges quickly. This motivates the smoothed analysis of FLIP, that is we want to understand what is the typical number of steps for FLIP when the edge weights is perturbed by a small amount of noise. Formally we now assume that the weight on edge $e \in E$ is given by a random variable $X_e \in [-1,1]$ which has a density with respect to the Lebesgue measure bounded from above by $\phi$ (for example this forbids $X_e$ to be too close to a point mass). We assume that these random variables are independent.

Theorem(Etscheid and Roglin [2014]): With probability $1-o(1)$ FLIP terminates in $O((\phi^c n^{c \log(n)})$ steps for some universal constant $c>0$ .

We improve this result from quasi-polynomial to polynomial, assuming that we put some noise on the interaction between every pair of vertices, or in other words assuming that the graph is complete.

Theorem(Angel, Bubeck, Peres, Wei [2016]): Let $G$ be complete. With probability $1-o(1)$ FLIP terminates in $O(\phi^5 n^{15.1})$ steps.

I will now prove the Etscheid and Roglin result.

Proof strategy

To simplify notation let us introduce the Hamiltonian

$H(\sigma) =-\frac{1}{2} \sum_{uv \in E} X_{uv} \sigma(u) \sigma(v).$

We want to find a local max of $H$ . For any $\sigma \in \{-1,1\}^V$ and $v \in V$ , we denote by $\sigma^{-v}$ the state equal to $\sigma$ except for the coordinate corresponding to $v$ which is flipped. For such $\sigma,v$ there exists a vector $\alpha = \alpha(\sigma,v) \in \{-1,0,1\}^E$ such that

$H(\sigma^{-v}) = H(\sigma) + \langle \alpha, X \rangle .$

More specifically $\alpha=(\alpha_{uw})_{uw \in E}$ is defined by

$\left\{\begin{array}{ll} \alpha_{uv} = \sigma(v) \sigma(u) & \forall u \neq v \\ \alpha_{uw} = 0 & \text{if} \ v \not\in \{u,w\} \end{array}\right.$

We say that flipping a vertex is a move, and it is an improving move if the value of $H$ strictly improves. We say that a sequence of moves $L=(v_1,\hdots, v_{\ell})$ is $\epsilon$ -slow from $\sigma_0$ if

$H(\sigma_{i}) - H(\sigma_{i-1}) \in (0, \epsilon], \forall i \in [\ell].$

It is sufficient to show that with high probability there is no $\epsilon$ -slow sequence with say $\ell=2n$ and $\epsilon = 1/(\phi n^{\log(n)})^c$ (indeed in this case after $2n \times (\phi n^{\log(n)})^c \times n^2$ steps FLIP must have stopped, for otherwise the $H$ value would exceed the maximal possible value of $n^2$ ). We will do this in in three main steps, a probability step, a linear algebra step, and a combinatorial step.

Probability step

Lemma: Let $\alpha_1, \hdots, \alpha_k$ be $k$ linearly independent vectors in $\mathbb{Z}^E$ . Then one has

$\mathbb{P}(\forall i \in [k], \langle \alpha_i, X \rangle \in (0,\epsilon]) \leq (\phi \epsilon)^{k} .$

Proof: The inequality follows from a simple change of variables. Let $A \in \mathbb{Z}^{|E| \times E}$ be a full rank matrix whose first $k$ rows are $\alpha_1, \hdots, \alpha_k$ and it is completed so that $A$ is the identity on the subspace orthogonal to $\alpha_1, \hdots, \alpha_k$ . Let $g$ be the density of $A X$ , and $f=\prod_{i=1}^{|E|} f_i$ the density of $X$ . One has $g(y) = |\mathrm{det}(A^{-1})| f(A^{-1} y)|$ and the key observation is that since $A$ has integer coefficients, its determinant must be an integer too, and since it is non-zero one has $|\mathrm{det}(A^{-1})|=|\mathrm{det}(A)^{-1}| \leq 1 .$ Thus one gets:

$\begin{align*} \mathbb{P}(\forall i \in [k], \langle \alpha_i, X \rangle \in (0,\epsilon]) & = \int_{(0,\epsilon]^k \times \mathbb{R}^{|E|-k}} g(y) dy \\ & \leq \int_{(0,\epsilon]^k \times \mathbb{R}^{|E|-k}} \prod_{i=1}^{|E|} f_i((A^{-1} y)_i) dy \\ & \leq (\phi \epsilon)^k \int_{\mathbb{R}^{|E|-k}} \prod_{i=k+1}^{|E|} f_i(y_i) dy = (\phi \epsilon)^k. \end{align*}$

Linear algebra step

Lemma: Consider a sequence of $\ell$ improving moves with $k$ distinct vertices (say $v_1, \hdots, v_k$ ) that repeat at least twice in this sequence. Let $\alpha_1, \hdots, \alpha_{\ell} \in \{-1,0,1\}^E$ be the corresponding move coefficients, and for each $i \in [k]$ let $s_i$ (respectively $t_i$ ) be the first (respectively second) time at which $v_i$ moves. Then the vectors $\tilde{\alpha}_i = \alpha_{s_i} + \alpha_{t_i} \in \mathbb{Z}^E$ , $i \in [k]$ , are linearly independent. Furthermore for any $v$ that did not move between the times $s_i$ and $t_i$ one has $\tilde{\alpha}_i(vv_i) = 0$ (and for any $e \not\ni v_i, \tilde{\alpha}_i(e) = 0$ ).

Proof: The last sentence of the lemma is obvious. For the linear independence let $\lambda \in \R^k$ be such that $\sum_{i=1}^k \lambda_i \tilde{\alpha}_i = 0$ . Consider a new graph $H$ with vertex set $[k]$ and such that $i$ is connected to $j$ if $v_i$ appears an odd number of times between the times $s_j$ and $t_j$ . This defines an oriented graph, however if $i$ is connected to $j$ but $j$ is not connected to $i$ then one has $\tilde{\alpha}_j(v_iv_j) \in \{-2,2\}$ while $\tilde{\alpha}_i(v_iv_j)=0$ (and furthermore for any $m \not\in \{i,j\}, \tilde{\alpha}_m(v_iv_j)=0$ ) and thus $\lambda_j=0$ . In other words we can consider a subset of $[k]$ where $H$ is an undirected graph, and outside of this subset $\lambda$ is identically zero. To reduce notation we simply assume that $H$ is undirected. Next we observe that if $i$ and $j$ are connected then one must have $\tilde{\alpha}_j(v_iv_j)= - \tilde{\alpha}_i(v_iv_j)$ (this uses the fact that we look at the {\em first} consecutive times at which the vertices move) and in particular (again using that for any $m \not\in \{i,j\}, \tilde{\alpha}_m(v_iv_j)=0$ ) one must have $\lambda_i=\lambda_j$ . Now let $C$ be some connected component of $H$ , and let $\lambda_C$ be the unique value of $\lambda$ on $C$ . Noting that the $\tilde{\alpha}$ ‘s corresponding to different components of $H$ have difference support (more precisely with $C_E := \cup_{j \in C} \{e \ni j\}$ one has for any $i \in C, \tilde{\alpha}_i |_{C_E^c} = 0$ and for any $i \not\in C, \tilde{\alpha}_i |_{C_E}=0$ ) one obtains $\lambda_C \sum_{i \in C} \tilde{\alpha}_i = 0$ . On the other hand since the sequence of moves is improving one must have $\sum_{i \in C} \tilde{\alpha}_i \neq 0$ , which implies $\lambda_C = 0$ and finally $\lambda = 0$ (thus concluding the proof of the linear independence).

Combinatorial step

Lemma: Let $v_1, \hdots, v_{2n} \in V$ . There exists $\ell \in \mathbb{N}$ and $i \in [2n - \ell]$ such that the number of vertices that repeat at least twice in the segment $v_i, \hdots, v_{i + \ell}$ is at least $\ell / (2 \log_2(n))$ .

Proof (from ABPW): Define the surplus of a sequence to be the difference between the number of elements and the number of distinct elements in the sequence. Let $s_{\ell}$ be the maximum surplus in any segment of length $\ell$ in $v_1, \hdots, v_{2n}$ . Observe that $s_{2n} \geq n$ . Let us now assume that for any segment of length $\ell$ , the number of vertices that repeat at least twice is at most $\epsilon \ell$ . Then one has by induction

$s_{2n} \leq 2 s_{n} + \epsilon 2n \leq 2 \epsilon n \log_2(n) .$

This shows that $\epsilon$ has to be greater than $1/(2 \log_2(n))$ which concludes the proof.

Putting things together

We want to show that with $\epsilon = 1/(\phi n^{\log(n)})^c$ one has

$\mathbb{P}(\exists (\sigma_0, L) \; \epsilon-\text{slow and} \; |L| = 2n) = o(1) .$

By the combinatorial lemma we know that it is enough to show that:

$\begin{align*} & \sum_{\ell=1}^{2n} \mathbb{P}(\exists (\sigma_0, L) \; \epsilon-\text{slow}, \; |L| = \ell, \; \text{and} \\ & \text{L has at least} \; \ell/(2 \log_2(n)) \; \text{repeating vertices}) = o(1) . \end{align*}$

Now using the probability lemma together with the linear algebra lemma (and observing that critically $\tilde{\alpha}$ only depends on the value of $\sigma$ at the vertices in $L$ , and thus the union bound over $\sigma_0$ only gives a $2^{\ell}$ factor instead of $2^{n}$ ) one obtains that the above probability is bounded by

$2^{\ell} n^{\ell} (\phi \epsilon)^{\ell/(2 \log_2(n))} ,$

which concludes the proof of the Etscheid and Roglin result.

Note that a natural route to get a polynomial-time bound from the above proof would be to remove the $\log(n)$ term in the combinatorial lemma but we show in our paper that this is impossible. Our result comes essentially from improvements to the linear algebra step (this is more difficult as the Etscheid and Roglin linear algebra lemma is particularly friendly for the union bound step, so we had to find another way to do the union bound).

2 Responses to "Local max-cut in smoothed polynomial time"

By Gil Kalai October 24, 2016 - 12:16 pm

Impressive, congratulations!
- By Sebastien Bubeck October 24, 2016 - 5:00 pm
  
  Thanks Gil! I think there are lots of cool (and difficult) open problems following this work. The one I prefer is to show that in fact local max cut is in smoothed quasi-linear time.

Local max-cut in smoothed polynomial time

2 Responses to "Local max-cut in smoothed polynomial time"

By Gil Kalai October 24, 2016 - 12:16 pm

By Sebastien Bubeck October 24, 2016 - 5:00 pm

Leave a reply

Archives

Categories

Recent Posts

Subscribe to Blog via Email

Meta

Blogroll