Local max-cut in smoothed polynomial time

Omer Angel, Yuval Peres, Fan Wei, and myself have just posted to the arXiv our paper showing that local max-cut is in smoothed polynomial time. In this post I briefly explain what is the problem, and I give a short proof of the previous state of the art result on this problem, which was a paper by Etscheid and Roglin showing that local max-cut is in quasi-polynomial time.

Local max-cut and smoothed analysis

Let G = (V,E) be a connected graph with n vertices and w: E\rightarrow [-1,1] be an edge weight function. The local max-cut problem asks to find a partition of the vertices \sigma: V\rightarrow \{-1,1\} whose total cut weight

    \[\frac12 \sum_{uv \in E} w(uv) \big(1-\sigma(u)\sigma(v)\big) ,\]

is locally maximal, in the sense that one cannot increase the cut weight by changing the value of \sigma at a single vertex (recall that actually finding the global maximum is NP-hard). See the papers linked to above for motivation on this problem.

There is a simple local search algorithm for this problem, sometimes referred to as “FLIP”: start from some initial \sigma_0 and iteratively flip vertices (i.e. change the sign of \sigma at a vertex) to improve the cut weight until reaching a local maximum. It is easy to build instances (\sigma_0, w) where FLIP takes exponential time, however in “practice” it seems that FLIP always converges quickly. This motivates the smoothed analysis of FLIP, that is we want to understand what is the typical number of steps for FLIP when the edge weights is perturbed by a small amount of noise. Formally we now assume that the weight on edge e \in E is given by a random variable X_e \in [-1,1] which has a density with respect to the Lebesgue measure bounded from above by \phi (for example this forbids X_e to be too close to a point mass). We assume that these random variables are independent.

Theorem(Etscheid and Roglin [2014]): With probability 1-o(1) FLIP terminates in O((\phi^c n^{c \log(n)}) steps for some universal constant c>0.

We improve this result from quasi-polynomial to polynomial, assuming that we put some noise on the interaction between every pair of vertices, or in other words assuming that the graph is complete.

Theorem(Angel, Bubeck, Peres, Wei [2016]): Let G be complete. With probability 1-o(1) FLIP terminates in O(\phi^5 n^{15.1}) steps.

I will now prove the Etscheid and Roglin result.

Proof strategy

To simplify notation let us introduce the Hamiltonian

    \[H(\sigma) =-\frac{1}{2} \sum_{uv \in E} X_{uv} \sigma(u) \sigma(v).\]

We want to find a local max of H. For any \sigma \in \{-1,1\}^V and v \in V, we denote by \sigma^{-v} the state equal to \sigma except for the coordinate corresponding to v which is flipped. For such \sigma,v there exists a vector \alpha = \alpha(\sigma,v) \in \{-1,0,1\}^E such that

    \[H(\sigma^{-v}) = H(\sigma) + \langle \alpha, X \rangle .\]

More specifically \alpha=(\alpha_{uw})_{uw \in E} is defined by

    \[\left\{\begin{array}{ll} \alpha_{uv} = \sigma(v) \sigma(u) & \forall u \neq v \\ \alpha_{uw} = 0 & \text{if} \ v \not\in \{u,w\} \end{array}\right.\]

We say that flipping a vertex is a move, and it is an improving move if the value of H strictly improves. We say that a sequence of moves L=(v_1,\hdots, v_{\ell}) is \epsilon-slow from \sigma_0 if

    \[H(\sigma_{i}) -  H(\sigma_{i-1}) \in (0, \epsilon], \forall i \in [\ell].\]

It is sufficient to show that with high probability there is no \epsilon-slow sequence with say \ell=2n and \epsilon = 1/(\phi n^{\log(n)})^c (indeed in this case after 2n \times (\phi n^{\log(n)})^c \times n^2 steps FLIP must have stopped, for otherwise the H value would exceed the maximal possible value of n^2). We will do this in in three main steps, a probability step, a linear algebra step, and a combinatorial step.

Probability step

Lemma: Let \alpha_1, \hdots, \alpha_k be k linearly independent vectors in \mathbb{Z}^E. Then one has

    \[\mathbb{P}(\forall i \in [k], \langle \alpha_i, X \rangle \in (0,\epsilon]) \leq (\phi \epsilon)^{k} .\]

Proof: The inequality follows from a simple change of variables. Let A \in \mathbb{Z}^{|E| \times E} be a full rank matrix whose first k rows are \alpha_1, \hdots, \alpha_k and it is completed so that A is the identity on the subspace orthogonal to \alpha_1, \hdots, \alpha_k. Let g be the density of A X, and f=\prod_{i=1}^{|E|} f_i the density of X. One has g(y) = |\mathrm{det}(A^{-1})| f(A^{-1} y)| and the key observation is that since A has integer coefficients, its determinant must be an integer too, and since it is non-zero one has |\mathrm{det}(A^{-1})|=|\mathrm{det}(A)^{-1}| \leq 1 . Thus one gets:

    \begin{align*} \mathbb{P}(\forall i \in [k], \langle \alpha_i, X \rangle \in (0,\epsilon]) & = \int_{(0,\epsilon]^k \times \mathbb{R}^{|E|-k}} g(y) dy \\ & \leq \int_{(0,\epsilon]^k \times \mathbb{R}^{|E|-k}} \prod_{i=1}^{|E|} f_i((A^{-1} y)_i) dy \\ & \leq (\phi \epsilon)^k \int_{\mathbb{R}^{|E|-k}} \prod_{i=k+1}^{|E|} f_i(y_i) dy = (\phi \epsilon)^k. \end{align*}

Linear algebra step

Lemma: Consider a sequence of \ell improving moves with k distinct vertices (say v_1, \hdots, v_k) that repeat at least twice in this sequence. Let \alpha_1, \hdots, \alpha_{\ell} \in \{-1,0,1\}^E be the corresponding move coefficients, and for each i \in [k] let s_i (respectively t_i) be the first (respectively second) time at which v_i moves. Then the vectors \tilde{\alpha}_i = \alpha_{s_i} + \alpha_{t_i} \in \mathbb{Z}^E, i \in [k], are linearly independent. Furthermore for any v that did not move between the times s_i and t_i one has \tilde{\alpha}_i(vv_i) = 0 (and for any e \not\ni v_i, \tilde{\alpha}_i(e) = 0).

Proof: The last sentence of the lemma is obvious. For the linear independence let \lambda \in \R^k be such that \sum_{i=1}^k \lambda_i \tilde{\alpha}_i = 0. Consider a new graph H with vertex set [k] and such that i is connected to j if v_i appears an odd number of times between the times s_j and t_j. This defines an oriented graph, however if i is connected to j but j is not connected to i then one has \tilde{\alpha}_j(v_iv_j) \in \{-2,2\} while \tilde{\alpha}_i(v_iv_j)=0 (and furthermore for any m \not\in \{i,j\}, \tilde{\alpha}_m(v_iv_j)=0) and thus \lambda_j=0. In other words we can consider a subset of [k] where H is an undirected graph, and outside of this subset \lambda is identically zero. To reduce notation we simply assume that H is undirected. Next we observe that if i and j are connected then one must have \tilde{\alpha}_j(v_iv_j)= - \tilde{\alpha}_i(v_iv_j) (this uses the fact that we look at the {\em first} consecutive times at which the vertices move) and in particular (again using that for any m \not\in \{i,j\}, \tilde{\alpha}_m(v_iv_j)=0) one must have \lambda_i=\lambda_j. Now let C be some connected component of H, and let \lambda_C be the unique value of \lambda on C. Noting that the \tilde{\alpha}‘s corresponding to different components of H have difference support (more precisely with C_E := \cup_{j \in C} \{e \ni j\} one has for any i \in C, \tilde{\alpha}_i |_{C_E^c} = 0 and for any i \not\in C, \tilde{\alpha}_i |_{C_E}=0) one obtains \lambda_C \sum_{i \in C} \tilde{\alpha}_i = 0. On the other hand since the sequence of moves is improving one must have \sum_{i \in C} \tilde{\alpha}_i \neq 0, which implies \lambda_C = 0 and finally \lambda = 0 (thus concluding the proof of the linear independence).

Combinatorial step

Lemma: Let v_1, \hdots, v_{2n} \in V. There exists \ell \in \mathbb{N} and i \in [2n - \ell] such that the number of vertices that repeat at least twice in the segment v_i, \hdots, v_{i + \ell} is at least \ell / (2 \log_2(n)).

Proof (from ABPW): Define the surplus of a sequence to be the difference between the number of elements and the number of distinct elements in the sequence. Let s_{\ell} be the maximum surplus in any segment of length \ell in v_1, \hdots, v_{2n}. Observe that s_{2n} \geq n. Let us now assume that for any segment of length \ell, the number of vertices that repeat at least twice is at most \epsilon \ell. Then one has by induction

    \[s_{2n} \leq 2 s_{n} + \epsilon 2n \leq 2 \epsilon n \log_2(n) .\]

This shows that \epsilon has to be greater than 1/(2 \log_2(n)) which concludes the proof.

Putting things together

We want to show that with \epsilon = 1/(\phi n^{\log(n)})^c one has

    \[\mathbb{P}(\exists (\sigma_0, L) \; \epsilon-\text{slow and} \; |L| = 2n) = o(1) .\]

By the combinatorial lemma we know that it is enough to show that:

    \begin{align*} & \sum_{\ell=1}^{2n} \mathbb{P}(\exists (\sigma_0, L) \; \epsilon-\text{slow}, \; |L| = \ell, \; \text{and} \\ & \text{L has at least} \; \ell/(2 \log_2(n)) \; \text{repeating vertices}) = o(1) . \end{align*}

Now using the probability lemma together with the linear algebra lemma (and observing that critically \tilde{\alpha} only depends on the value of \sigma at the vertices in L, and thus the union bound over \sigma_0 only gives a 2^{\ell} factor instead of 2^{n}) one obtains that the above probability is bounded by

    \[2^{\ell} n^{\ell} (\phi \epsilon)^{\ell/(2 \log_2(n))} ,\]

which concludes the proof of the Etscheid and Roglin result.

Note that a natural route to get a polynomial-time bound from the above proof would be to remove the \log(n) term in the combinatorial lemma but we show in our paper that this is impossible. Our result comes essentially from improvements to the linear algebra step (this is more difficult as the Etscheid and Roglin linear algebra lemma is particularly friendly for the union bound step, so we had to find another way to do the union bound).

This entry was posted in Optimization, Probability theory. Bookmark the permalink.

3 Responses to "Local max-cut in smoothed polynomial time"