ORF523: LP, SDP, and Conic Programming

Recall from the previous lecture that we can ‘solve’ the problem {\min_{x \in \mathcal{X}} f(x)} (with {\mathcal{X}} a convex body in {{\mathbb R}^n} and {f} a convex function) in ‘time’ {O( \max(M, n^2) n^2 )}, where {M} is the computational complexity of finding a separating hyperplane from {\mathcal{X}} at a point {x \not\in \mathcal{X}}, and of computing the {0^{th}} and {1^{st}} order oracles for {f}.
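To make this recap concrete, here is a minimal numpy sketch of one iteration of the ellipsoid method; the function name and interface are my own illustrative choices, not notation from the lecture.

```python
import numpy as np

def ellipsoid_step(c, Q, w):
    """One central-cut update of the ellipsoid method.

    The current ellipsoid is E = {x : (x - c)^T Q^{-1} (x - c) <= 1} and
    w is the normal of a cut: the points of interest satisfy
    w^T x <= w^T c.  Returns the center and shape matrix of the
    minimum-volume ellipsoid containing that half of E.
    """
    n = c.shape[0]
    g = Q @ w / np.sqrt(w @ Q @ w)   # cut direction in the geometry of E
    c_new = c - g / (n + 1)
    Q_new = (n ** 2 / (n ** 2 - 1.0)) * (Q - (2.0 / (n + 1)) * np.outer(g, g))
    return c_new, Q_new
```

The update itself costs {O(n^2)} operations, which together with the oracle cost {M} per iteration accounts for the {\max(M, n^2)} factor in the bound above.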

We will now discuss two important classes of convex problems of the above form, and we will see the computational complexity of solving them with the ellipsoid method.

Linear Programming (LP)

The class LP consists of problems where {f(x) = c^{\top} x} for some {c \in {\mathbb R}^n}, and {\mathcal{X} = \{x \in {\mathbb R}^n : A x \leq b \}} for some {A \in {\mathbb R}^{m \times n}} and {b \in {\mathbb R}^m}. As you have seen in ORF522, a plethora of real-life problems can be formulated as LPs.

Let us see the computational complexity of solving LPs with the ellipsoid method. Clearly computing the {0^{th}} and {1^{st}} order oracles can be done in {O(n)} operations. Now consider the complexity of finding a separating hyperplane between {x \not\in \mathcal{X}} and {\mathcal{X}}. Let {A_i} be the {i^{th}} row of {A}; then one simply needs to find {i \in \{1, \hdots, m\}} such that {A_i^{\top} x > b_i} (such an {i} exists since {x \not\in \mathcal{X}}), and this {A_i} gives a separating hyperplane (since for any {y \in \mathcal{X}}, one has {A_i^{\top} y \leq b_i}). Thus the complexity of finding a separating hyperplane is {O(m n)}. In total this gives a complexity of order {O( \max(m, n) n^3 )} to find an approximate solution with the ellipsoid method.
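In code this oracle is a one-line scan over the rows of {A}; the following numpy sketch (with a function name and return convention of my own choosing) makes the {O(mn)} cost explicit.

```python
import numpy as np

def lp_separation_oracle(A, b, x):
    """Separation oracle for X = {x : A x <= b}.

    Returns None if x is feasible; otherwise returns a row A_i with
    A_i^T x > b_i.  Such a row separates x from X, since every y in X
    satisfies A_i^T y <= b_i.  The scan over all m rows costs O(m n).
    """
    violations = A @ x - b            # all m residuals in one pass
    i = int(np.argmax(violations))    # most violated constraint
    if violations[i] <= 0:
        return None                   # x is in X
    return A[i]                       # normal of a separating hyperplane
```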

Historically, a great breakthrough came in 1979 when Khachiyan showed that the ellipsoid method (combined with a clever rounding technique) can in fact solve LPs exactly in polynomial time. As far as we are concerned we are satisfied with an {\epsilon}-optimal point, and we will not discuss rounding techniques.

Semidefinite Programming (SDP)

The class SDP consists of problems where the optimization variable is a symmetric matrix {X \in {\mathbb R}^{n \times n}}. Let {\mathbb{S}^n} be the space of {n\times n} symmetric matrices (respectively {\mathbb{S}^n_+} the cone of positive semi-definite matrices), and let {\langle \cdot, \cdot \rangle} be the Frobenius inner product (recall that {\langle A, B \rangle = \mathrm{tr}(A^{\top} B)}, which equals {\mathrm{tr}(AB)} for symmetric matrices). In the class SDP the problems are of the following form: {f(X) = \langle X, C \rangle} for some {C \in {\mathbb R}^{n \times n}}, and {\mathcal{X} = \{X \in \mathbb{S}^n_+ : \langle X, A_i \rangle \leq b_i, i \in \{1, \hdots, m\} \}} for some {A_1, \hdots, A_m \in {\mathbb R}^{n \times n}} and {b \in {\mathbb R}^m}.

The expressive power of SDP is not obvious at first sight, but we will see some wonderful examples later in the course. Bear with me for the moment: SDP is an important class of problems.

The difference between SDP and LP lies in the constraint that {X \in \mathbb{S}^n_+}. Indeed this constraint can be written as {\forall x \in {\mathbb R}^n, \langle X, xx^{\top} \rangle \geq 0}. That is, behind the notation {X \in \mathbb{S}^n_+} there is in fact an infinite number of linear constraints. The beauty of linear algebra is that this infinite set of constraints can be checked in finite time: one simply needs to compute the spectrum of {X} (which can be done in time {O(n^3)} with standard techniques), and verify that all the eigenvalues are non-negative! This fact also allows us to find separating hyperplanes. Indeed let {X \not\in \mathcal{X}}. First one can check each inequality {\langle X, A_i \rangle \leq b_i} in time {O(n^2)}, and if one of them is violated then one gets a separating hyperplane immediately (as in the LP case). But one can also check {X \in \mathbb{S}^n_+} in time {O(n^3)}, and if {X \not\in \mathbb{S}^n_+} then an eigenvector {v} associated to a negative eigenvalue gives a separating hyperplane: indeed {\langle X, v v^{\top} \rangle = v^{\top} X v < 0}, while {\langle Y, v v^{\top} \rangle \geq 0} for any {Y \in \mathbb{S}^n_+}. Thus here {M = O(\max(m,n) n^2)}, and the computational complexity of the ellipsoid method for SDP is {O( \max(m, n^2) n^6 )}.
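The resulting oracle is again a few lines of numpy; in this sketch the function name, argument layout, and sign conventions are mine (I return a matrix {W} whose inner product with {X} exceeds its inner product with every feasible {Y}).

```python
import numpy as np

def sdp_separation_oracle(A_list, b, X):
    """Separation oracle for {X in S^n_+ : <X, A_i> <= b_i, i = 1..m}.

    Checks the m linear constraints first (O(n^2) each), then positive
    semidefiniteness via an eigendecomposition (O(n^3)).  Returns None
    if X is feasible, else a symmetric matrix W such that <X, W> exceeds
    <Y, W> for every feasible Y.
    """
    for A_i, b_i in zip(A_list, b):
        if np.vdot(A_i, X) > b_i:            # Frobenius inner product <X, A_i>
            return A_i                        # violated linear constraint
    eigvals, eigvecs = np.linalg.eigh(X)      # eigenvalues in ascending order, O(n^3)
    if eigvals[0] < 0:                        # X is not positive semidefinite
        v = eigvecs[:, 0]                     # eigenvector of the negative eigenvalue
        return -np.outer(v, v)                # <X, -vv^T> > 0 >= <Y, -vv^T> for psd Y
    return None
```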

Conic Programming

The two above examples are specific instances of Conic Programming problems. In Conic Programming one first selects a cone {\mathcal{K} \subset {\mathbb R}^n} (that is, a set such that {\forall \gamma_1, \gamma_2 \geq 0, x_1, x_2 \in \mathcal{K}, \gamma_1 x_1 + \gamma_2 x_2 \in \mathcal{K}}), and then one replaces the condition {A x \leq b} in the definition of LP by {b - A x \in \mathcal{K}}. In LP the cone of interest is the positive orthant {\mathcal{K} = {\mathbb R}_+^m}, while in SDP it is a product of positive semi-definite cones {\mathbb{S}_+^{n_i}}. Other cones are of interest, for instance with the Lorentz cone:

\displaystyle \mathcal{K} = \{(x,t) \in {\mathbb R}^n \times {\mathbb R} : \|x\| \leq t\} ,

one gets the class of Second Order Conic Programming (SOCP) problems. An important part of Mathematical Programming is to recognize when a problem can be reformulated as a Conic Program. In this course we will not discuss this aspect of mathematical optimization too much, and the interested reader is referred to the excellent book Convex Optimization by Boyd and Vandenberghe.
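Separation over the Lorentz cone is equally cheap; here is a minimal numpy sketch (function name and conventions mine), where the correctness of the returned hyperplane follows from the Cauchy-Schwarz inequality.

```python
import numpy as np

def lorentz_separation(x, t):
    """Separation oracle for the Lorentz cone K = {(x, t) : ||x||_2 <= t}.

    Returns None if (x, t) is in K.  Otherwise returns the vector
    (x / ||x||, -1): its inner product with (x, t) is ||x|| - t > 0,
    while for any (y, s) in K it is x^T y / ||x|| - s <= ||y|| - s <= 0
    by Cauchy-Schwarz.
    """
    nx = np.linalg.norm(x)
    if nx <= t:
        return None                               # (x, t) is in K
    if nx == 0.0:                                 # then t < 0: cut with (0, -1)
        return np.append(np.zeros_like(x), -1.0)
    return np.append(x / nx, -1.0)
```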

Some ‘practical’ considerations

Let us consider a ‘small’ LP with {m \leq n \sim 1000}. Then {n^4 \sim 10^{12}}, and thus assuming that we have a computer that can perform on the order of {10^6} ‘elementary operations’ per second, one could solve this LP with the ellipsoid method in about {10^6} seconds, which is more than {10} days. This is very unsatisfactory, especially since you have seen in ORF522 that the simplex method can handle much larger LPs. Furthermore this disappointing fact is not only theoretical: the ellipsoid method does not perform well in practice.

In the next lectures we will see another class of algorithms, called Interior Point Methods (IPM), which enjoy both the nice theoretical properties of the ellipsoid method and great practical efficiency. These algorithms date back to an original proposition of Karmarkar in 1984, and they fundamentally changed the landscape of convex optimization. While the theoretical improvement in terms of computational complexity was not very impressive (we shall see that IPM can solve LP in {O(n^{3.5})}, instead of {O(n^4)} with the ellipsoid method), the practical efficiency of these algorithms was unprecedented at the time. In particular IPM are also efficient at solving SDP problems: while the ellipsoid method cannot deal with SDPs with more than a few tens of variables, IPM can go up to thousands of variables.
