Recall from the previous lecture that we can 'solve' the problem $\min_{x \in K} f(x)$ (with $K$ a convex body in $\mathbb{R}^n$ and $f$ a convex function) in 'time' $O\big(\max(c, n^2)\, n^2 \log(1/\varepsilon)\big)$, where $c$ is the computational complexity of finding a separating hyperplane from $K$ at a point $x \notin K$, and of computing the zeroth and first order oracles for $f$.
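As a refresher, the iteration behind this bound can be sketched as follows. This is a minimal central-cut ellipsoid implementation in Python; the toy problem (minimizing $\|x\|_2^2$ over the unit ball) and all names and parameter choices are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def ellipsoid_method(x0, P0, separation_oracle, grad, f, iters=200):
    """Central-cut ellipsoid method (sketch).

    (x0, P0) describes the initial ellipsoid {y : (y-x0)^T P0^{-1} (y-x0) <= 1},
    which must contain the feasible set K.  separation_oracle(x) returns None if
    x is in K, and otherwise a vector g with g^T (y - x) <= 0 for all y in K.
    """
    n, x, P = len(x0), x0.copy(), P0.copy()
    best_x, best_val = None, np.inf
    for _ in range(iters):
        g = separation_oracle(x)
        if g is None:                       # center is feasible: cut with the gradient
            if f(x) < best_val:
                best_x, best_val = x.copy(), f(x)
            g = grad(x)
        g = g / np.sqrt(g @ P @ g)          # normalize in the ellipsoid metric
        x = x - (1.0 / (n + 1)) * P @ g     # new center
        P = (n**2 / (n**2 - 1.0)) * (P - (2.0 / (n + 1)) * np.outer(P @ g, P @ g))
    return best_x, best_val

# Toy instance: minimize ||x||^2 over the unit ball (optimum 0 at the origin).
oracle = lambda x: x if np.linalg.norm(x) > 1 else None
x, val = ellipsoid_method(np.array([1.0, 1.0]), 9.0 * np.eye(2),
                          oracle, lambda x: 2 * x, lambda x: x @ x)
```

Each iteration costs $O(\max(c, n^2))$: one oracle call plus the rank-one update of $P$, which matches the accounting above.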
We will now discuss two important classes of convex problems of the above form, and we will see the computational complexity of solving these problems with the ellipsoid method.
Linear Programming (LP)
The class LP consists of problems where $f(x) = c^{\top} x$ for some $c \in \mathbb{R}^n$, and $K = \{x \in \mathbb{R}^n : Ax \leq b\}$ for some $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. As you have seen in ORF522, there is a plethora of real-life problems that can be formulated as LPs.
Let us see what is the computational complexity of solving LPs with the ellipsoid method. Clearly computing the zeroth and first order oracles can be done in $O(n)$. Now let us see what is the complexity of finding a separating hyperplane between $x \notin K$ and $K$. Let $a_i$ be the $i$-th row of $A$; then one simply needs to find $i$ such that $a_i^{\top} x > b_i$ (such an $i$ exists since $x \notin K$), and this gives a separating hyperplane (since for any $y \in K$, one has $a_i^{\top} y \leq b_i < a_i^{\top} x$). Thus the complexity of finding a separating hyperplane is $O(mn)$. In total this gives a complexity of order $O(\max(m, n)\, n^3 \log(1/\varepsilon))$ to find an $\varepsilon$-optimal solution with the ellipsoid method.
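In code, the row-scanning oracle just described might look as follows (a minimal numpy sketch; the names are ours, not from the lecture):

```python
import numpy as np

def lp_separation_oracle(A, b, x):
    """Return None if A x <= b, else (i, a_i) for a violated row a_i^T x > b_i.

    The row a_i is then a separating hyperplane: a_i^T y <= b_i < a_i^T x for
    every y in K.  Scanning the m rows costs O(m n), as claimed above.
    """
    violations = A @ x - b
    i = int(np.argmax(violations))
    if violations[i] <= 0:
        return None                      # x is feasible
    return i, A[i]

A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([1.0, 1.0, 0.0])
feasible = lp_separation_oracle(A, b, np.array([0.5, 0.5]))   # None
infeasible = lp_separation_oracle(A, b, np.array([2.0, 0.0])) # row 0 violated
```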
Historically a great breakthrough happened when Khachiyan showed in 1979 that in fact the ellipsoid method (with a clever rounding technique) can solve LPs exactly in polynomial time. As far as we are concerned, an $\varepsilon$-optimal point is good enough, and we will not discuss rounding techniques.
Semidefinite Programming (SDP)
The class SDP consists of problems where the optimization variable is a symmetric matrix $X \in \mathbb{R}^{n \times n}$. Let $\mathbb{S}^n$ be the space of $n \times n$ symmetric matrices (respectively $\mathbb{S}^n_+$ is the space of positive semi-definite matrices), and let $\langle A, B \rangle = \mathrm{Tr}(A^{\top} B)$ be the Frobenius inner product (recall that it can be written as $\langle A, B \rangle = \sum_{i,j} A_{i,j} B_{i,j}$). In the class SDP the problems are of the following form: $f(X) = \langle X, C \rangle$ for some $C \in \mathbb{S}^n$, and $K = \{X \in \mathbb{S}^n_+ : \langle X, A_i \rangle \leq b_i, \; i \in \{1, \dots, m\}\}$ for some $A_1, \dots, A_m \in \mathbb{S}^n$ and $b \in \mathbb{R}^m$.
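The two expressions for the Frobenius inner product are easy to check numerically (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2   # random symmetric matrices
B = rng.standard_normal((4, 4)); B = (B + B.T) / 2

inner_trace = np.trace(A.T @ B)   # <A, B> = Tr(A^T B)
inner_sum = np.sum(A * B)         # = sum over i, j of A_ij * B_ij
assert np.isclose(inner_trace, inner_sum)
```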
The expressive power of SDP is not obvious at first sight, but we will see some wonderful examples later on in the course. Bear with me for the moment, SDP is an important class of problems.
The difference between SDP and LP lies in the constraint $X \succeq 0$. Indeed this constraint can be written as $\varphi^{\top} X \varphi \geq 0, \; \forall \varphi \in \mathbb{R}^n$. That is, behind the notation $X \succeq 0$ there is in fact an infinite number of linear constraints. The beauty of linear algebra is that this infinite set of constraints can be checked in finite time: one simply needs to compute the spectrum of $X$ (which can be done in $O(n^3)$ time with standard techniques), and verify that all the eigenvalues are non-negative! This fact also allows us to find separating hyperplanes. Indeed let $X \notin K$. First one can check each inequality $\langle X, A_i \rangle \leq b_i$ in time $O(n^2)$, and if one of them is violated then one gets a separating hyperplane immediately (as in the LP case). But one can also check $X \succeq 0$ in time $O(n^3)$, and if $X \not\succeq 0$ then an eigenvector $\varphi$ associated to a negative eigenvalue gives a separating hyperplane, since $\langle X, \varphi \varphi^{\top} \rangle < 0 \leq \langle Y, \varphi \varphi^{\top} \rangle$ for any $Y \succeq 0$. Thus here $c = O(\max(m, n)\, n^2)$, and since the ellipsoid method now runs in dimension $n(n+1)/2 = O(n^2)$, its computational complexity for SDP is $O(\max(m, n^2)\, n^6 \log(1/\varepsilon))$.
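A sketch of this two-stage oracle (hypothetical names; for a uniform sign convention it returns a matrix $G$ with $\langle Y - X, G \rangle < 0$ for all $Y \in K$, so for the eigenvector case it returns $-\varphi \varphi^{\top}$):

```python
import numpy as np

def sdp_separation_oracle(A_list, b, X, tol=1e-9):
    """Separation oracle for K = {X psd : <X, A_i> <= b_i, i = 1..m} (sketch).

    First the m linear constraints are checked in O(m n^2) total, then
    psd-ness is checked via the spectrum in O(n^3).
    """
    for A_i, b_i in zip(A_list, b):
        if np.sum(A_i * X) > b_i + tol:   # <X, A_i> > b_i: violated, as in LP
            return A_i
    w, V = np.linalg.eigh(X)              # spectrum of X (ascending eigenvalues)
    if w[0] < -tol:                       # X is not psd
        phi = V[:, 0]                     # eigenvector of a negative eigenvalue
        # <X, phi phi^T> = w[0] < 0 <= <Y, phi phi^T> for every psd Y
        return -np.outer(phi, phi)
    return None                           # X is in K

X = np.array([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1 and 3: not psd
G = sdp_separation_oracle([np.eye(2)], np.array([10.0]), X)
```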
The two above examples are specific instances of Conic Programming problems. In Conic Programming one first selects a cone $\mathcal{K}$ (which is a set such that $x \in \mathcal{K}$ and $\lambda \geq 0$ imply $\lambda x \in \mathcal{K}$), and then one replaces the non-negativity condition $x \geq 0$ in the standard form of LP by $x \in \mathcal{K}$. In LP the cone of interest is the positive orthant $\mathcal{K} = \mathbb{R}^n_+$, while in SDP it is (a product of) positive semi-definite cones $\mathbb{S}^n_+$. Other cones are of interest, for instance with the Lorentz cone:

$$\mathcal{K} = \Big\{ x \in \mathbb{R}^n : \sqrt{x_1^2 + \dots + x_{n-1}^2} \leq x_n \Big\},$$
one gets the class of Second Order Conic Programming (SOCP) problems. An important part of Mathematical Programming is to recognize when a problem can be reformulated as a Conic Program. In this course we will not discuss this aspect of mathematical optimization too much, and the interested reader is referred to the excellent book Convex Optimization by Boyd and Vandenberghe.
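Membership in the Lorentz cone is a one-line check (a sketch, assuming the convention $\sqrt{x_1^2 + \dots + x_{n-1}^2} \leq x_n$ used above):

```python
import numpy as np

def in_lorentz_cone(x):
    """Membership test for the Lorentz cone {x : ||x_{1..n-1}||_2 <= x_n}."""
    return np.linalg.norm(x[:-1]) <= x[-1]

# The defining cone property: x in K and lam >= 0 imply lam * x in K.
x = np.array([3.0, 4.0, 5.0])             # ||(3, 4)||_2 = 5 <= 5
assert in_lorentz_cone(x) and in_lorentz_cone(2.5 * x)
assert not in_lorentz_cone(np.array([1.0, 1.0, 1.0]))
```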
Some ‘practical’ considerations
Let us consider a 'small' LP with, say, $m = n = 5000$. Then $\max(m, n)\, n^3 \approx 6 \times 10^{14}$ (ignoring the logarithmic factor), and thus assuming that we have a computer that can do of order of $10^9$ 'elementary operations' per second, one could solve this LP with the ellipsoid method in about $6 \times 10^5$ seconds, which is more than $7$ days. This is very unsatisfactory, especially since you have seen in ORF522 that the simplex method can handle much larger LPs. Furthermore this disappointing fact is not only theoretical: the ellipsoid method does not perform well in practice.
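As a sanity check on this back-of-the-envelope estimate (the sizes $m = n = 5000$ and the machine speed are illustrative assumptions, and logarithmic factors are ignored):

```python
m, n = 5000, 5000                  # illustrative 'small' LP sizes (assumed)
ops = max(m, n) * n**3             # ellipsoid-method operation count, no log factor
seconds = ops / 1e9                # at an assumed 10^9 elementary operations/second
days = seconds / 86400
print(round(days, 1))              # over a week of computation
```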
In the next lectures we will see another class of algorithms, called Interior Point Methods (IPM), which enjoy both the nice theoretical properties of the ellipsoid method and great efficiency in practice. These algorithms date back to an original proposition of Karmarkar in 1984, and they fundamentally changed the landscape of convex optimization. While the theoretical improvement in terms of computational complexity was not very impressive (we shall see that IPM can solve an LP with $O(\sqrt{m} \log(1/\varepsilon))$ iterations of Newton's method, instead of the $O(n^2 \log(1/\varepsilon))$ iterations of the ellipsoid method), the practical efficiency of these algorithms was unprecedented at the time. In particular IPM are also efficient for solving SDP problems: while the ellipsoid method cannot deal with SDPs with more than a few tens of variables, IPM can go up to thousands of variables.