We have seen that, roughly, the complexity of Interior Point Methods to optimize linear functions on a convex set (with non-empty interior) with a
-self-concordant barrier
is of order
where
is the complexity of computing the Newton direction (which can be done by computing and inverting the Hessian of
). Thus the efficiency of the method is directly related to the form of the self-concordant barrier that we can construct for
. A few rules allows us to construct simple self-concordant barriers for the important cases of LPs and SDPs:
Proposition 7 Let
be a
-self-concordant barrier on
, and
a
-self-concordant barrier on
. Then
is a
-self-concordant barrier on
.
Proposition 8 Let
be a linear map from
to
. Let
be a
-self-concordant barrier on
. Then
is a
-self-concordant barrier on
.
Proposition 9
is an
-self-concordant barrier on
, and
is an
-self-concordant barrier on
.
Note that the only non-trivial statement in the above propositions is the fact that is an
-self concordant barrier on
.
Let us see how to use these rules to construct a self-concordant barrier for LPs. First consider the following LP, which does not contain equality constraints:
We need to construct a self-concordant barrier on . Consider the linear map
, then clearly
, and thus using the composition rule and the fact that
is an
-self-concordant barrier on
, we obtain that
is an
-self-concordant barrier on
. Computing the Hessian takes
, and inverting it takes
. Thus the total complexity of IPM for this problem is
. In particular compared to the ellipsoid method we have a speed-up of
, and thus (at least theoretically) IPM is more interesting than the ellipsoid method for this problem when
.
Why did we restrict ourselves to LPs without equality constraints? Because in that case the set is either empty, or it has a non-empty interior in
, and thus the whole theory that we developed above applies directly. If we consider an LP which may contain equality constraints, that is the constraint set is of the form
, then
might have an empty interior in
. In that case the trick is to consider the subspace generated by the equality constraints, and do all the computations in that subspace (this will change the evaluation of the Newton direction!). Let us see more precisely how this works. We consider now the general LP:
Clearly by introducing slack variables this problem can be rewritten in the following form:
Note that (respectively
) in (5) corresponds to
(respectively to
) in (4). Now to solve (5) we can use the restriction of the function
to the subspace
which is an
-self-concordant barrier on
. To do everything in a clean way, we should reparametrize
as
, and do the computations of gradients and Hessians of
with this reparametrization. While this would be cleaner, there is an equivalent way to do this, which is more powerful in terms of implementation. The idea is that when we are looking for a direction
to perform a Newton step on some function
defined on a subset of
with non-empty interior, but that we are trying to optimize
over a subspace
, we can simply restrict our attention to directions that end up in
. This trick allows us to differentiate
in
, which is more elegant. The Newton direction at
is thus given by:
In other words a Newton step requires to solve an optimization problem of the form:
where ,
and rows of
linearly independent, and
. Using the KKT conditions, we obtain that at a pair of primal-dual optimal points
the gradient of the Lagrangian
must be zero, which means
, that is
. Furthermore we also have that the point
must be valid, that is
, which can be rewritten as
. Thus we can solve (6) with
(note that
) and
, and these computations can be done in
(since
). Thus we arrive at a complexity of
to solve (5), which corresponds to a complexity of
to solve (4) (slightly better results can be obtained by being more careful).
For the general SDP:
one can use the restriction of to
, which is
-self-concordant on the set of constraints in the above SDP. One needs to work out the details for the gradients and Hessian of
, but eventually one can get to an overall complexity of
, which is much more reasonable than what we obtained with the ellipsoid method!
By Chenyi February 21, 2013 - 9:59 pm
Can you please explain how does the equation above (6), by which we calculate the Newton direction h, come from? (actually I think equation (6) is of the same thing)
I think all the following words are using KKT conditions to solve h from this sub-problem, but I am not sure how do you formulate this problem.
Thank you!
By Sebastien Bubeck February 22, 2013 - 9:20 am
Equation (6) and what follows afterwards will be explained in lecture next week.