So far we have seen several algorithms to solve various optimization problems. For example, for LPs we have seen that the ellipsoid method runs in polynomial time, and we improved its complexity bound with Interior Point Methods. But how can we be sure that we are not missing some amazing algorithm with a much smaller complexity? Proving that such an algorithm does not exist is a task that falls within the realm of lower bounds.
It is often said that lower bounds are more difficult than upper bounds. This is not always true (though it will be bitterly true here…). What is certainly true, however, is that lower bounds are conceptually more difficult than upper bounds. Indeed, when one wants to talk about impossibility results, one needs to have the ‘rules of the game’ carefully formalized. In particular, for us this means defining precisely what we mean by ‘computation’.
Thus our first task is to have a formal model of computation, and this is achieved via Turing Machines. Once the model is well defined we discuss the state of computational lower bounds within that framework (and as we will see, very little is known).
This post is simply a rewriting of parts of Chapter 1 of the book ‘Computational Complexity’ by Arora and Barak.
A Turing Machine is a pair $M = (Q, \delta)$, where $Q$ is a finite set that contains the set $\{q_{\mathrm{start}}, q_{\mathrm{halt}}\}$, and $\delta$ is a mapping from $Q \times \{0, 1, \Box, \triangleright\}$ to $Q \times \{0, 1, \Box, \triangleright\} \times \{L, R\}$. The set $Q$ is called the set of states of $M$, and the mapping $\delta$ is the transition function of $M$.
A Turing Machine $M$ can be applied to an input $x \in \{0,1\}^*$ in the following way. Imagine an infinite tape indexed by $\mathbb{N}$. On the location $0$ is the start symbol $\triangleright$, then the string $x$ is written on the next $|x|$ locations (from location $1$ to location $|x|$), and in all other locations the blank symbol $\Box$ is written. The Turing Machine also has a head that starts on the location $0$. Finally the state of $M$ is initialized to $q_{\mathrm{start}}$. The conditions we just described form the start configuration of $M$ on $x$. Now the Turing Machine iteratively applies the transition function as follows. Assume that the Turing Machine is in state $q$, that its head is in location $i$, and that the symbol written on the tape at location $i$ is $\sigma$. Writing $\delta(q, \sigma) = (q', \sigma', D)$, the Turing Machine changes its state to $q'$, it replaces the symbol at location $i$ by $\sigma'$, and its head moves to location $i-1$ if $D = L$ or to location $i+1$ if $D = R$ (we assume that when the head reads the symbol $\triangleright$ it always moves right and it does not modify the symbol).
We define the output of $M$ on $x$ as the string in $\{0,1\}^*$ which is written on the tape when the Turing Machine reaches the state $q_{\mathrm{halt}}$. We denote this output by $M(x)$ (note that this output is only well defined if $M$ reaches the state $q_{\mathrm{halt}}$ in finite time).
What do we do with Turing Machines? Well, we use them to compute things, and in particular to compute mappings from $\{0,1\}^*$ to itself. Let $f$ be such a mapping. We say that $M$ computes $f$ if for any $x \in \{0,1\}^*$, one has $M(x) = f(x)$. Let $T : \mathbb{N} \rightarrow \mathbb{N}$; we say that $M$ computes $f$ in $T$-time if for any $x \in \{0,1\}^*$, one has $M(x) = f(x)$, and $M$ reaches the state $q_{\mathrm{halt}}$ in at most $T(|x|)$ steps.
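To make the definitions above concrete, here is a minimal sketch of a simulator for such a machine, together with a tiny example machine that flips every bit of its input. Several things here are my own assumptions rather than part of the formal definition: the characters `>` and `_` stand in for the start and blank symbols, the names `run` and `FLIP` are made up for this illustration, the output is read as the run of bits starting at location 1, and `max_steps` is only a safety guard since a machine may never halt.

```python
from typing import Dict, Tuple

START, BLANK = ">", "_"  # stand-ins (assumed) for the start and blank symbols
Q_START, Q_HALT = "q_start", "q_halt"

# Transition function: (state, symbol read) -> (new state, symbol written, move)
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def run(delta: Delta, x: str, max_steps: int = 10_000) -> Tuple[str, int]:
    """Run the machine on input x and return (output, number of steps)."""
    # Start configuration: start symbol at location 0, x on locations 1..|x|,
    # blanks everywhere else (locations missing from the dict are blank).
    tape = {0: START}
    for i, bit in enumerate(x, start=1):
        tape[i] = bit
    state, head, steps = Q_START, 0, 0

    while state != Q_HALT:
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = tape.get(head, BLANK)
        state, written, move = delta[(state, symbol)]
        tape[head] = written
        head += 1 if move == "R" else -1
        steps += 1

    # Output convention (an assumption): the bits from location 1 onwards,
    # up to the first symbol that is not a bit.
    out = []
    i = 1
    while tape.get(i, BLANK) in ("0", "1"):
        out.append(tape[i])
        i += 1
    return "".join(out), steps


# Example machine: sweep right, flipping each bit, and halt on the first blank.
FLIP: Delta = {
    (Q_START, START): (Q_START, START, "R"),  # start symbol: untouched, move right
    (Q_START, "0"): (Q_START, "1", "R"),
    (Q_START, "1"): (Q_START, "0", "R"),
    (Q_START, BLANK): (Q_HALT, BLANK, "L"),
}

print(run(FLIP, "1011"))  # ('0100', 6)
```

On an input of length $n$ this machine takes one step to leave the start symbol, $n$ steps to flip the bits, and one step to halt on the first blank, so in the terminology above it computes the bit-flip mapping in $T$-time with $T(n) = n + 2$.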
Expressive power of Turing Machines
Any programming language can be simulated by a Turing Machine. This bold claim is even easy to prove (though we will not do it). In fact the Church thesis states that every physically realizable computation device can be simulated by a Turing Machine. Of course this is not a theorem, but it is a widely accepted belief.
A stronger version of the Church thesis states that any function which is computable in $T$-time on a physically realizable computation device can be computed by a Turing Machine in $T^c$-time (for some constant $c$). Again this is a theorem if one considers specific alternative models of computation (for instance Turing Machines with a larger alphabet than $\{0, 1, \Box, \triangleright\}$, or with multiple tapes), but in general it is just a belief. This stronger version might even be false, as suggested by some recent results on algorithms for quantum computers. However at this point we do not know if quantum computers are physically realizable.
Example: computing the solution of an LP
Consider the problem of computing the solution of a standard LP. That is, we want to construct a Turing Machine that takes as input an LP encoded as a string in $\{0,1\}^*$, and outputs its solution also encoded as a string in $\{0,1\}^*$. In this setting we need a little bit of caution. In particular we now want the LP to be defined with rational numbers rather than real numbers, since rational numbers have a trivial encoding as strings in $\{0,1\}^*$. Furthermore, to bound the time complexity of the Turing Machine (which would for instance be the code for the ellipsoid method together with Khachiyan’s rounding) we now need to control the size of the rational numbers we are manipulating! These subtleties make the formal treatment of LP much more complicated than what we did earlier in this course. However in the end one can still prove that the solution to rational LPs can be computed in polynomial-time (that is in $T$-time with $T(n) = n^c$ for some constant $c$).
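To see why the size of the rational numbers matters, here is a small sketch (not tied to the ellipsoid method itself) that tracks how many bits are needed to write down a rational number as it is repeatedly squared. The helper names `bit_size` and `sizes_under_squaring` are made up for this illustration, and counting the bits of the numerator plus those of the denominator is just one simple choice of encoding length.

```python
from fractions import Fraction
from typing import List


def bit_size(q: Fraction) -> int:
    """Bits needed to write the numerator and denominator in binary (sign ignored)."""
    return max(q.numerator.bit_length(), 1) + q.denominator.bit_length()


def sizes_under_squaring(q: Fraction, rounds: int) -> List[int]:
    """Encoding size of q, q^2, q^4, ... after each of `rounds` squarings."""
    sizes = [bit_size(q)]
    for _ in range(rounds):
        q = q * q  # a single arithmetic operation on exact rationals
        sizes.append(bit_size(q))
    return sizes


print(sizes_under_squaring(Fraction(2, 3), 4))  # [4, 7, 12, 22, 43]
```

The encoding length roughly doubles with each multiplication, so a procedure that performs only a few arithmetic operations can still produce numbers that a Turing Machine needs a very long time just to write on its tape. This is why a bound on the number of arithmetic operations alone does not give a bound on the running time.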
Is everything computable?
This section is slightly outside of the theme of optimization, but the idea is so beautiful that I could not resist presenting it. Now that we have a model of computation we can ask this very simple question: is it possible, for any function $f : \{0,1\}^* \rightarrow \{0,1\}^*$, to find a Turing Machine that computes $f$? This question has mind-blowing philosophical interpretations but I won’t go into that. At the very least it is a very cool question! The answer is equally awesome: there exist functions which are uncomputable. We can even exhibit such a function in a few lines, using a diagonalization argument (which is reminiscent of Cantor’s proof of the uncountability of the real numbers).
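First note that any Turing Machine $M$ can be encoded as a string in $\{0,1\}^*$ (one just needs to encode the state space and the transition function). We denote by $\langle M \rangle$ the encoding of $M$ as a string, and by $M_{\alpha}$ the Turing Machine that corresponds to the encoding $\alpha \in \{0,1\}^*$ (of course we can assume that any string encodes some Turing Machine). Now consider the function $\mathrm{UC}$ that outputs $0$ on $\alpha$ if and only if $M_{\alpha}$ outputs $1$ on $\alpha$, and outputs $1$ otherwise. Assume that $\mathrm{UC}$ is computable, so that there exists $M$ that can compute $\mathrm{UC}$. In particular $M$ outputs $1$ on $\langle M \rangle$ if and only if $\mathrm{UC}(\langle M \rangle) = 1$, which by definition of $\mathrm{UC}$ happens if and only if $M(\langle M \rangle) \neq 1$. We arrive at a contradiction, and thus this function is not computable!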
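The diagonalization step can be illustrated on a finite toy version of the argument. In the sketch below, a plain Python list of functions plays the role of the enumeration of all Turing Machines, and the index of a function plays the role of its encoding. This is an assumption made purely for illustration: the real argument ranges over all machines, including those that never halt. The names `diagonal` and `uc` are likewise made up for this example.

```python
from typing import Callable, List


def diagonal(machines: List[Callable[[int], int]]) -> Callable[[int], int]:
    """Build the diagonal function: 0 if machine i outputs 1 on i, and 1 otherwise."""
    def uc(i: int) -> int:
        return 0 if machines[i](i) == 1 else 1
    return uc


# A toy 'enumeration' of machines; each one maps an index to a bit.
machines = [
    lambda i: 0,
    lambda i: 1,
    lambda i: i % 2,
    lambda i: 1 if i == 3 else 0,
]

uc = diagonal(machines)

# uc differs from machine number i on input i, so it is none of the listed machines.
disagreements = [uc(i) != machines[i](i) for i in range(len(machines))]
print(disagreements)  # [True, True, True, True]
```

Whatever list we start from, the diagonal function disagrees with the $i$-th entry on input $i$, so it cannot appear in the list. The argument above is the same observation applied to the list of all Turing Machines.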
Let us step back for a moment and recall why we went to the trouble of precisely defining a model of computation. Our primary goal was to show some impossibility results that would yield ‘some kind’ of optimality for the algorithms that we derived in the previous lectures. Unfortunately we have absolutely no idea how to do that. We basically cannot say anything beyond trivial statements such as: ‘well, your Turing Machine had better read the entire input…’ (this is not entirely true, for instance slightly more refined things can be said when you restrict the space complexity of your Turing Machine, see this paper by Ryan Williams). This state of affairs is immensely disappointing in my opinion. In the next lecture I will present the now standard NP-completeness argument, which can be interpreted as a way to certify that certain problems are indeed very difficult.