Lectures 2 and 3. Proof of Kadison-Singer (1)
The goal of the following lectures is to prove the following result about random matrices due to Marcus, Spielman, and Srivastava. This result readily implies a positive resolution of the Kadison-Singer conjecture, as was explained in detail in the previous lecture.
Theorem. Let $v_1, \ldots, v_n$ be independent random vectors in $\mathbb{C}^d$ whose distribution has finite support. Suppose that
$$\sum_{i=1}^n \mathbb{E}[v_iv_i^*] = I \quad\text{and}\quad \mathbb{E}\|v_i\|^2 \le \varepsilon \ \text{ for all } i.$$
Then
$$\Big\|\sum_{i=1}^n v_iv_i^*\Big\| \le (1+\sqrt{\varepsilon})^2$$
with positive probability.
Define the random matrix $M$ and its characteristic polynomial $\chi[M]$ as
$$M = \sum_{i=1}^n v_iv_i^*, \qquad \chi[M](z) = \det(zI - M).$$
As $M$ is positive semidefinite, one representation of the matrix norm is $\|M\| = \mathrm{maxroot}(\chi[M])$, the largest root of the characteristic polynomial. The proof of the above theorem consists of two main parts:
- Show that $\mathrm{maxroot}(\chi[M]) \le \mathrm{maxroot}(\mathbb{E}\chi[M])$ with positive probability.
- Show that $\mathrm{maxroot}(\mathbb{E}\chi[M]) \le (1+\sqrt{\varepsilon})^2$.
The present post is devoted to the proof of the first statement. The second statement will be the subject of the subsequent lectures.
Interlacing roots of polynomials
How can we prove an inequality such as $\mathrm{maxroot}(\chi[M]) \le \mathrm{maxroot}(\mathbb{E}\chi[M])$? The key issue that we face is that the functional $p \mapsto \mathrm{maxroot}(p)$ is extremely nonlinear, and it is not at all clear how it behaves when you take expectations. This problem already arises in the very simplest setting: what can we say about the locations of the roots of a convex combination $tp_1 + (1-t)p_0$ of two polynomials $p_0$ and $p_1$? In general, there is little hope to say anything concrete. However, if the polynomials $p_0$ and $p_1$ have some special properties, then we are in much better shape.
Lemma. Let $p_0$ and $p_1$ be polynomials that are monic (their leading coefficient is one) and of the same degree. Suppose that the polynomial $tp_1 + (1-t)p_0$ is real-rooted (all its roots are real) for every $t \in [0,1]$. Then
$$\min(\mathrm{maxroot}(p_0), \mathrm{maxroot}(p_1)) \le \mathrm{maxroot}(tp_1 + (1-t)p_0) \le \max(\mathrm{maxroot}(p_0), \mathrm{maxroot}(p_1))$$
for every $t \in [0,1]$.
The message of this lemma is that under the given assumptions, the maximal root of the convex combination $tp_1 + (1-t)p_0$ interlaces between the maximal roots of $p_0$ and $p_1$ (in fact, a similar property holds for all the other roots as well, but we will not need this fact).
Proof. Let us assume without loss of generality that $\mathrm{maxroot}(p_0) \le \mathrm{maxroot}(p_1)$, and write $p_t := tp_1 + (1-t)p_0$, so we must prove
$$\mathrm{maxroot}(p_0) \le \mathrm{maxroot}(p_t) \le \mathrm{maxroot}(p_1) \quad\text{for every } t \in [0,1].$$
Proof of upper bound. Because the polynomials are monic, $p_0$ and $p_1$ are strictly increasing for real arguments to the right of their maximal roots. Thus $p_0(x)$ and $p_1(x)$ are both strictly positive for $x > \mathrm{maxroot}(p_1)$. But this means that $p_t(x)$ is also strictly positive for $x > \mathrm{maxroot}(p_1)$, and thus $p_t$ cannot have a root in this region. This implies that $\mathrm{maxroot}(p_t) \le \mathrm{maxroot}(p_1)$.
Proof of lower bound. Of course, this is the interesting part. By the same argument as for the upper bound, it is clear that $\mathrm{maxroot}(p_t) = \mathrm{maxroot}(p_0)$ if $\mathrm{maxroot}(p_0) = \mathrm{maxroot}(p_1)$: in this case $p_t$ is strictly positive to the right of this common point, at which it vanishes. We can therefore assume without loss of generality that $\mathrm{maxroot}(p_0) < \mathrm{maxroot}(p_1)$.
We distinguish two cases. Note that $p_0(x) > 0$ for $x > \mathrm{maxroot}(p_0)$. Suppose first that $p_1(x) < 0$ for all $x \in [\mathrm{maxroot}(p_0), \mathrm{maxroot}(p_1))$. Then $p_0$ and $p_1$ must look something like this:
This means that for any $t \in (0,1)$, we have $p_t(x) = t\,p_1(x) < 0$ at $x = \mathrm{maxroot}(p_0)$ and $p_t(x) = (1-t)\,p_0(x) > 0$ at $x = \mathrm{maxroot}(p_1)$. Thus $p_t$ must have a root between $\mathrm{maxroot}(p_0)$ and $\mathrm{maxroot}(p_1)$ by the intermediate value theorem. In particular, $\mathrm{maxroot}(p_t) \ge \mathrm{maxroot}(p_0)$.
Now suppose that $p_1(y) \ge 0$ for some $y \in [\mathrm{maxroot}(p_0), \mathrm{maxroot}(p_1))$, and let $r$ be the second largest root of $p_1$. Then $p_0$ and $p_1$ must look something like this:
(Note that it could be that $p_1(y) = 0$, or that $r$ is a double root at a local maximum of $p_1$; then the picture looks slightly different, but this will not change our argument.) We now argue as follows. The polynomial $p_0$ has no roots in the interval $(\mathrm{maxroot}(p_0), \infty)$, but the polynomial $p_1$ has at least two roots in this interval (counting multiplicities). Thus as we let $t$ vary from $1$ to $0$, the number of roots of the polynomial $p_t$ must jump at some point $t^*$ from having at least two roots to having zero roots in the interval $(\mathrm{maxroot}(p_0), \infty)$. (The roots cannot slide out through the endpoint: if $p_1(\mathrm{maxroot}(p_0)) = 0$, then every $p_t$ has a root at $\mathrm{maxroot}(p_0)$ and we are done; otherwise $p_t(\mathrm{maxroot}(p_0)) = t\,p_1(\mathrm{maxroot}(p_0)) \ne 0$ for all $t > 0$.) The polynomial $p_{t^*}$ must therefore look like this:
This means that for any $s > 0$ sufficiently small, the polynomial $p_{t^*-s}$ loses two real roots in the interval $(\mathrm{maxroot}(p_0), \infty)$. As long as these roots do not reappear elsewhere on the real line, this implies that this polynomial has fewer real roots than its degree: and, therefore, it must have at least two roots in the complex plane away from the real line. This contradicts the assumption that all convex combinations of $p_0$ and $p_1$ are real-rooted.
Thus it only remains to argue what happens if the two roots that are lost when we go from $p_{t^*}$ to $p_{t^*-s}$ reappear elsewhere on the real line. Note that for new real roots to appear, a local maximum of $p_{t^*}$ that was strictly below zero (or a local minimum that was strictly above zero) must have moved above (or below) zero in $p_{t^*-s}$. As the graph of a polynomial is continuous in its coefficients, we can always avoid this by choosing $s$ to be sufficiently small.
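As an illustration (not part of the proof), the lemma is easy to probe numerically. The sketch below, assuming numpy is available, uses the pair $\chi[A]$ and $\chi[A + vv^\top]$ for a symmetric matrix $A$: by Cauchy's interlacing theorem these two polynomials have a common interlacing, which is known to imply that all their convex combinations are real-rooted, so the lemma applies.

```python
import numpy as np

rng = np.random.default_rng(0)

def maxroot(coeffs):
    """Largest real part among the roots of the polynomial with these coefficients."""
    return max(r.real for r in np.roots(coeffs))

d = 4
G = rng.standard_normal((d, d))
A = G @ G.T                        # symmetric (positive semidefinite) matrix
v = rng.standard_normal(d)
p0 = np.poly(A)                    # monic characteristic polynomial of A
p1 = np.poly(A + np.outer(v, v))   # char. polynomial of a rank-one perturbation

for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    pt = (1 - t) * p0 + t * p1
    lo, hi = sorted([maxroot(p0), maxroot(p1)])
    assert lo - 1e-7 <= maxroot(pt) <= hi + 1e-7   # conclusion of the lemma
```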
Once we know how to deal with two polynomials, we can easily generalize as follows.
Corollary. Let $p_1, \ldots, p_m$ be monic polynomials of the same degree. Suppose that all convex combinations of $p_1, \ldots, p_m$ are real-rooted. Then
$$\min_i \mathrm{maxroot}(p_i) \le \mathrm{maxroot}\Big(\sum_{i=1}^m \lambda_i p_i\Big) \le \max_i \mathrm{maxroot}(p_i)$$
for all convex weights $\lambda_i \ge 0$, $\sum_{i=1}^m \lambda_i = 1$.
Proof. By induction: assume the result is true for a given number $m$ of polynomials; then the result is also true for $m+1$ polynomials, as any convex combination of $m+1$ polynomials is a convex combination of two polynomials (namely $p_{m+1}$ and a convex combination of $p_1, \ldots, p_m$) to which the previous lemma applies.
Mixed characteristic polynomials
In order to apply the interlacing property developed above to the Kadison-Singer problem, we must show that the characteristic polynomial $\chi[M]$ satisfies the necessary assumptions. Recall that the random matrix $M = \sum_{i=1}^n v_iv_i^*$ has a distribution with finite support; that is, each $v_i$ is a random variable that can take one of finitely many possible values. Our aim is to show that there exists a realization of $v_1, \ldots, v_n$ such that
$$\mathrm{maxroot}(\chi[M]) \le \mathrm{maxroot}(\mathbb{E}\chi[M]),$$
where the expectation $\mathbb{E}\chi[M]$ is clearly a convex combination of the polynomials $\chi$ evaluated at the finitely many possible realizations of $M$. In order to apply the interlacing property, we must show not only that each of these polynomials is real-rooted (this is obvious, as the realizations of $M$ are Hermitian matrices), but also that convex combinations of these polynomials are real-rooted. The latter is far from obvious: we must reason about the properties of convex combinations of determinants, such as $\mathbb{E}\det(zI - M)$. Determinants are nonlinear objects, and so it is not at all clear how to address such questions.
The stroke of luck in the Kadison-Singer problem is that the matrix $M = \sum_i v_iv_i^*$ is a sum of rank-one matrices. Even though the determinant is nonlinear, it is linear with respect to rank-one perturbations. This crucial fact will allow us to express convex combinations of the determinants that appear in our problem in terms of another determinant, for which it will prove to be easier to establish real-rootedness.
Lemma. (Sylvester's determinant identity) $\det(I + uv^*) = 1 + v^*u$ for any vectors $u, v$.
Proof. Let $A$ and $B$ be nonsingular square matrices. Then
$$\det(I + AB) = \det(A^{-1}(I + AB)A) = \det(I + BA).$$
By continuity, this remains valid for singular matrices $A, B$. The result follows by letting $A$ be the matrix whose first column is $u$ and $B$ be the matrix whose first row is $v^*$ (all other entries are zero): then $AB = uv^*$, while $BA$ has the single nonzero entry $v^*u$ in its upper-left corner, so that $\det(I + BA) = 1 + v^*u$.
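Both the general identity and its rank-one specialization are easy to verify numerically; a quick sanity check, assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))

# the general identity det(I + AB) = det(I + BA) used in the proof
lhs = np.linalg.det(np.eye(d) + A @ B)
rhs = np.linalg.det(np.eye(d) + B @ A)
assert np.isclose(lhs, rhs)

# the rank-one specialization det(I + u v^T) = 1 + v^T u
u, v = rng.standard_normal(d), rng.standard_normal(d)
assert np.isclose(np.linalg.det(np.eye(d) + np.outer(u, v)), 1 + v @ u)
```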
Corollary. The function $t \mapsto \det(A + t\,vv^*)$ is linear in $t$ (that is, a polynomial of degree at most one in $t$).
Proof. If $A$ is nonsingular, we can write
$$\det(A + t\,vv^*) = \det(A)\det(I + t\,A^{-1}vv^*) = \det(A)\,(1 + t\,v^*A^{-1}v),$$
which is linear in $t$. As the limit of linear functions is linear, the result remains valid for singular $A$.
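Again a quick numerical check (assuming numpy): a function that is linear in $t$ in this sense must satisfy $f(t) = f(0) + t\,(f(1) - f(0))$ for all $t$.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
A = rng.standard_normal((d, d))   # an arbitrary (not necessarily invertible) matrix
v = rng.standard_normal(d)

def f(t):
    """f(t) = det(A + t vv^T), which the corollary asserts is linear in t."""
    return np.linalg.det(A + t * np.outer(v, v))

for t in [-1.0, 0.5, 2.0, 3.7]:
    assert np.isclose(f(t), f(0) + t * (f(1) - f(0)), atol=1e-8)
```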
What we would really like is that $\mathbb{E}\det(A + vv^*) = \det(A + \mathbb{E}vv^*)$ for a random vector $v$: that would make it very easy to handle convex combinations of characteristic polynomials. Unfortunately, this is wishful thinking: the matrix $\mathbb{E}vv^*$ is not of rank one, so this does not follow from the linearity of rank-one perturbations. Nonetheless, this is not such a bad idea. We can use the linearity of rank-one perturbations to rewrite the characteristic polynomial of a sum of rank-one matrices in terms of a different quantity, which proves to be linear even beyond the rank-one setting.
To this end, let us define the multivariate polynomial
$$p(z, t_1, \ldots, t_n) = \det\Big(zI + \sum_{i=1}^n t_i\, v_iv_i^*\Big).$$
Then $\chi[M](z) = p(z, -1, \ldots, -1)$. Moreover, we have just shown that $p$ is linear in $t_i$ for every fixed value of $z$ and $t_j$, $j \ne i$; that is, $p$ is multilinear in $t_1, \ldots, t_n$. Using that a linear function equals its first-order Taylor expansion, we can write
$$p(z, t_1, \ldots, t_n) = \Big(1 + t_1 \frac{\partial}{\partial t_1}\Big) p(z, t_1, \ldots, t_n)\Big|_{t_1 = 0}.$$
Iterating this device, we obtain
$$p(z, t_1, \ldots, t_n) = \prod_{i=1}^n \Big(1 + t_i \frac{\partial}{\partial t_i}\Big)\, p(z, t_1, \ldots, t_n)\Big|_{t_1 = \cdots = t_n = 0}.$$
Setting $t_1 = \cdots = t_n = -1$, we obtain the identity
$$\chi[M](z) = \prod_{i=1}^n \Big(1 - \frac{\partial}{\partial t_i}\Big) \det\Big(zI + \sum_{j=1}^n t_j\, v_jv_j^*\Big)\Big|_{t_1 = \cdots = t_n = 0}.$$
The quantity on the right is called a mixed characteristic polynomial.
Definition. The mixed characteristic polynomial of square matrices $B_1, \ldots, B_n$ is defined as
$$\mu[B_1, \ldots, B_n](z) = \prod_{i=1}^n \Big(1 - \frac{\partial}{\partial t_i}\Big) \det\Big(zI + \sum_{j=1}^n t_j B_j\Big)\Big|_{t_1 = \cdots = t_n = 0}.$$
Note that if all $B_i = v_iv_i^*$ are rank one, we have shown that
$$\mu[v_1v_1^*, \ldots, v_nv_n^*](z) = \chi\Big[\sum_{i=1}^n v_iv_i^*\Big](z).$$
That is, the mixed characteristic polynomial of $B_1, \ldots, B_n$ coincides with the characteristic polynomial of $B_1 + \cdots + B_n$ when all $B_i$ are rank one. This is no longer true when the matrices $B_i$ are not of rank one! Nonetheless, the mixed characteristic polynomial is an extremely useful device for the following reason.
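The rank-one identity can be checked numerically from the definition. In the sketch below (assuming numpy), the mixed partial derivatives at $t = 0$ are computed exactly as finite differences over subsets of the variables, which is legitimate precisely because $p$ is multilinear in $t_1, \ldots, t_n$ in the rank-one case:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 3, 3
vs = [rng.standard_normal(d) for _ in range(n)]
Bs = [np.outer(v, v) for v in vs]          # rank-one matrices v_i v_i^T

def p(z, t):
    """p(z, t_1, ..., t_n) = det(zI + sum_i t_i B_i)."""
    return np.linalg.det(z * np.eye(d) + sum(ti * Bi for ti, Bi in zip(t, Bs)))

def mu(z):
    """Apply prod_i (1 - d/dt_i) to p and set t = 0.  Since p is multilinear
    in t here, each mixed partial at 0 is an exact finite difference over
    the vertices of the unit cube spanned by the differentiated subset S."""
    total = 0.0
    for S in range(1 << n):                # bitmask: subset of differentiated t_i
        dS = 0.0
        T = S
        while True:                        # sum over subsets T of S
            t = [1.0 if (T >> i) & 1 else 0.0 for i in range(n)]
            dS += (-1) ** bin(S ^ T).count("1") * p(z, t)
            if T == 0:
                break
            T = (T - 1) & S
        total += (-1) ** bin(S).count("1") * dS
    return total

M = sum(Bs)
for z in [0.3, 1.0, 2.5]:                  # mu agrees with det(zI - M)
    assert np.isclose(mu(z), np.linalg.det(z * np.eye(d) - M))
```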
Lemma. $\mu[B_1, \ldots, B_n]$ is multilinear in $B_1, \ldots, B_n$.
Proof. The determinant $\det(zI + \sum_j t_j B_j)$ is a polynomial in $t_1, \ldots, t_n$ whose coefficients are polynomials in the entries of the matrices $B_1, \ldots, B_n$, and each term in the product $\prod_i (1 - \partial/\partial t_i)$ is a product of derivatives with respect to distinct variables $t_i$. In order for such a derivative of one of the terms of the polynomial to be nonzero after we set $t_1 = \cdots = t_n = 0$, this term can contain each variable $t_i$ at most once: otherwise the derivative of this term must be proportional to $t_i$ for some $i$, so the derivative of this term will vanish when we set $t_i = 0$. Thus the only terms that survive in the mixed characteristic polynomial are products of entries of the matrices $B_i$ in which each matrix $B_i$ appears at most once. Such terms are clearly multilinear in $B_1, \ldots, B_n$.
A simple consequence of this result is, for example, that
$$\mathbb{E}\chi[M] = \mathbb{E}\,\mu[v_1v_1^*, \ldots, v_nv_n^*] = \mu[\mathbb{E}v_1v_1^*, \ldots, \mathbb{E}v_nv_n^*],$$
where we have used multilinearity and the independence of the vectors $v_1, \ldots, v_n$. Let us emphasize that the right-hand side in this expression cannot itself be expressed as the characteristic polynomial of $\mathbb{E}M$, for example. Nonetheless, we have achieved something very useful: we have expressed a convex combination of determinants in terms of the derivative of a determinant. It turns out that it is much easier to establish real-rootedness of derivatives of determinants, than it is to establish real-rootedness of convex combinations of determinants. Our next order of business is to understand why this is true.
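This identity can also be verified symbolically on a toy example. The sketch below assumes sympy is available and uses $n = 2$ independent vectors, each uniform on two arbitrarily chosen integer values; the mixed characteristic polynomial is computed directly from its definition:

```python
import sympy as sp
from itertools import product

z = sp.symbols("z")
n, d = 2, 2
t = sp.symbols(f"t0:{n}")
# toy finite-support distribution: each v_i uniform on two integer vectors
support = [
    [sp.Matrix([1, 2]), sp.Matrix([0, 1])],
    [sp.Matrix([1, -1]), sp.Matrix([2, 0])],
]

def mixed_char(Bs):
    """mu[B_1,...,B_n](z): apply prod_i (1 - d/dt_i) to det(zI + sum_i t_i B_i),
    then set t = 0."""
    expr = (z * sp.eye(d) + sum((t[i] * Bs[i] for i in range(n)), sp.zeros(d, d))).det()
    for i in range(n):
        expr = expr - sp.diff(expr, t[i])
    return sp.expand(expr.subs({ti: 0 for ti in t}))

# E chi[M] as an average over the four equally likely realizations of (v_1, v_2)
Echi = sum(
    sp.Rational(1, 4) * (z * sp.eye(d) - w1 * w1.T - w2 * w2.T).det()
    for w1, w2 in product(support[0], support[1])
)
# mu evaluated at the expected (rank-two) matrices E v_i v_i^T
EB = [sp.Rational(1, 2) * (w[0] * w[0].T + w[1] * w[1].T) for w in support]
assert sp.expand(Echi - mixed_char(EB)) == 0
```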
Multivariate polynomials and real stability
Given positive semidefinite matrices $B_1, \ldots, B_n$, let us define the multivariate polynomial
$$q(z, t_1, \ldots, t_n) = \det\Big(zI + \sum_{i=1}^n t_i B_i\Big).$$
The mixed characteristic polynomial $\mu[B_1, \ldots, B_n]$ is obtained from $q$ by taking derivatives, followed by fixing $t_1 = \cdots = t_n = 0$. We are going to prove real-rootedness of $\mu[B_1, \ldots, B_n]$ by showing that the real-rootedness property is preserved by such operations.
Of course, it is not quite clear what we mean by this: $q$ is a multivariate polynomial, so its roots are points in $\mathbb{C}^{n+1}$. The appropriate extension of real-rootedness to multivariate polynomials is as follows.
Definition. A multivariate polynomial $q(z_1, \ldots, z_m)$ is called stable if it has no roots in the set $\{z \in \mathbb{C}^m : \mathrm{Im}\,z_i > 0 \text{ for all } i\}$, and is called real stable if in addition all its coefficients are real.
We are going to establish the following facts:
- The polynomial $q$ is real stable.
- Applying the operator $1 - \partial/\partial t_i$, or fixing a variable to a real value, preserves real stability.
- A univariate polynomial is real stable if and only if it is real-rooted.
As the (univariate) mixed characteristic polynomial $\mu[B_1, \ldots, B_n]$ is obtained from $q$ by taking derivatives and fixing parameters, this reasoning will show that $\mu[B_1, \ldots, B_n]$ is real-rooted.
The third assertion is the easiest, and we prove it first.
Lemma. A univariate polynomial is real stable if and only if it is real-rooted.
Proof. Clearly a real-rooted polynomial (with real coefficients) is real stable. Conversely, suppose a univariate polynomial $q$ is real stable. Then it has no roots $z$ with $\mathrm{Im}\,z > 0$. However, as $q$ has real coefficients, if $q(z) = 0$, then $q(\bar z) = \overline{q(z)} = 0$ also. Therefore, there are also no roots with $\mathrm{Im}\,z < 0$. It follows that $q$ must be real-rooted.
Next, we prove the preservation of real stability under taking derivatives and fixing parameters.
Lemma. Let $q(z_1, \ldots, z_m)$ be a real stable polynomial and let $c \in \mathbb{R}$. Then:
- $q(z_1, \ldots, z_{m-1}, c)$ is real stable (as a polynomial in $z_1, \ldots, z_{m-1}$).
- $\big(1 - \frac{\partial}{\partial z_m}\big)\,q(z_1, \ldots, z_m)$ is real stable.
Proof. We begin by proving the first assertion. Clearly $q(z_1, \ldots, z_{m-1}, c)$ has real coefficients, so it suffices to show that it is stable. Suppose to the contrary that $q(z_1, \ldots, z_{m-1}, c)$ has a root in the set $\{\mathrm{Im}\,z_i > 0 \text{ for all } i\}$. As the roots of a polynomial vary continuously with its coefficients, the polynomial $q(z_1, \ldots, z_{m-1}, c + i\varepsilon)$ must also have a root in this set for $\varepsilon > 0$ sufficiently small. But then $q$ itself has a root all of whose coordinates have strictly positive imaginary part, which contradicts the real stability of $q$.
Remark. A careful proof of the continuity of roots of complex polynomials can be given using Rouché’s theorem. This is left as an exercise to the interested reader.
We now turn to the second assertion. It is again obvious that $(1 - \partial/\partial z_m)\,q$ has real coefficients, so it remains to establish that it is stable. Assume to the contrary that
$$\Big(1 - \frac{\partial}{\partial z_m}\Big)\,q(u_1, \ldots, u_m) = 0$$
for some $(u_1, \ldots, u_m)$ in the set $\{\mathrm{Im}\,u_i > 0 \text{ for all } i\}$.
Define the univariate polynomial $r(z) = q(u_1, \ldots, u_{m-1}, z)$, and denote by $y_1, \ldots, y_k$ the roots of $r$. As $q$ is real stable, we must have $r(u_m) \ne 0$ and $\mathrm{Im}\,y_j \le 0$ for all $j$. Moreover, we can express $r$ as
$$r(z) = c \prod_{j=1}^k (z - y_j).$$
By assumption, we have $r(u_m) = r'(u_m)$, that is,
$$\frac{r'(u_m)}{r(u_m)} = 1$$
(the quotient is well defined as $r$ is nonzero in a neighborhood of $u_m$). But note that
$$\frac{r'(z)}{r(z)} = \sum_{j=1}^k \frac{1}{z - y_j}.$$
Taking the imaginary part of this expression at $z = u_m$ yields
$$0 = \mathrm{Im}\,\frac{r'(u_m)}{r(u_m)} = -\sum_{j=1}^k \frac{\mathrm{Im}(u_m - y_j)}{|u_m - y_j|^2}.$$
But $\mathrm{Im}\,u_m > 0$ and $\mathrm{Im}\,y_j \le 0$ for all $j$, so every term $\mathrm{Im}(u_m - y_j)/|u_m - y_j|^2$ is strictly positive (and the sum cannot be empty, as $r'/r \equiv 0$ would contradict $r'(u_m)/r(u_m) = 1$), so we have achieved a contradiction.
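The logarithmic-derivative identity and the sign argument above are easy to check numerically; a small sanity check, assuming numpy is available:

```python
import numpy as np

rng = np.random.default_rng(5)
roots = rng.standard_normal(5)     # real roots y_j, so Im y_j <= 0 holds
coeffs = np.poly(roots)            # monic polynomial r with these roots
z0 = 0.7 + 1.3j                    # a point with Im z0 > 0

# r'(z0)/r(z0) = sum_j 1/(z0 - y_j)
lhs = np.polyval(np.polyder(coeffs), z0) / np.polyval(coeffs, z0)
rhs = sum(1 / (z0 - y) for y in roots)
assert np.isclose(lhs, rhs)

# its imaginary part is strictly negative, which is the contradiction in the proof
assert lhs.imag < 0
```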
Finally, we prove that the polynomial $q$ is in fact real stable.
Lemma. Suppose that $B_1, \ldots, B_n$ are positive semidefinite matrices. Then the polynomial
$$q(z, t_1, \ldots, t_n) = \det\Big(zI + \sum_{i=1}^n t_i B_i\Big)$$
is real stable.
Proof. To show that the coefficients of $q$ are real, note that
$$\overline{q(z, t_1, \ldots, t_n)} = \det\Big(\bar zI + \sum_{i=1}^n \bar t_i\, \overline{B_i}\Big).$$
As the matrices $B_i$ are Hermitian, $\overline{B_i} = B_i^\top$, so invariance of the determinant under transposition implies that $\overline{q(z, t_1, \ldots, t_n)} = q(\bar z, \bar t_1, \ldots, \bar t_n)$; in particular, $q$ is real whenever the parameters $z, t_1, \ldots, t_n$ are all real. Thus the coefficients of $q$ must be real, and it remains to prove stability.
To this end, suppose that $q$ has a root $(z, t_1, \ldots, t_n)$ with $\mathrm{Im}\,z > 0$ and $\mathrm{Im}\,t_i > 0$ for all $i$. Then the matrix $zI + \sum_i t_i B_i$ is not invertible, so there exists a nonzero vector $x$ such that
$$x^*\Big(zI + \sum_{i=1}^n t_i B_i\Big)x = z\, x^*x + \sum_{i=1}^n t_i\, x^*B_ix = 0.$$
But $x^*x > 0$ and $x^*B_ix \ge 0$, so the imaginary part of the left-hand side, namely $\mathrm{Im}(z)\, x^*x + \sum_i \mathrm{Im}(t_i)\, x^*B_ix$, must be strictly positive by the assumption that the root is in the set $\{\mathrm{Im} > 0\}$. Thus we have achieved a contradiction.
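The lemma can be probed numerically by sampling points with all imaginary parts positive and checking that $q$ never vanishes there (a sanity check assuming numpy, not a proof):

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 4, 3
Bs = []
for _ in range(n):
    G = rng.standard_normal((d, d))
    Bs.append(G @ G.T)             # positive semidefinite matrices

def q(z, ts):
    """q(z, t_1, ..., t_n) = det(zI + sum_i t_i B_i)."""
    return np.linalg.det(z * np.eye(d) + sum(t * B for t, B in zip(ts, Bs)))

# sample points with Im z > 0 and Im t_i > 0: q should have no root there
for _ in range(100):
    z = complex(rng.standard_normal(), rng.uniform(0.1, 2))
    ts = [complex(rng.standard_normal(), rng.uniform(0.1, 2)) for _ in range(n)]
    assert abs(q(z, ts)) > 1e-9
```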
Putting together the previous three results, we have shown the following.
Corollary. The mixed characteristic polynomial $\mu[B_1, \ldots, B_n]$ is real-rooted whenever $B_1, \ldots, B_n$ are positive semidefinite matrices.
Putting everything together
We are now ready to put together the various ideas developed in this post.
Proposition. Let $v_1, \ldots, v_n$ be independent random vectors in $\mathbb{C}^d$ whose distribution has finite support, and let $M = \sum_{i=1}^n v_iv_i^*$. Then
$$\mathrm{maxroot}(\chi[M]) \le \mathrm{maxroot}(\mathbb{E}\chi[M])$$
with positive probability.
Proof. We will apply the interlacing property step by step. Denote by $w_1, \ldots, w_k$ the possible values that the random vector $v_1$ can take. First, note that for every collection of convex weights $\lambda_1, \ldots, \lambda_k$
$$\sum_{j=1}^k \lambda_j\, \mu[w_jw_j^*, \mathbb{E}v_2v_2^*, \ldots, \mathbb{E}v_nv_n^*] = \mu\Big[\sum_{j=1}^k \lambda_j\, w_jw_j^*,\; \mathbb{E}v_2v_2^*, \ldots, \mathbb{E}v_nv_n^*\Big]$$
by multilinearity. We have shown that the mixed characteristic polynomial on the right is real-rooted, as all its arguments are positive semidefinite matrices. Moreover, it is easily verified from their definition that the polynomials $\mu[w_jw_j^*, \mathbb{E}v_2v_2^*, \ldots, \mathbb{E}v_nv_n^*]$ are monic and have the same degree. Therefore, by the interlacing property applied with the weights $\lambda_j = \mathbb{P}[v_1 = w_j]$, there exists a vector $u_1$ in the support of $v_1$ such that
$$\mathrm{maxroot}\big(\mu[\mathbb{E}v_1v_1^*, \mathbb{E}v_2v_2^*, \ldots, \mathbb{E}v_nv_n^*]\big) \ge \mathrm{maxroot}\big(\mu[u_1u_1^*, \mathbb{E}v_2v_2^*, \ldots, \mathbb{E}v_nv_n^*]\big).$$
We now proceed in precisely the same manner with the polynomial on the right-hand side of this inequality to establish the existence of a vector $u_2$ in the support of $v_2$ (depending on $u_1$) such that
$$\mathrm{maxroot}\big(\mu[u_1u_1^*, \mathbb{E}v_2v_2^*, \ldots, \mathbb{E}v_nv_n^*]\big) \ge \mathrm{maxroot}\big(\mu[u_1u_1^*, u_2u_2^*, \mathbb{E}v_3v_3^*, \ldots, \mathbb{E}v_nv_n^*]\big).$$
By iterating this argument, we show that
$$\mathrm{maxroot}\big(\mu[\mathbb{E}v_1v_1^*, \ldots, \mathbb{E}v_nv_n^*]\big) \ge \mathrm{maxroot}\big(\mu[u_1u_1^*, \ldots, u_nu_n^*]\big).$$
But note that $\mu[u_1u_1^*, \ldots, u_nu_n^*] = \chi\big[\sum_{i=1}^n u_iu_i^*\big]$, which is the characteristic polynomial of a single realization of $M$; whereas $\mu[\mathbb{E}v_1v_1^*, \ldots, \mathbb{E}v_nv_n^*] = \mathbb{E}\chi[M]$ by multilinearity of the mixed characteristic polynomial and as the random vectors $v_1, \ldots, v_n$ are independent. We have therefore completed the proof.
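To close, the proposition itself can be illustrated numerically on a toy finite-support distribution (a sanity check assuming numpy; the support vectors are arbitrary):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(7)
d, n, k = 3, 3, 2
# each v_i is uniform on k possible values
support = [[rng.standard_normal(d) for _ in range(k)] for _ in range(n)]

def maxroot(coeffs):
    """Largest real part among the roots of the polynomial with these coefficients."""
    return max(r.real for r in np.roots(coeffs))

# characteristic polynomials of all k^n equally likely realizations of M
polys = []
for choice in product(range(k), repeat=n):
    M = sum(np.outer(support[i][j], support[i][j]) for i, j in enumerate(choice))
    polys.append(np.poly(M))       # monic coefficient vectors
Echi = np.mean(polys, axis=0)      # coefficients of E chi[M]

# the proposition: some realization has maximal root at most maxroot(E chi[M])
assert min(maxroot(p) for p in polys) <= maxroot(Echi) + 1e-7
```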
We have now completed the first step in our program. In the following lectures, we will concern ourselves with the second step: the control of $\mathrm{maxroot}(\mathbb{E}\chi[M])$.
Remark. It is only fair to note that, even though the result we are after is a statement about random matrices, essentially no probability appears in the proof: the proof only finds one single realization of the random matrix $M$ whose maximal root is controlled as desired, and the statement that this happens with positive probability is largely cosmetic (this is a finite probability space, so any realization occurs with positive probability). This highlights, in some sense, the limitations of the probabilistic method in the present context: as the event we are after has very small probability, the usual methods of probabilistic reasoning are not efficient at finding such an event.
Lectures by Philippe Rigollet | Notes by Junwei Lu and Ramon van Handel