Update (4/3/2014): I believe I have solved the conjecture and proven it to be correct. I will make a preprint available shortly (see link below). The original blog post remains available below. — Tom
Preprint available here: “An extremal inequality for long Markov chains”.
I have an extremal conjecture that I have been working on intermittently with some colleagues (including Jiantao Jiao, Tsachy Weissman, Chandra Nair, and Kartik Venkat). Despite our efforts, we have not been able to prove it. Hence, I thought I would experiment with online collaboration by offering it to the broader IT community.
In order to make things interesting, we are offering a $1000 prize for the first correct proof or counterexample! Feel free to post your thoughts in the public comments. You can also email me if you have questions or want to bounce some ideas off me.
Although I have no way of enforcing them, please abide by the following ground rules:
- If you decide to work on this conjecture, please send me an email to let me know that you are doing so. As part of this experiment with online collaboration, I want to gauge how many people become involved, and to what degree.
- If you solve the conjecture or make significant progress, please keep me informed.
- If you repost this conjecture, or publish any results, please cite this blog post appropriately.
One final disclaimer: this post is meant to be a brief introduction to the conjecture, with a few partial results to get the conversation started; it is not an exhaustive account of the approaches we have tried.
1. The Conjecture
Conjecture 1. Suppose $X, Y$ are jointly Gaussian, each with unit variance and correlation $\rho$. Then, for any $U, V$ satisfying the Markov chain $U \to X \to Y \to V$, the following inequality holds:

$$ 2^{-2I(Y;U)}\, 2^{-2I(X;V|U)} \geq (1-\rho^2) + \rho^2\, 2^{-2I(X;U)}\, 2^{-2I(Y;V|U)}. \ \ \ \ \ (1) $$
2. Partial Results
There are several partial results which suggest the validity of Conjecture 1. Moreover, numerical experiments have not produced a counterexample.
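To give a flavor of those experiments, here is a minimal sketch (not the code behind the experiments mentioned above; the test family and the helper names `cond_var` and `gap` are my own) that checks (1) on purely Gaussian test cases of the form $U = X + A$, $V = Y + B$ with independent Gaussian noises. For jointly Gaussian variables, $2^{-2I(P;Q|R)}$ is simply the ratio of conditional variances $\mathrm{Var}(P\mid Q,R)/\mathrm{Var}(P\mid R)$ (with mutual information in bits), so no entropy estimation is needed.

```python
import numpy as np

def cond_var(cov, i, given):
    """Conditional variance of variable i given the variables in `given`,
    computed from the joint covariance matrix via a Schur complement."""
    g = list(given)
    S = cov[np.ix_(g, g)]
    c = cov[np.ix_([i], g)]
    return cov[i, i] - (c @ np.linalg.solve(S, c.T)).item()

def gap(rho, a, b):
    """LHS minus RHS of (1) for the Gaussian test case U = X + A, V = Y + B,
    where A ~ N(0, a) and B ~ N(0, b) are independent of everything else
    (so U -> X -> Y -> V holds). Index order: X=0, Y=1, U=2, V=3."""
    cov = np.array([
        [1.0, rho, 1.0,     rho    ],
        [rho, 1.0, rho,     1.0    ],
        [1.0, rho, 1.0 + a, rho    ],
        [rho, 1.0, rho,     1.0 + b],
    ])
    # 2^{-2 I(Y;U)} = Var(Y|U), 2^{-2 I(X;V|U)} = Var(X|U,V)/Var(X|U), etc.
    lhs = cond_var(cov, 1, [2]) * cond_var(cov, 0, [2, 3]) / cond_var(cov, 0, [2])
    rhs = (1 - rho**2) \
        + rho**2 * cond_var(cov, 0, [2]) * cond_var(cov, 1, [2, 3]) / cond_var(cov, 1, [2])
    return lhs - rhs

rng = np.random.default_rng(0)
gaps = [gap(rng.uniform(-0.99, 0.99), *rng.uniform(0.01, 10.0, size=2))
        for _ in range(100_000)]
print(min(gaps))  # >= 0 up to numerical precision; on this family it is ~0
```

On this purely Gaussian family the two sides of (1) in fact coincide, so the sketch above is only a consistency check rather than a serious search for counterexamples; probing genuinely non-Gaussian $U, V$ requires discretization or other estimation techniques.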
Conjecture 1 extends the following well-known consequence of the conditional entropy power inequality to include long Markov chains.
Lemma 1 (Oohama, 1997). Suppose $X, Y$ are jointly Gaussian, each with unit variance and correlation $\rho$. Then, for any $U$ satisfying the Markov chain $U \to X \to Y$, the following inequality holds:

$$ 2^{-2I(Y;U)} \geq 1-\rho^2 + \rho^2\, 2^{-2I(X;U)}. \ \ \ \ \ (2) $$
Proof: Consider any $U$ satisfying $U \to X \to Y$. Let $X_u, Y_u$ denote the random variables $X, Y$ conditioned on $U = u$. By Markovity and the joint Gaussianity of $(X, Y)$, we have that $Y_u = \rho X_u + Z$, where $Z \sim \mathcal{N}(0, 1-\rho^2)$ is independent of $X_u$. Hence, the conditional entropy power inequality implies that

$$ 2^{2h(Y|U)} \geq \rho^2\, 2^{2h(X|U)} + 2\pi e(1-\rho^2) = 2\pi e\, \rho^2\, 2^{-2I(X;U)} + 2\pi e(1-\rho^2). \ \ \ \ \ (3) $$

From here, the lemma easily follows, since $2^{2h(Y|U)} = 2^{2h(Y)}\, 2^{-2I(Y;U)} = 2\pi e\, 2^{-2I(Y;U)}$ for unit-variance Gaussian $Y$. $\Box$
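A side remark on tightness: if $U$ is jointly Gaussian with $(X, Y)$ and $U \to X \to Y$, then (2) holds with equality. Indeed, for jointly Gaussian variables $2^{-2I(X;U)} = \mathrm{Var}(X \mid U)$ (unit variance, mutual information in bits), and writing $Y = \rho X + Z$ with $Z$ independent of $(X, U)$ gives

$$ 2^{-2I(Y;U)} = \mathrm{Var}(Y \mid U) = \rho^2\,\mathrm{Var}(X \mid U) + (1-\rho^2) = (1-\rho^2) + \rho^2\, 2^{-2I(X;U)}. $$

This is consistent with the dual form (Conjecture 1′ below), which asserts that jointly Gaussian auxiliaries are extremal.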
Lemma 1 can be applied to prove the following special case of Conjecture 1. This result subsumes most of the special cases I know how to analyze analytically.
Proposition 1. Suppose $X, Y$ are jointly Gaussian, each with unit variance and correlation $\rho$. If $X, Y, U$ are jointly Gaussian and $U \to X \to Y \to V$, then

$$ 2^{-2I(Y;U)}\, 2^{-2I(X;V|U)} \geq (1-\rho^2) + \rho^2\, 2^{-2I(X;U)}\, 2^{-2I(Y;V|U)}. \ \ \ \ \ (4) $$
Proof: Without loss of generality, we can assume that $U$ has zero mean and unit variance. Define $\rho_u = E[XU]$. Since $X, Y, U$ are jointly Gaussian, we have

$$ I(X;U) = \frac{1}{2}\log\frac{1}{1-\rho_u^2} \ \ \ \ \ (5) $$

$$ I(Y;U) = \frac{1}{2}\log\frac{1}{1-\rho^2\rho_u^2}. \ \ \ \ \ (6) $$
Let $X_u, Y_u, V_u$ denote the random variables $X, Y, V$ conditioned on $U = u$, respectively. Define $\rho_{XY|u}$ to be the correlation coefficient between $X_u$ and $Y_u$. It is readily verified that

$$ \rho_{XY|u} = \frac{\rho\sqrt{1-\rho_u^2}}{\sqrt{1-\rho^2\rho_u^2}}, \ \ \ \ \ (7) $$

which does not depend on the particular value of $u$. By plugging (5)–(7) into (4), we see that (4) is equivalent to

$$ 2^{-2I(X;V|U)} \geq (1-\rho_{XY|u}^2) + \rho_{XY|u}^2\, 2^{-2I(Y;V|U)}. \ \ \ \ \ (8) $$
For every $u$, the variables $X_u, Y_u$ are jointly Gaussian with correlation coefficient $\rho_{XY|u}$, and $X_u \to Y_u \to V_u$ form a Markov chain; hence Lemma 1 implies

$$ 2^{-2I(X_u;V_u)} \geq (1-\rho_{XY|u}^2) + \rho_{XY|u}^2\, 2^{-2I(Y_u;V_u)}. \ \ \ \ \ (9) $$

The desired inequality (8) follows by convexity of

$$ \log\left[(1-\rho_{XY|u}^2) + \rho_{XY|u}^2\, 2^{-2z}\right] \ \ \ \ \ (10) $$

as a function of $z$. $\Box$
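The "readily verified" computation (7) used in the proof above can be spot-checked numerically. Here is a minimal sketch (the helper name `cond_corr_xy_given_u` is mine) that computes the conditional correlation of $(X, Y)$ given $U$ directly from the joint covariance matrix and compares it with formula (7).

```python
import numpy as np

def cond_corr_xy_given_u(rho, rho_u):
    """Correlation of (X, Y) conditioned on U, for jointly Gaussian (X, Y, U)
    with unit variances, E[XY] = rho, E[XU] = rho_u; the Markov chain
    U -> X -> Y forces E[YU] = rho * rho_u. Uses a Schur complement."""
    cov = np.array([
        [1.0,    rho,          rho_u       ],
        [rho,    1.0,          rho * rho_u ],
        [rho_u,  rho * rho_u,  1.0         ],
    ])
    # Conditional covariance of (X, Y) given U.
    S = cov[:2, :2] - np.outer(cov[:2, 2], cov[2, :2]) / cov[2, 2]
    return S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])

rho, rho_u = 0.7, 0.4
print(cond_corr_xy_given_u(rho, rho_u))                               # numerical value
print(rho * np.sqrt(1 - rho_u**2) / np.sqrt(1 - rho**2 * rho_u**2))   # formula (7)
```

Both quantities agree to numerical precision, and the same check can be repeated for any $(\rho, \rho_u)$.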
3. Equivalent Forms
There are many equivalent forms of Conjecture 1. For example, (1) can be replaced by the symmetric inequality
$$ 2^{-2(I(X;V)+I(Y;U))} \geq (1-\rho^2)\, 2^{-2I(U;V)} + \rho^2\, 2^{-2(I(X;U)+I(Y;V))}. \ \ \ \ \ (11) $$
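As with (1), the symmetric form (11) can be spot-checked on purely Gaussian examples. A minimal sketch, again using the additive-noise test family $U = X + A$, $V = Y + B$ (my own choice, not part of the conjecture) and the fact that $2^{-2I(P;Q)} = 1 - \mathrm{corr}(P,Q)^2$ for jointly Gaussian scalars:

```python
import numpy as np

def gap11(rho, a, b):
    """LHS minus RHS of (11) for U = X + A, V = Y + B with independent
    Gaussian noises A ~ N(0, a), B ~ N(0, b). Every mutual information in
    (11) is between a pair of jointly Gaussian scalars, so each term
    2^{-2 I(P;Q)} equals 1 - corr(P, Q)^2."""
    cov = np.array([
        [1.0, rho, 1.0,     rho    ],   # X
        [rho, 1.0, rho,     1.0    ],   # Y
        [1.0, rho, 1.0 + a, rho    ],   # U
        [rho, 1.0, rho,     1.0 + b],   # V
    ])
    t = lambda i, j: 1.0 - cov[i, j]**2 / (cov[i, i] * cov[j, j])  # 2^{-2 I(i;j)}
    X, Y, U, V = 0, 1, 2, 3
    return t(X, V) * t(Y, U) - ((1 - rho**2) * t(U, V) + rho**2 * t(X, U) * t(Y, V))

rng = np.random.default_rng(1)
print(min(gap11(rng.uniform(-0.99, 0.99), *rng.uniform(0.01, 10.0, size=2))
          for _ in range(100_000)))  # ~0: the two sides coincide on this family
```

As for (1), the two sides of (11) coincide on this Gaussian family; the equivalence of the two forms in general is an analytic claim, not something a numerical check can establish.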
Alternatively, we can consider dual forms of Conjecture 1. For instance, one such form is stated as follows:
Conjecture 1′. Suppose $X, Y$ are jointly Gaussian, each with unit variance and correlation $\rho$. For a given $\lambda$, the infimum of

$$ I(X,Y;U,V) - \lambda\Big(I(X;UV) + I(Y;UV)\Big), \ \ \ \ \ (12) $$

taken over all $U, V$ satisfying $U \to X \to Y \to V$, is attained when $X, Y, U, V$ are jointly Gaussian.
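To make the dual form concrete, here is a minimal sketch (the function name and the restriction to scalar additive-noise auxiliaries $U = X + A$, $V = Y + B$ are my own choices) that evaluates the objective (12) for a jointly Gaussian candidate. Minimizing the returned value over the noise variances $(a, b)$ gives an upper bound on the infimum in Conjecture 1′, which the conjecture asserts is actually attained by such Gaussian choices.

```python
import numpy as np

def objective_gaussian(rho, a, b, lam):
    """Value of (12), in bits, for the jointly Gaussian choice U = X + A,
    V = Y + B with A ~ N(0, a), B ~ N(0, b) independent of (X, Y).
    Mutual informations are computed from log-determinants of the joint
    covariance matrix (index order X=0, Y=1, U=2, V=3)."""
    cov = np.array([
        [1.0, rho, 1.0,     rho    ],
        [rho, 1.0, rho,     1.0    ],
        [1.0, rho, 1.0 + a, rho    ],
        [rho, 1.0, rho,     1.0 + b],
    ])
    logdet = lambda idx: np.linalg.slogdet(cov[np.ix_(idx, idx)])[1] / np.log(2)
    I_xy_uv = 0.5 * (logdet([0, 1]) + logdet([2, 3]) - logdet([0, 1, 2, 3]))
    I_x_uv  = 0.5 * (logdet([0])    + logdet([2, 3]) - logdet([0, 2, 3]))
    I_y_uv  = 0.5 * (logdet([1])    + logdet([2, 3]) - logdet([1, 2, 3]))
    return I_xy_uv - lam * (I_x_uv + I_y_uv)

print(objective_gaussian(rho=0.7, a=1.0, b=2.0, lam=0.8))
```

Comparing this Gaussian benchmark against discretized non-Gaussian choices of $(U, V)$ is one way to probe Conjecture 1′ numerically.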