# An Inequality Involving a Product of Mutual Informations

Many inequalities in information theory have entropies or mutual informations appearing as linear additive terms (e.g. Shannon-type inequalities). It’s also not uncommon to see entropies appearing as exponents (e.g. entropy power inequality). But perhaps it’s not immediately seen how products of mutual informations make sense.

Recently I have encountered an inequality involving a product of mutual informations, which I cannot find a good way to prove (or disprove, though some numerics and asymptotic analysis seem to suggest its validity). I would much appreciate it if someone could be smart and gracious enough to provide a proof, or counter example, or generalization.

The formulation is quite simple: suppose $X,Y$ are binary random variables with  $\displaystyle P(X=0)=P(Y=0)=1/2$, and $Z$ is another random variable such that $X-Y-Z$ forms a Markov chain. The claim is that $I(X;Z)\ge I(X;Y)I(Y;Z)$. (Note that the left side is also upper bounded by either one of the two factors on the right side, by the well-known data processing  inequality.

At first glance this inequality seems absurd because different units appear on the two sides; but this may be resolved by considering that $Y$ has only one bit of information.

For a given joint distribution of $X$ and $Y$, the equality can be achieved when $Z$ is an injective function of $Y$.

———————————-

Update: the inequality has been proved, thanks very much for the help by Sudeep Kamath (see comments). An alternative proof (which avoids computing the monotonicity of  $f(r) = D(r*\epsilon||1/2)/D(r||1/2) = (1-h(r*\epsilon))/(1-h(r)))$ is sketched as follows: by convexity of the function $h(\epsilon*h^{-1}(x))$ (see for example, proof of Mrs Gerbers Lemma), we have $H(X|Z=z)=h(\epsilon*h^{-1}(H(Y|Z=z)))\le H(Y|Z=z)+(1-H(Y|Z=z))h(\epsilon)$ for any $z$, where $\epsilon:=Prob(X\neq Y)$. Averaging over $z$ under $P_Z$ gives $H(X|Z)\le H(Y|Z)+(1-H(Y|Z))h(\epsilon)$, and the claim follows after rearrangements.