Lecture 8. Entropic cone and matroids
This lecture introduces the notion of the entropic cone and its connection with entropy inequalities.
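Recall that if $X$ is a discrete random variable with distribution $P_X$, the entropy of $X$ is defined as
$$H(X) = -\sum_{x \in \mathrm{supp}(P_X)} P_X(x) \log P_X(x).$$
Now let $(X_1,\ldots,X_m) \sim \mu$ be (not necessarily independent) discrete random variables. It will be convenient to define the entropy function
$$H(S) := H(X_S), \qquad X_S := (X_i)_{i \in S}, \qquad S \subseteq [m] := \{1,\ldots,m\}.$$
As the entropy depends only on the probabilities of the outcomes of a random variable and not on its values, we will assume without loss of generality in the sequel that $X_1,\ldots,X_m$ take values in $\mathbb{Z}_+$.
For any probability $\mu$ on $\mathbb{Z}_+^m$, let $\vec H(\mu)$ be the $(2^m-1)$-dimensional vector
$$\vec H(\mu) := (H(S))_{\varnothing \ne S \subseteq [m]}.$$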
We can now define the entropic cone.
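Definition. The entropic cone is the set
$$\Gamma_m^* := \{\vec H(\mu) : \mu \text{ is a probability on } \mathbb{Z}_+^m\} \subset \mathbb{R}^{2^m-1},$$
i.e., the set of all vectors that can be obtained as the entropy function of $m$ variables.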
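The map from a joint distribution to its entropy vector can be sketched in a few lines. This is only an illustration under stated assumptions: entropies are measured in bits (the text leaves the base of the logarithm open), variables are indexed from 0, and the helper names `entropy`, `marginal` and `entropy_vector` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(mu, S):
    """Marginal of the joint pmf mu on the coordinates listed in S."""
    out = defaultdict(float)
    for x, p in mu.items():
        out[tuple(x[i] for i in S)] += p
    return dict(out)

def entropy_vector(mu, m):
    """Map each nonempty subset S (tuple of 0-based indices) to H(X_S)."""
    subsets = [S for k in range(1, m + 1) for S in combinations(range(m), k)]
    return {S: entropy(marginal(mu, S)) for S in subsets}

# Example: two independent fair bits.
mu = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
vec = entropy_vector(mu, 2)
```

For two variables the result has $2^2 - 1 = 3$ entries, one per nonempty subset of the index set.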
Question. Can we characterize $\Gamma_m^*$?
- If $m=1$, then $\Gamma_1^* = \mathbb{R}_+$, as there exist random variables of arbitrary (nonnegative) entropy.
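- If $m=2$, the vector $\vec H(\mu)$ for $(X_1,X_2) \sim \mu$ is
$$\vec H(\mu) = (H(X_1),\, H(X_2),\, H(X_1,X_2)).$$
What constraints must this vector satisfy? We must certainly have
$$\max\{H(X_1), H(X_2)\} \le H(X_1,X_2) \le H(X_1) + H(X_2),$$
where the first inequality follows from the chain rule together with nonnegativity of conditional entropy, and the second from the fact that conditioning reduces entropy. Are these the only constraints? In fact, we can realize many vectors that satisfy these constraints. For example: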
- The vector $(\alpha,\alpha,\alpha)$ is obtained by taking $Z$ such that $H(Z)=\alpha$, and letting $X_1 = X_2 = Z$.
- The vector $(\alpha,\alpha,2\alpha)$ is obtained by taking $X_1, X_2$ to be i.i.d. copies of $Z$.
- Convex combinations of these vectors can be obtained by taking mixtures of their distributions.
In fact, a careful analysis shows that the above constraints completely characterize the entropic cone for $m=2$: that is,
$$\Gamma_2^* = \{(x,y,z) \in \mathbb{R}_+^3 : \max\{x,y\} \le z \le x+y\}.$$
- If $m=3$, what happens? Here we are still subject to the same constraints as in the case $m=2$, but we pick up some new constraints as well, such as
$$H(X_1,X_2) + H(X_2,X_3) \ge H(X_1,X_2,X_3) + H(X_2),$$
which is equivalent to the inequality $I(X_1;X_3 \mid X_2) \ge 0$.
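The two example vectors for two variables, and the three-variable constraint $H(X_1,X_2) + H(X_2,X_3) - H(X_1,X_2,X_3) - H(X_2) = I(X_1;X_3 \mid X_2) \ge 0$, can be checked numerically. The sketch below assumes entropies in bits, 0-based indices, and illustrative choices that are not in the text: $Z$ uniform on four values (so $\alpha = 2$), and a three-variable example with $X_2 = X_1 \oplus X_3$.

```python
from collections import defaultdict
from math import log2

def H(mu, S):
    """Entropy in bits of the coordinates S under the joint pmf mu."""
    marg = defaultdict(float)
    for x, p in mu.items():
        marg[tuple(x[i] for i in S)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def vec2(mu):
    """Entropy vector (H(X1), H(X2), H(X1, X2)) of a two-variable pmf."""
    return (H(mu, (0,)), H(mu, (1,)), H(mu, (0, 1)))

def satisfies_m2(v, tol=1e-9):
    """Check max(x, y) <= z <= x + y up to a small tolerance."""
    x, y, z = v
    return max(x, y) <= z + tol and z <= x + y + tol

# Z uniform on four values, so alpha = H(Z) = 2 bits (illustrative choice).
z_vals = range(4)
copied = {(z, z): 0.25 for z in z_vals}                 # X1 = X2 = Z
iid = {(a, b): 1 / 16 for a in z_vals for b in z_vals}  # i.i.d. copies of Z

v_copied = vec2(copied)  # should be (alpha, alpha, alpha)
v_iid = vec2(iid)        # should be (alpha, alpha, 2 * alpha)

# Three variables: X1, X3 independent fair bits and X2 = X1 xor X3.
xor3 = {(a, a ^ c, c): 0.25 for a in (0, 1) for c in (0, 1)}
cond_mi = (H(xor3, (0, 1)) + H(xor3, (1, 2))
           - H(xor3, (0, 1, 2)) - H(xor3, (1,)))
```

In the xor example the conditional mutual information is strictly positive even though $X_1$ and $X_3$ are independent, so the three-variable constraint is not implied by pairwise considerations alone.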
Evidently, the constraints on the entropic cone correspond to entropy inequalities. What type of inequalities must hold for any given $m$? Let us think about this question a bit more systematically.
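Definition. A set function $f : 2^{[m]} \to \mathbb{R}$ is called a polymatroid if
- $f(\varnothing) = 0$, $f(S) \ge 0$ for all $S \subseteq [m]$.
- $f(S) \le f(T)$ when $S \subseteq T$ (monotonicity).
- $f(S \cup T) + f(S \cap T) \le f(S) + f(T)$ for all $S, T \subseteq [m]$ (submodularity).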
It is not difficult to check that
Lemma. The entropy function $H$ is a polymatroid.
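The lemma can be spot-checked by brute force on a small example. The sketch below tests the standard polymatroid axioms (value zero on the empty set, nonnegativity, monotonicity, submodularity) for the entropy function of a randomly generated joint distribution. The alphabet sizes, the random seed and the function name `is_polymatroid` are assumptions made for illustration; a passing check on one distribution is of course not a proof.

```python
from collections import defaultdict
from itertools import combinations, product
from math import log2
import random

def H(mu, S):
    """Entropy in bits of the coordinates in S under the joint pmf mu."""
    marg = defaultdict(float)
    for x, p in mu.items():
        marg[tuple(x[i] for i in sorted(S))] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def is_polymatroid(f, m, tol=1e-9):
    """Check f(empty) = 0, nonnegativity, monotonicity and submodularity."""
    sets = [frozenset(c) for k in range(m + 1)
            for c in combinations(range(m), k)]
    if abs(f(frozenset())) > tol:
        return False
    for S in sets:
        if f(S) < -tol:
            return False
        for T in sets:
            if S <= T and f(S) > f(T) + tol:
                return False
            if f(S | T) + f(S & T) > f(S) + f(T) + tol:
                return False
    return True

# Random joint pmf of three variables on alphabets of sizes 2, 3, 2.
rng = random.Random(0)
outcomes = list(product(range(2), range(3), range(2)))
weights = [rng.random() for _ in outcomes]
total = sum(weights)
mu = {x: w / total for x, w in zip(outcomes, weights)}

entropy_is_polymatroid = is_polymatroid(lambda S: H(mu, S), 3)
# A supermodular function such as |S|^2 fails the submodularity test.
square_is_polymatroid = is_polymatroid(lambda S: len(S) ** 2, 3)
```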
Other examples of polymatroids: (1) entropy, mutual information; (2) max-weight $f(S) = \max_{i \in S} w_i$ for given weights $w_1,\ldots,w_m \ge 0$; (3) flows; (4) cuts (not monotone, but submodular); (5) von Neumann entropy.
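Let us define
$$\Gamma_m := \{(f(S))_{\varnothing \ne S \subseteq [m]} : f \text{ is a polymatroid on } [m]\} \subset \mathbb{R}^{2^m-1}.$$
Evidently $\Gamma_m$ is a polyhedral cone (the intersection of a finite number of half-spaces).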
Theorem.
- $\Gamma_m^* \ne \Gamma_m$ for $m \ge 3$.
- The closure $\overline{\Gamma_m^*}$ is a convex cone for all $m$ (but not polyhedral for $m \ge 4$).
For the proof, we refer to the book of R. Yeung.
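As an example of $\Gamma_3^* \ne \Gamma_3$, we note that the vector $(\alpha,\alpha,\alpha,2\alpha,2\alpha,2\alpha,2\alpha)$ (listing the singletons, then the pairs, then the full set) is only achievable when $\alpha = \log M$ for some positive integer $M$. Let us also note that convexity of $\Gamma_m^*$ holds in its interior but not on the boundary. However, issues only arise on the boundary of $\Gamma_3$, that is, no new entropy inequalities appear beyond the polymatroid inequalities when $m = 3$.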
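On the other hand, when $m \ge 4$, many new inequalities are introduced that actually cause holes to appear within $\Gamma_m$. One such inequality (expressed in terms of mutual information), due to Zhang and Yeung, is
$$2 I(X_3;X_4) \le I(X_1;X_2) + I(X_1;X_3,X_4) + 3 I(X_3;X_4 \mid X_1) + I(X_3;X_4 \mid X_2).$$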
Using the entropic cone
The entropic cone can be used to obtain new information-theoretic inequalities. For example, the following question arose in the study of mechanism design in computer science.
Problem. Let $X_1,\ldots,X_m$ be discrete random variables. Define the $m \times m$ matrix $A$ as $A_{ij} := I(X_i;X_j) = H(X_i) + H(X_j) - H(X_i,X_j)$. Is $A$ positive definite?
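When $m=2$, we have $\det(A) = H(X_1)H(X_2) - I(X_1;X_2)^2 \ge 0$ since $0 \le I(X_1;X_2) \le \min\{H(X_1), H(X_2)\}$. Thus all the principal minors of $A$ are nonnegative, so $A$ is positive semidefinite, which is the sense in which positive definiteness is meant in this problem.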
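The two-variable case can be spot-checked numerically. The sketch below builds the matrix with entries $H(X_i) + H(X_j) - H(X_i,X_j)$ (so the diagonal is $H(X_i)$) for randomly generated joint distributions and records the smallest determinant seen. The alphabet sizes, seed, number of trials and the helper `random_joint` are illustrative assumptions.

```python
from collections import defaultdict
from itertools import product
from math import log2
import random

def H(mu, S):
    """Entropy in bits of the coordinates in S under the joint pmf mu."""
    marg = defaultdict(float)
    for x, p in mu.items():
        marg[tuple(x[i] for i in sorted(S))] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

def mi_matrix(mu, m):
    """A[i][j] = H(X_i) + H(X_j) - H(X_i, X_j); the diagonal equals H(X_i)."""
    return [[H(mu, {i}) + H(mu, {j}) - H(mu, {i, j}) for j in range(m)]
            for i in range(m)]

def random_joint(sizes, rng):
    """Random joint pmf on a product of finite alphabets (hypothetical helper)."""
    outcomes = list(product(*(range(s) for s in sizes)))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {x: w / total for x, w in zip(outcomes, weights)}

rng = random.Random(0)
dets = []
for _ in range(200):
    A = mi_matrix(random_joint((3, 3), rng), 2)
    dets.append(A[0][0] * A[1][1] - A[0][1] * A[1][0])

min_det = min(dets)  # should be nonnegative up to rounding
```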
How about $m \ge 3$? Note that $A$ depends linearly on the entries of the vector $\vec H(\mu)$ (where $\mu$ is the joint distribution of $X_1,\ldots,X_m$). Thus if $A$ is positive definite for distributions on the extreme rays (ER) of the entropic cone, then $A$ must be positive definite for any distribution. More generally:
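Proposition. If $F : \mathbb{R}_+^{2^m-1} \to \mathbb{R}$ is convex, then $F(\vec H(\mu)) \le 0$ holds for all $\mu$ if and only if $F(h) \le 0$ for all $h \in \mathrm{ER}(\Gamma_m^*)$.
Proof. It suffices to note that
$$F(\lambda h_1 + (1-\lambda) h_2) \le \lambda F(h_1) + (1-\lambda) F(h_2) \le 0$$
for every $\lambda \in [0,1]$ and all $h_1, h_2$ with $F(h_1), F(h_2) \le 0$, and to use that every point of the cone is a convex combination of points on its extreme rays.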
This necessary and sufficient condition for $\Gamma_m^*$ generalizes to a sufficient condition for $\Gamma_m$.
Proposition. If $F$ is convex, then $F(\vec H(\mu)) \le 0$ holds for all $\mu$ if $F(h) \le 0$ for all $h \in \mathrm{ER}(\Gamma_m)$.
As $\Gamma_m$ is polyhedral, this significantly simplifies problems such as checking positive definiteness of $A$: it suffices to check a finite number of cases, which can essentially be done by hand. It can be shown this way that $A$ is always positive definite for $m = 3$, but this can fail for $m \ge 4$.
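The extreme-ray argument can be illustrated in the smallest nontrivial case. The sketch below uses two variables rather than three to keep the enumeration short. It assumes the polymatroid cone for two variables is cut out by $\max\{h_1,h_2\} \le h_{12} \le h_1 + h_2$, whose extreme rays (worked out by hand for this sketch) are spanned by $(1,0,1)$, $(0,1,1)$ and $(1,1,1)$. Since the matrix is linear in the entropy vector and positive semidefinite matrices form a convex cone, checking the three rays is enough.

```python
import random

# Extreme rays of the two-variable polymatroid cone, coordinates (h1, h2, h12).
# Derived by hand for this sketch from max(h1, h2) <= h12 <= h1 + h2.
EXTREME_RAYS = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0)]

def mi_matrix_from_vector(h):
    """2x2 matrix with entries h_i + h_j - h_ij; it is linear in h."""
    h1, h2, h12 = h
    off = h1 + h2 - h12
    return [[h1, off], [off, h2]]

def is_psd_2x2(A, tol=1e-9):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return A[0][0] >= -tol and A[1][1] >= -tol and det >= -tol

# Finite check on the extreme rays.
rays_ok = all(is_psd_2x2(mi_matrix_from_vector(r)) for r in EXTREME_RAYS)

# Spot check: nonnegative combinations of the rays remain PSD.
rng = random.Random(0)
combos_ok = True
for _ in range(1000):
    c = [rng.uniform(0, 5) for _ in EXTREME_RAYS]
    h = tuple(sum(ci * r[k] for ci, r in zip(c, EXTREME_RAYS)) for k in range(3))
    combos_ok = combos_ok and is_psd_2x2(mi_matrix_from_vector(h))
```

The same pattern applies to three variables, where the list of extreme rays of the polymatroid cone is longer and a 3x3 semidefiniteness test is needed.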
It is sometimes of interest to investigate discrete variants of the above questions.
Definition. A matroid $f : 2^{[m]} \to \mathbb{Z}_+$ is defined by the conditions
- $f(S) \le |S|$ for all $S \subseteq [m]$;
- $f$ is monotone;
- $f$ is submodular.
- Vector matroids (also called $\mathbb{F}$-representable matroids, where $\mathbb{F}$ is a field). Given a matrix $A$ with values in $\mathbb{F}$, define $f(S) = \mathrm{rank}(A(S))$, where $A(S)$ denotes the columns induced by $S$.
- Graphical matroids. Let $G = (V,E)$ be a graph, and choose $m = |E|$. Then define $f(S) = \max |F|$, where the maximum is taken over acyclic subgraphs $F \subseteq S$.
- Entropic matroids. Let $f(S) = H(X_S)$. For what distributions of $X_1,\ldots,X_m$ is this a matroid?
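The first two examples can be computed and compared on a small graph. The sketch below assumes the binary field, encodes each column of the vertex-edge incidence matrix as an integer bitmask, computes the graphical rank with a union-find forest, and checks the three matroid conditions by brute force. The graph and all function names are illustrative choices, not taken from the text.

```python
from itertools import combinations

EDGES = [(0, 1), (1, 2), (0, 2), (2, 3)]  # illustrative graph on 4 vertices
N_VERTICES = 4

def gf2_rank(columns):
    """Rank over GF(2) of columns given as integer bitmasks."""
    pivots = {}  # leading bit position -> basis vector with that leading bit
    for v in columns:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]  # eliminate the leading bit and continue
    return len(pivots)

def graphic_rank(edges, n_vertices):
    """Size of a largest acyclic edge subset, via a union-find forest."""
    parent = list(range(n_vertices))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    rank = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so no cycle is created
            parent[ru] = rv
            rank += 1
    return rank

def vector_rank(S):
    """Rank over GF(2) of the incidence-matrix columns indexed by S."""
    return gf2_rank([(1 << EDGES[e][0]) | (1 << EDGES[e][1]) for e in S])

def graph_rank(S):
    """Graphical matroid rank of the edge subset indexed by S."""
    return graphic_rank([EDGES[e] for e in S], N_VERTICES)

def is_matroid_rank(f, m):
    """Check 0 <= f(S) <= |S|, monotonicity and submodularity by brute force."""
    sets = [frozenset(c) for k in range(m + 1) for c in combinations(range(m), k)]
    for S in sets:
        if not 0 <= f(S) <= len(S):
            return False
        for T in sets:
            if S <= T and f(S) > f(T):
                return False
            if f(S | T) + f(S & T) > f(S) + f(T):
                return False
    return True

m = len(EDGES)
all_sets = [frozenset(c) for k in range(m + 1) for c in combinations(range(m), k)]
ranks_agree = all(vector_rank(S) == graph_rank(S) for S in all_sets)
graph_is_matroid = is_matroid_rank(graph_rank, m)
```

On this example the graphical rank agrees with the rank of the incidence columns over the binary field for every edge subset, so the graphical matroid is also a vector matroid.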
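Denote by $\Gamma_m^{*,q}$ the entropic cone induced by those distributions where $X_1,\ldots,X_m$ take values in $\{0,\ldots,q-1\}$, with logarithms taken to base $q$ so that $H(X_i) \le 1$. In order for such a distribution to define an entropic matroid, the vector $\vec H(\mu)$ must take values in $\mathbb{Z}_+^{2^m-1}$. Thus we are led to consider the set
$$\Pi_m^q := \Gamma_m^{*,q} \cap \mathbb{Z}_+^{2^m-1}.$$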
Can one characterize what type of matroids can arise as entropic matroids?
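Theorem. For binary variables, with entropy measured in bits, the set of entropic matroids with $m$ elements coincides with the set of all $\mathbb{F}_2$-representable matroids with $m$ elements.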
For the proof and further results around this theme, see the paper of E. Abbe.
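One direction of the link between representable and entropic matroids is easy to illustrate: if $u$ is uniform on binary vectors and $X_j$ is the inner product of $u$ with the $j$-th column of a binary matrix (mod 2), then the entropy of $X_S$ in bits equals the rank over the binary field of the columns in $S$. The sketch below checks this on one small matrix. The matrix, the bitmask encoding and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import log2

# Columns of a binary matrix with K rows, encoded as bitmasks (bit r = row r).
# Illustrative choice: the vertex-edge incidence matrix of a triangle.
COLUMNS = [0b011, 0b110, 0b101]
K = 3

def gf2_rank(columns):
    """Rank over GF(2) of columns given as integer bitmasks."""
    pivots = {}  # leading bit position -> basis vector with that leading bit
    for v in columns:
        while v:
            lead = v.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = v
                break
            v ^= pivots[lead]
    return len(pivots)

def joint_pmf(columns, k):
    """pmf of X_j = <u, column_j> mod 2 for u uniform on {0,1}^k."""
    pmf = defaultdict(float)
    for u in range(2 ** k):
        x = tuple(bin(u & col).count("1") % 2 for col in columns)
        pmf[x] += 1 / 2 ** k
    return dict(pmf)

def H(mu, S):
    """Entropy in bits of the coordinates S under the joint pmf mu."""
    marg = defaultdict(float)
    for x, p in mu.items():
        marg[tuple(x[i] for i in S)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)

mu = joint_pmf(COLUMNS, K)
m = len(COLUMNS)
# For every nonempty S, store the pair (entropy of X_S, rank of columns in S).
table = {S: (H(mu, S), gf2_rank([COLUMNS[i] for i in S]))
         for size in range(1, m + 1)
         for S in combinations(range(m), size)}
entropy_equals_rank = all(abs(h - r) < 1e-9 for h, r in table.values())
```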
Lecture by Emmanuel Abbe | Scribed by Danny Gitelman