Welcome! (with random graphs)

Welcome to the ‘I’m a bandit’ blog! If you are curious about this title, you can check my newly published book. However, despite its name, this blog will not be about bandits, but rather about more general problems in optimization, probability and statistics. To kick off the blog I would like to recall one of the most famous results in the theory of random graphs: the largest clique in an Erdös-Rényi random graph is of logarithmic size. This result will form the basis for my second blog post, which should appear next week, where I will discuss more recent results on the clique number of random geometric graphs in high dimension.

In this post (and the next one) we consider graphs {G=([n], E)} with vertices indexed by {[n]}. A random graph is defined by the random variable {E} (the random set of edges). We shall be interested in the clique number {\omega(G)} of a graph {G}, which is the size of the largest clique in {G}, that is

\displaystyle \omega(G) = \max \{|S| : S \subset [n] \; \text{and} \; \forall (i,j) \in S^2, i \neq j, (i,j) \in E \} .

We will consider sequences of random graphs {G_n = ([n], E_n)}, and we will study the behavior of the random variables {\omega(G_n)} as {n} tends to infinity.

The simplest model of random graphs was put forth by Erdös and Rényi in the late 1950s. The random variable {E_n} in an Erdös-Rényi random graph of size {n} is defined by the following two properties (a small simulation sketch is given below the list):

  • (a) For any pair {(i,j) \in [n]^2, i \neq j}, {\mathop{\mathbb P}((i,j) \in E_n) = 1/2}.
  • (b) The events {\{(i,j) \in E_n\}} are mutually independent.
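For concreteness, here is a minimal Python sketch (the function names are mine, not from any particular library) that samples an edge set according to (a)-(b) and computes the clique number by brute force; it is only meant to illustrate the definitions for small {n}.

```python
import itertools
import math
import random

def erdos_renyi_edges(n, p=0.5):
    """Sample the edge set E_n: each pair {i, j}, i != j, is an edge
    independently with probability p (here p = 1/2), as in (a)-(b)."""
    return {frozenset(pair) for pair in itertools.combinations(range(n), 2)
            if random.random() < p}

def clique_number(n, edges):
    """Brute-force clique number: the largest k such that some k-subset S
    of [n] has all of its pairs in the edge set (exponential time!)."""
    for k in range(n, 1, -1):
        for S in itertools.combinations(range(n), k):
            if all(frozenset(pair) in edges
                   for pair in itertools.combinations(S, 2)):
                return k
    return 1  # a single vertex is always a clique

if __name__ == "__main__":
    for n in (8, 12, 16):
        edges = erdos_renyi_edges(n)
        print(n, clique_number(n, edges), round(2 * math.log2(n), 2))
```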

The clique number of Erdös-Rényi random graphs is very well understood. In this post I would like to prove the following result that gives the asymptotic order of magnitude of the clique number.

Theorem 1 The normalized clique number {\omega(G_n) / \log_2(n)} tends to {2} in probability.

Proof: The proof is divided into two parts. First, using a union bound, we show that the clique number cannot be too big. Then, using the second moment method, we show that the clique number cannot be too small.

For the first part one has

\displaystyle \begin{array}{rcl} \mathop{\mathbb P}(\omega(G_n) \geq k) & = & \mathop{\mathbb P}( \exists S \subset [n], \; \text{s.t.} \; |S|=k \; \text{and} \; S \; \text{is a clique in} \; G_n) \\ & \leq & \sum_{S \subset [n], |S|=k} \mathop{\mathbb P}(S \; \text{is a clique in} \; G_n) \notag \\ & = & {n \choose k} \left(\frac12\right)^{{k \choose 2}} \\ & \leq & n^k 2^{- \frac{k(k-1)}{2}} \\ & = & 2^{k \log_2(n) - \frac{k(k-1)}{2}}. \end{array}

Thus, {\forall \epsilon >0}, {\mathop{\mathbb P}(\omega(G_n) \geq (2+\epsilon) \log_2(n)) \rightarrow_{n \rightarrow +\infty} 0}.
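Indeed, ignoring integer rounding issues, for {k = (2+\epsilon) \log_2(n)} the exponent in the last display equals

\displaystyle k \log_2(n) - \frac{k(k-1)}{2} = k \left( \log_2(n) - \frac{k-1}{2} \right) = k \left( \frac12 - \frac{\epsilon}{2} \log_2(n) \right),

which tends to {-\infty} as {n} tends to infinity.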

For the second part it is useful to introduce some notation. Let {X_S = 1\{S \; \text{is a clique in} \; G_n\}}, and {Y_k = \sum_{S \subset [n], |S|=k} X_S}. In particular we have that {\omega(G_n) < k} iff {Y_k=0}. Thus we want to show that for {k=(2-\epsilon) \log_2(n)} one has {\mathop{\mathbb P}(Y_k = 0) \rightarrow_{n \rightarrow +\infty} 0}. Now, since the event {\{Y_k = 0\}} is included in {\{(Y_k - \mathop{\mathbb E} Y_k)^2 \geq (\mathop{\mathbb E} Y_k)^2\}}, Markov's inequality yields

\displaystyle \begin{array}{rcl} \mathop{\mathbb P}(Y_k=0) & \leq & \mathop{\mathbb P}((Y_k - \mathop{\mathbb E} Y_k)^2 \geq (\mathop{\mathbb E} Y_k)^2) \\ & \leq & \frac{\text{Var}(Y_k)}{(\mathop{\mathbb E} Y_k)^2}. \end{array}

Note that by linearity of the expectation, {\mathop{\mathbb E} Y_k = {n \choose k} \mathop{\mathbb E} X_S}. Furthermore we can also write

\displaystyle \text{Var}(Y_k) = \sum_{S \subset [n], |S|=k} \text{Var}(X_S) + \sum_{S, T \subset [n], |S|=|T|=k, S \neq T} \left( \mathop{\mathbb E} X_S X_T - (\mathop{\mathbb E} X_S) (\mathop{\mathbb E} X_T) \right) .
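This is simply the expansion of the variance of a sum by bilinearity of the covariance:

\displaystyle \text{Var}(Y_k) = \sum_{S, T \subset [n], |S|=|T|=k} \text{Cov}(X_S, X_T),

where the diagonal terms {S = T} give the first sum and the off-diagonal terms give the second.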

Since the {X_S} are {\{0,1\}}-valued random variables we have {\text{Var}(X_S) \leq \mathop{\mathbb E} X_S^2 = \mathop{\mathbb E} X_S} and thus

\displaystyle \frac{\sum_{S \subset [n], |S|=k} \text{Var}(X_S)}{\left( {n \choose k} \mathop{\mathbb E} X_S \right)^2} \leq \frac{\sum_{S \subset [n], |S|=k} \mathop{\mathbb E} X_S}{\left( {n \choose k} \mathop{\mathbb E} X_S \right)^2} = \frac{1}{{n \choose k} \mathop{\mathbb E} X_S},

and the latter quantity tends to {0} for {k=(2-\epsilon) \log_2(n)}: indeed {{n \choose k} \mathop{\mathbb E} X_S \geq (n/k)^k 2^{-\frac{k(k-1)}{2}} = 2^{k \left(\log_2(n/k) - \frac{k-1}{2}\right)}}, and for {k = (2-\epsilon) \log_2(n)} the exponent is of order {\frac{\epsilon}{2} k \log_2(n)}, which tends to {+\infty}. Thus it remains to show that the following quantity tends to {0}:

\displaystyle \frac{\sum_{S, T \subset [n], |S|=|T|=k, S \neq T} \mathop{\mathbb E} X_S X_T - (\mathop{\mathbb E} X_S) (\mathop{\mathbb E} X_T)}{\left({n \choose k} \mathop{\mathbb E} X_S\right)^2} .

First note that, by the independence of the edges, for {S, T} with {|S \cap T| \leq 1} the random variables {X_S} and {X_T} are independent, so that in the numerator of the above quantity one can restrict to {S, T} with {|S \cap T| \geq 2}. Moreover, since the subtracted terms {(\mathop{\mathbb E} X_S) (\mathop{\mathbb E} X_T)} are nonnegative, it suffices to bound the sum of the {\mathop{\mathbb E} X_S X_T} over these pairs. Now by symmetry we have (with {S_0} being an arbitrary subset of {k} vertices)

\displaystyle \begin{array}{rcl} && \sum_{S, T \subset [n], |S|=|T|=k, S \neq T, |S \cap T| \geq 2} \mathop{\mathbb E} X_S X_T \\ && = \sum_{S, T \subset [n], |S|=|T|=k, S \neq T, |S \cap T| \geq 2} \mathop{\mathbb P}(X_S = 1 \; \text{and} \; X_T = 1) \\ && = \sum_{S, T \subset [n], |S|=|T|=k, S \neq T, |S \cap T| \geq 2} \mathop{\mathbb P}(X_S = 1) \mathop{\mathbb P}(X_T =1 | X_S = 1) \\ && = \left( \sum_{T \subset [n], |T|=k, T \neq S_0, |S_0 \cap T| \geq 2} \mathop{\mathbb P}(X_T = 1 | X_{S_0} = 1) \right) \left(\sum_{S \subset [n], |S|=k} \mathop{\mathbb P}(X_S=1) \right) \\ && = \left( \sum_{T \subset [n], |T|=k, T \neq S_0, |S_0 \cap T| \geq 2} \mathop{\mathbb P}(X_T = 1 | X_{S_0} = 1) \right) \left({n \choose k} \mathop{\mathbb E} X_S \right) . \end{array}

Thus we are now left with proving that the following quantity goes to {0}:

\displaystyle \frac{\sum_{T \subset [n], |T|=k, T \neq S_0, |S_0 \cap T| \geq 2} \mathop{\mathbb P}(X_T = 1 | X_{S_0} = 1)}{{n \choose k} \left(\frac12 \right)^{{k \choose 2}}} . \ \ \ \ \ (1)

Since, given that {S_0} is a clique, {T} is a clique if and only if the {{k \choose 2} - {|T \cap S_0| \choose 2}} pairs of vertices of {T} that are not both in {S_0} are all edges, one has

\displaystyle \mathop{\mathbb P}(X_T = 1 | X_{S_0} = 1) = \left( \frac12 \right)^{{k \choose 2} - {|T \cap S_0| \choose 2}} ,

Since there are {{k \choose s} {n-k \choose k-s}} sets {T \subset [n]} of size {k} with {|T \cap S_0| = s}, this shows that (1) is upper bounded by

\displaystyle \sum_{s=2}^k \frac{{k \choose s} {n-k \choose k-s}}{{n \choose k}} 2^{{s \choose 2}} .

An unpleasant calculation then shows that this sum tends to {0} for {k = (2-\epsilon) \log_2(n)}, which finishes the proof. \Box
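For readers who want to avoid the unpleasant calculation, here is a small Python snippet (not part of the original argument) that evaluates the sum above for {k = \lfloor (2-\epsilon) \log_2(n) \rfloor} with {\epsilon = 1/2}; one can check that the printed values decrease towards {0} as {n} grows, consistent with the claim.

```python
import math

def second_moment_sum(n, eps=0.5):
    """Evaluate sum_{s=2}^{k} C(k,s) * C(n-k,k-s) / C(n,k) * 2^{C(s,2)}
    for k = floor((2 - eps) * log2(n)).  Requires Python 3.8+ (math.comb)."""
    k = int((2 - eps) * math.log2(n))
    return k, sum(
        math.comb(k, s) * math.comb(n - k, k - s) / math.comb(n, k)
        * 2 ** (s * (s - 1) // 2)
        for s in range(2, k + 1)
    )

if __name__ == "__main__":
    for n in (10**2, 10**3, 10**4, 10**5):
        k, value = second_moment_sum(n)
        print(f"n = {n:>6}, k = {k:>2}, sum = {value:.3e}")
```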
