Giant component: final remarks

The past three lectures were devoted to the giant component theorem:

Theorem Let C_v be the connected component of G(n, \frac{c}{n}) that contains v\in [n].

  1. If c<1, then \max_v|C_v|= O(\log n) in probability.
  2. If c>1, then \max_v|C_v|\sim (1-\rho) n in probability, for some 0<\rho<1.
  3. If c=1, then \max_v|C_v|\sim n^{2/3} in distribution.

We proved only the first (subcritical) and second (supercritical) cases: our presentation was largely inspired by the treatment of Janson, Luczak, and Rucinski and Durrett. We have omitted the critical case, however, as the last two lectures of the semester will be on another topic. The goal of this post is to provide some final remarks and references on the giant component theorem.

Retrospective

At first sight, the “double jump” in the giant component theorem looks quite shocking. In hindsight, however, this does not seem quite so miraculous, as it mirrors an elementary phenomenon that is covered in many introductory probability courses: given a (nice) random walk S_t with initial condition S_0=a, define the hitting time \tau:=\inf\{t:S_t=b\} for some b<a. Then there are three cases:

  1. If S_t has negative drift, then \mathbb{E}[\tau]<\infty. In fact, the random variable \tau has a light (exponential) tail.
  2. If S_t has positive drift, then \mathbb{P}[\tau=\infty]>0.
  3. If S_t has zero drift, then \tau<\infty a.s. but \mathbb{E}[\tau]=\infty. That is, the random variable \tau has a heavy tail.

This “double jump” in the behavior of the hitting probabilities of a random walk is directly analogous to the behavior of the connected components of an Erdös-Rényi graph, and this was indeed the basic idea behind the proofs given in the previous lectures. Of course, it remains a bit of a miracle that the random walk approximation of the exploration process, which only holds for small times, is sufficiently powerful that it describes so completely the behavior of the random graph.

The critical case

In the subcritical case, the size of the largest component is of order \log n because the hitting time of a random walk with negative drift has an exponential tail: that is, we proved

    \[\mathbb{P}[\tau\ge k] \le e^{-\alpha k}    \quad\Longrightarrow\quad    \mathbb{P}\Big[\max_v|C_v| \ge \gamma\log n\Big] \le    n\,\mathbb{P}[\tau\ge\gamma\log n] \le n^{1-\alpha\gamma}\]

which goes to zero as n\to\infty for \gamma>1/\alpha.

Similarly, we would expect that we can obtain the size of the largest component in the critical case if we understand the heavy tail behavior of the hitting time of a random walk with zero drift. This is in fact the case. Indeed, when there is zero drift, one can show that

    \[P[\tau\ge k] \lesssim \frac{1}{\sqrt{k}}.\]

The crude union bound argument used above now does not give the correct answer, but an only slightly better argument is needed. Indeed, note that

    \begin{eqnarray*}     \mathbb{P}\Big[\max_v|C_v| \ge \gamma n^{2/3}\Big] &\le&     \mathbb{P}[|\{v:|C_v|\ge \gamma n^{2/3}\}|\ge \gamma n^{2/3}] \\     &\le&     \frac{\mathbb{E}[|\{v:|C_v|\ge \gamma n^{2/3}\}|]}{\gamma n^{2/3}} \\     &=&     \frac{n\,\mathbb{P}[\tau\ge \gamma n^{2/3}]}{\gamma n^{2/3}}     \lesssim \frac{1}{\gamma^{3/2}}. \end{eqnarray*}

Therefore, n^{-2/3}\max_v|C_v|=O(1). With some further work, a corresponding lower bound can also be proved. See the paper by Nachmias and Peres or the notes by van der Hofstad for the details.

It turns out that in the critical case n^{-2/3}\max_v|C_v| is not only bounded in probability, but in fact converges weakly to some limiting distribution. This distribution, and much more, is beautifully described by Aldous in terms of Brownian excursions. This is an interesting example of the application of stochastic analysis to discrete probability; unfortunately, we do not have the time to cover it.

In a different direction, it turns out that various additional phase transitions appear when we consider a finer scaling, for example, in the “critical window” c=1\pm c' n^{-1/3}. For an overview of the various transitions, see, for example, section 11.1 in Alon and Spencer.

Connectivity threshold

Rather than considering the size of the largest component, one could ask when the entire Erdös-Rényi graph is connected. Note that when 1<c<\infty, the constant \rho in the size (1-\rho)n of the giant component is always strictly positive, so the graph is not connected. Therefore, in order for the entire graph to be connected, we must let c\to\infty (that is, the edge probability p must be superlinear). It turns out that the appropriate scaling for this question is c\sim \log n, and another phase transition arises here.

Theorem. Let p=\frac{a\log n}{n}. If a>1, then the Erdös-Rényi graph G(n,p) is connected with probability tending to 1 as n\to\infty. If a<1, the graph is connected with probability tending to 0 as n\to\infty.

To get some intuition, consider the probability that a vertex v is isolated (that is, disconnected from every other vertex):

    \[\mathbb{P}[v\text{ is isolated}] =    \mathbb{P}[\eta_{vw}=0\text{ for all }w\ne v] =    \bigg(1-\frac{c}{n}\bigg)^{n-1}\sim e^{-c}.\]

Thus for c=a\log n, we have

    \[\mathbb{E}[|\{v:v\text{ is isolated}\}|] \sim    n \,e^{-a\log n} = n^{1-a}.\]

In particular, if a<1, then there must exist an isolated vertex with positive probability as n\to\infty, in which case the graph is not connected (in fact, it is not hard to show that the variance of the number of isolated components is of the same order as its mean, so that the probability that the graph is connected tends to zero). Somewhat miraculously, it turns out that when there are no isolated vertices, then the graph must already be connected, so that we do indeed obtain the sharp transition described in the Theorem above. For a proof of this fact (by a clever combinatorial argument) see, for example, the lecture notes by van der Hofstad. Alternatively, one can use a random walk argument entirely in the spirit of the proofs in the previous lectures to prove that the random graph is connected for a>1: by running simultaneous exploration processes from different vertices as we did in the proof of the supercritical case, one can show that all connected components must intersect when a>1 and thus the entire graph must be connected. See section 2.8 in Durrett for such an argument.

28. April 2013 by Ramon van Handel
Categories: Random graphs | Leave a comment

css.php