## Lecture 6. Giant component (3)

Let us begin with a brief recap from the previous lecture. We consider the Erdős–Rényi random graph $G(n, \lambda/n)$ in the supercritical case $\lambda > 1$. Recall that $C_v$ denotes the connected component of the graph that contains the vertex $v$. Our goal is to prove the existence of the giant component with size $\pi_\lambda n\,(1 + o(1))$, where $\pi_\lambda$ denotes the survival probability of a Poisson($\lambda$) branching process, while the remaining components have size $O(\log n)$.

Fix $k$ sufficiently large (to be chosen in the proof), and define the set

$$H = \{v : |C_v| \ge k \log n\}$$

of vertices contained in “large” components. The proof consists of two parts:

**Part 1:** $\mathbf{P}[\text{all vertices in } H \text{ lie in the same connected component}] \to 1$ as $n \to \infty$.

**Part 2:** $|H|/n \to \pi_\lambda$ in probability.

Part 1 states that all the sufficiently large components must intersect, forming the giant component. Part 2 counts the number of vertices in the giant component. Part 2 was proved in the previous lecture. The goal of this lecture is to prove Part 1, which completes the proof of the giant component.

**Overview**

As in the previous lectures, the central idea in the study of the giant component is the *exploration process* $(X_t)_{t \ge 0}$ of the component $C_v$, where $X_t$ denotes the number of active vertices at time $t$ and

$$|C_v| = \inf\{t \ge 1 : X_t = 0\}.$$

We have seen that $X_t \approx S_t$ when $t$ and $X_t$ are small compared to $n$, where $(S_t)$ is a random walk with increments

$$S_t - S_{t-1} = \xi_t - 1, \qquad \xi_t \sim \mathrm{Poisson}(\lambda) \text{ i.i.d.}$$

When $\lambda > 1$, we have $\mathbf{E}[\xi_t - 1] = \lambda - 1 > 0$. Thus $X_t$ is approximately a random walk with *positive drift*. The intuitive idea behind the proof of Part 1 is as follows. Initially, the random walk can hit $0$ rapidly, in which case the component is small. However, if the random walk drifts away from zero, then with high probability it will never hit zero, in which case the component must keep growing until the random walk approximation is no longer accurate. Thus there do not exist any components of intermediate size: each component is either very small ($|C_v| < k\log n$) or very large (we will show $|C_v| > n^{2/3}$, but the precise exponent is not important).
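This dichotomy is easy to see in a small simulation (an illustrative sketch, not part of the proof; the parameter values are arbitrary) of the approximating random walk with $\mathrm{Poisson}(\lambda) - 1$ increments started at $1$: each run either hits zero within the first few steps or drifts away and never returns.

```python
import math
import random

random.seed(0)

def poisson(lam):
    # Knuth's method: sample a Poisson(lam) variable from uniform variables.
    threshold = math.exp(-lam)
    k, prod = 0, 1.0
    while prod > threshold:
        k += 1
        prod *= random.random()
    return k - 1

def hits_zero(lam, horizon):
    # Random walk X_t = 1 + sum_{s<=t} (xi_s - 1), xi_s ~ Poisson(lam).
    # Returns True if the walk hits 0 within `horizon` steps.
    x = 1
    for _ in range(horizon):
        x += poisson(lam) - 1
        if x == 0:
            return True
    return False

lam, trials, horizon = 2.0, 4000, 300
frac_dead = sum(hits_zero(lam, horizon) for _ in range(trials)) / trials
# For lam = 2 the hitting probability is the extinction probability of a
# Poisson(2) branching process, the root of q = exp(lam*(q-1)) (~ 0.203).
print(f"fraction of walks that hit zero: {frac_dead:.3f}")
```

The runs that survive the first few hundred steps have drifted linearly away from zero, matching the picture described above.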

We now want to argue that any pair of large components must necessarily intersect. Consider two disjoint sets $A$ and $B$ of vertices with $|A|, |B| \ge m$. As each edge is present in the graph with probability $\lambda/n$, the probability that there is no edge between $A$ and $B$ is

$$\mathbf{P}[\text{no edge between } A \text{ and } B] = \Big(1 - \frac{\lambda}{n}\Big)^{|A||B|} \le e^{-\lambda m^2/n},$$

which is small when $m \gg \sqrt{n}$. We therefore expect that any pair of large components must intersect with high probability. The problem with this argument is that we assumed that the sets $A$ and $B$ are nonrandom, while the random sets $C_v$ and $C_w$ themselves depend on the edge structure of the random graph (so the events $\{C_v = A\}$ and $\{C_w = B\}$ are highly correlated). To actually implement this idea, we therefore need a slightly more sophisticated approach.
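As a quick numerical sanity check on this estimate (the specific values of $n$, $\lambda$ and $m$ below are arbitrary, chosen only for illustration):

```python
import math

n, lam = 10**6, 2.0
p = lam / n

# P[no edge between disjoint A, B with |A| = |B| = m] = (1 - p)^(m^2),
# which is bounded by exp(-lam * m^2 / n) since 1 - x <= exp(-x).
for m in (1000, 5000, 10000, 18000):    # from m = sqrt(n) upward
    exact = (1 - p) ** (m * m)
    bound = math.exp(-lam * m * m / n)
    print(f"m = {m:>5}: (1-p)^(m^2) = {exact:.3e} <= exp(-lam*m^2/n) = {bound:.3e}")
```

Note that for $m = \sqrt{n}$ the bound is only a constant ($e^{-\lambda}$); it becomes tiny once $m \gg \sqrt{n}$, which is why the proof below works with active sets of size of order $n^{2/3}$.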

To make the proof work, we revisit more carefully our earlier random walk argument. The process $X_t$ has positive drift as long as the number $t + X_t$ of removed and active vertices is much smaller than $n$, and in particular as long as $t + X_t = O(n^{2/3})$. Thus the process $(X_t)_{t \le n^{2/3}}$ is still approximately a random walk with positive drift! Applying the above intuition, either $X_t$ dies rapidly (the component is small), or $X_t$ grows linearly in $t$ as is illustrated in the following figure:

This means that the exploration process for a component of size $\ge k \log n$ will not only grow large ($|C_v| > n^{2/3}$) with high probability, but that the exploration process will also possess a large number of active vertices ($X_{n^{2/3}} \ge \delta n^{2/3}$). To prove that all large components intersect, we will run different exploration processes simultaneously starting from different vertices. We will show that if two of these processes reach a large number of active vertices, then there must be an edge between them with high probability, and thus the corresponding components must coincide. This resolves the dependence problem in our naive argument, as the edges between the sets of active vertices have not yet been explored and are therefore independent of the history of the exploration processes.

**The component size dichotomy**

We now begin the proof in earnest. We will first show the dichotomy between large and small components: either the component is small ($|C_v| < k \log n$), or the number of active vertices grows linearly up to time $n^{2/3}$. To be precise, fix $\delta := \frac{1}{2}\min\{\lambda - 1,\, 1\}$, and consider the following event:

$$\mathcal{E}_v = \big\{|C_v| < k \log n\big\} \cup \big\{X_t \ge \delta t \text{ for all } k\log n \le t \le n^{2/3}\big\},$$

where $X_t$ denotes the exploration process started at $v$ (with the convention $X_t = 0$ for $t \ge |C_v|$). Our goal is to show that $\mathbf{P}[\mathcal{E}_v]$ is large.

Define the stopping time

$$\tau = \inf\{t \ge k \log n : X_t < \delta t\}.$$

We can write

$$\mathbf{P}[\mathcal{E}_v^c] = \mathbf{P}\big[|C_v| \ge k \log n,\ \tau \le n^{2/3}\big].$$

Now suppose $|C_v| \ge k \log n$ and $\tau = t \le n^{2/3}$. Then $X_s > 0$ for all $s < t$, as the exploration process is alive at time $k \log n$ (because $|C_v| \ge k \log n$) and stays alive until time $t$ (because $X_s \ge \delta s > 0$ for $k \log n \le s < t$, by the definition of $\tau$). We can therefore write

$$\mathbf{P}[\mathcal{E}_v^c] \le \sum_{t = k \log n}^{n^{2/3}} \mathbf{P}\big[X_s > 0 \text{ for all } s < t,\ X_t < \delta t\big].$$

To bound the probabilities inside the sum, we compare $X_t$ to a suitable random walk.

**The random walk argument**

To bound the probability that the exploration process is alive at time $t$ but $X_t < \delta t$, we must introduce a comparison random walk $(Y_t)$ that lies *beneath* $(X_t)$. We use the same construction as was used in the previous lecture. Let

$$Y_t = 1 + \sum_{s=1}^{t} (\eta_s - 1), \qquad \eta_s = \sum_{i \in I_s} \mathbf{1}_{\{\zeta^s_i = 1\}},$$

where $\zeta^s_i$, $i, s \ge 1$ are i.i.d. Bernoulli$(\lambda/n)$ random variables (for $i \le N_{s-1} := n - (s-1) - X_{s-1}$, the number of neutral vertices at time $s$, these are the same variables used to generate the increment $\xi_s$ of the exploration process), and $I_s$ is the set of the first $n - 2n^{2/3}$ components of $(\zeta^s_1, \zeta^s_2, \ldots)$ (if $N_{s-1} < n - 2n^{2/3}$, then there are not enough neutral vertices and thus some of these variables do not correspond to an edge of the graph; then we simply add independent variables $\zeta^s_i$).

As in the previous lecture, we have:

- $(Y_t)$ is a random walk with i.i.d. $\mathrm{Bin}(n - 2n^{2/3}, \lambda/n) - 1$ increments.
- $X_s \ge Y_s$ for all $s \le t$ whenever the exploration process is alive at time $t$ and $t + X_t \le 2n^{2/3}$ (indeed, the number $s + X_s$ of removed and active vertices is nondecreasing in $s$, so there are at least $n - 2n^{2/3}$ neutral vertices at every time $s \le t$).

Now suppose that $X_s > 0$ for all $s < t$ and $X_t < \delta t$ for some $t \le n^{2/3}$. Then

$$t + X_t \le (1 + \delta)\, n^{2/3} \le 2 n^{2/3},$$

so that $X_t \ge Y_t$ by the above comparison. We therefore obtain for $k \log n \le t \le n^{2/3}$

$$\mathbf{P}\big[X_s > 0 \text{ for all } s < t,\ X_t < \delta t\big] \le \mathbf{P}[Y_t < \delta t].$$

Thus computing $\mathbf{P}[\mathcal{E}_v^c]$ reduces to computing the tail probability of a random walk (or, in less fancy terms, of a sum of i.i.d. random variables). That is something we know how to do.

**Lemma (Chernoff bound).** Let $Z, Z_1, Z_2, \ldots$ be i.i.d. random variables and let $a < \mathbf{E}[Z]$. Then

$$\mathbf{P}\Bigg[\sum_{s=1}^{t} Z_s \le a t\Bigg] \le e^{-t\, I(a)}, \qquad I(a) = \sup_{\beta \ge 0}\big\{-\beta a - \log \mathbf{E}[e^{-\beta Z}]\big\}.$$

**Proof.** Let $\beta \ge 0$. Then by Markov's inequality

$$\mathbf{P}\Bigg[\sum_{s=1}^{t} Z_s \le a t\Bigg] = \mathbf{P}\big[e^{-\beta \sum_{s \le t} Z_s} \ge e^{-\beta a t}\big] \le e^{\beta a t}\,\mathbf{E}[e^{-\beta Z}]^{t}.$$

The result follows by optimizing over $\beta \ge 0$.
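To see the lemma in action, we can compare the bound with an exact binomial lower tail (a numerical sketch with arbitrarily chosen parameters, not from the lecture; it uses the fact that a sum of $t$ i.i.d. $\mathrm{Bin}(m, p)$ variables is itself $\mathrm{Bin}(mt, p)$):

```python
import math

# Z ~ Bin(m, p) with E[Z] = 2; bound P[Z_1 + ... + Z_t <= a t] for a < E[Z].
m, p, t, a = 50, 0.04, 400, 1.5
N = m * t                    # sum of t i.i.d. Bin(m, p) variables is Bin(N, p)
cutoff = int(a * t)

def log_pmf(k):              # log of the Bin(N, p) probability mass at k
    return (math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
            + k * math.log(p) + (N - k) * math.log(1 - p))

exact_tail = sum(math.exp(log_pmf(k)) for k in range(cutoff + 1))

# Rate function I(a) = sup_{b>=0} {-b*a - log E[exp(-b*Z)]}, where
# E[exp(-b*Z)] = (1 - p + p*exp(-b))^m.  A grid maximum can only be smaller
# than the true supremum, so it only weakens the bound.
rate = max(-b * a - m * math.log(1 - p + p * math.exp(-b))
           for b in (i / 100 for i in range(1, 500)))
chernoff = math.exp(-t * rate)

print(f"exact tail = {exact_tail:.3e}, Chernoff bound = {chernoff:.3e}")
```

The exact tail indeed sits below the Chernoff bound, and both are exponentially small in $t$, which is all the proof requires.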

Note that $\mathbf{E}[\eta_1 - 1] = (n - 2n^{2/3})\frac{\lambda}{n} - 1 \to \lambda - 1 > \delta$ as $n \to \infty$. We therefore have by the Chernoff bound

$$\mathbf{P}[Y_t < \delta t] \le \mathbf{P}\Bigg[\sum_{s=1}^{t} (\eta_s - 1) \le \delta t\Bigg] \le e^{-ct}$$

for all $t$ and all sufficiently large $n$ (here $c > 0$ depends only on $\lambda$ and $\delta$). In particular, we have

$$\mathbf{P}[Y_t < \delta t] \le n^{-4} \qquad \text{for all } t \ge k \log n,$$

provided $k$ is sufficiently large. Thus we can estimate

$$\mathbf{P}[\mathcal{E}_v^c] \le \sum_{t = k \log n}^{n^{2/3}} \mathbf{P}[Y_t < \delta t] \le n^{2/3}\, n^{-4} \le n^{-3},$$

which goes to zero as $n \to \infty$ provided that $k$ is chosen sufficiently large. In particular, the component size dichotomy follows: choosing any such $k$, we obtain

$$\mathbf{P}\big[k \log n \le |C_v| \le n^{2/3} \text{ for some vertex } v\big] \le n \max_{v} \mathbf{P}[\mathcal{E}_v^c] \le n^{-2} \xrightarrow{n \to \infty} 0.$$
**Remark:** Unlike in the proof of Part 2 in the previous lecture, here we *do* need to choose $k$ sufficiently large for the proof to work. If $k$ is too small, then the random walk cannot move sufficiently far away from zero to ensure that it will never return. This is not merely an artifact of the proof: even in the supercritical case, the second largest component has size of order $\log n$.
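The dichotomy is easy to observe in simulation (an illustrative experiment with arbitrary parameters, not part of the proof): in a supercritical sample, the largest component occupies a positive fraction of the vertices, while every other component is tiny.

```python
import random
from collections import deque

random.seed(1)
n, lam = 2000, 2.0
p = lam / n

# sample the edges of G(n, p)
adj = [[] for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            adj[i].append(j)
            adj[j].append(i)

# collect all connected component sizes by breadth-first search
seen = [False] * n
sizes = []
for v in range(n):
    if seen[v]:
        continue
    seen[v] = True
    queue, size = deque([v]), 1
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if not seen[w]:
                seen[w] = True
                size += 1
                queue.append(w)
    sizes.append(size)

sizes.sort(reverse=True)
# For lam = 2 the giant component has ~ pi_2 * n vertices with pi_2 ~ 0.797,
# while the second largest component has size of order log n.
print("largest:", sizes[0], " second largest:", sizes[1])
```

There are no components of intermediate size: the gap between the largest and the second largest component is dramatic even at this modest value of $n$.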

**Large components must intersect**

To complete the proof, it remains to show that all large components must intersect. To do this, we will run several exploration processes at once starting from different vertices. If the sets of active vertices of two of these processes grow large, then there must be an edge between them with high probability, and thus the corresponding components intersect. Let us make this argument precise.

In the following, we denote by $X_t^v$ the exploration process started at the vertex $v$, and by $A_t^v$ its set of active vertices at time $t$ (so that $X_t^v = |A_t^v|$). For each such process, we define the corresponding event that we have investigated above:

$$\mathcal{E}_v = \big\{|C_v| < k \log n\big\} \cup \big\{X^v_t \ge \delta t \text{ for all } k\log n \le t \le n^{2/3}\big\}.$$

We have shown above that, provided $k$ is sufficiently large, we have

$$\mathbf{P}[\mathcal{E}_v^c] \le n^{-3} \qquad \text{for every vertex } v.$$
We can therefore estimate

$$\mathbf{P}[\exists\, v, w \in H : C_v \ne C_w] \le \mathbf{P}\Bigg[\bigcup_{v} \mathcal{E}_v^c\Bigg] + \sum_{v, w} \mathbf{P}\big[\mathcal{E}_v \cap \mathcal{E}_w \cap \{v, w \in H\} \cap \{C_v \ne C_w\}\big] \le n^{-2} + \sum_{v, w} \mathbf{P}\big[\mathcal{E}_v \cap \mathcal{E}_w \cap \{v, w \in H\} \cap \{C_v \ne C_w\}\big].$$
Now note that by time $n^{2/3}$, the exploration process started at $v$ has only explored edges $\{u, u'\}$ where $u \in C_v$ has already been removed (or likewise $u' \in C_v$), and similarly for the process started at $w$. As active vertices have not been removed, it follows that when the two processes explore disjoint sets of vertices,

$$\text{no edge between } A^v_{n^{2/3}} \text{ and } A^w_{n^{2/3}} \text{ has been explored by either process by time } n^{2/3}.$$

In particular, if $A, B$ are disjoint subsets of vertices, then the unexplored edges are still independent Bernoulli$(\lambda/n)$ variables, so

$$\mathbf{P}\big[\text{no edge of the graph between } A \text{ and } B \,\big|\, A^v_{n^{2/3}} = A,\ A^w_{n^{2/3}} = B\big] \le \Big(1 - \frac{\lambda}{n}\Big)^{|A||B|}.$$
On the other hand, $C_v \ne C_w$ implies that the two exploration processes must be disjoint at every time $t$. Thus if $C_v \ne C_w$, there can be no edges between vertices in $A^v_t$ and $A^w_t$ at any time $t$ (if such an edge exists, then the vertices connected by this edge will eventually be explored by both exploration processes, and then the sets of removed vertices will no longer be disjoint). Therefore, as $X^v_{n^{2/3}}, X^w_{n^{2/3}} \ge \delta n^{2/3}$ on the event $\mathcal{E}_v \cap \mathcal{E}_w \cap \{v, w \in H\}$,

$$\mathbf{P}\big[\mathcal{E}_v \cap \mathcal{E}_w \cap \{v, w \in H\} \cap \{C_v \ne C_w\}\big] \le \Big(1 - \frac{\lambda}{n}\Big)^{\delta^2 n^{4/3}} \le e^{-\lambda \delta^2 n^{1/3}}.$$

Thus we finally obtain

$$\mathbf{P}[\exists\, v, w \in H : C_v \ne C_w] \le n^{-2} + n^2\, e^{-\lambda \delta^2 n^{1/3}} \xrightarrow{n \to \infty} 0,$$
and the proof of the giant component theorem is complete.

*Many thanks to Quentin Berthet for scribing this lecture!*