Lecture 1. Introduction

There are many interesting models of random graphs, and there are a lot of different questions that can be asked about them. In this seminar we will only have the time to cover a few of the most classical results. The goal of this first lecture is to give an informal overview of what we intend to cover this semester.

What is a random graph?

Let {G=(V,E)} be a graph, where {V} is the set of vertices and {E} is the set of edges (that is, a set of pairs of vertices). We will consider only simple graphs, that is, graphs with undirected edges and with no self-edges. The graph structure is compactly described by the adjacency matrix {H=(\eta_{ij})_{i,j\in V}}, where {\eta_{ij}=\mathbf{1}_{(i,j)\in E}}. Thus {\eta_{ii}=0} and {\eta_{ij}=\eta_{ji}} for each {i,j\in V}.

Large, complex graphs and networks appear everywhere (the internet, social networks, and so on) and are often formed by some random process. One natural way to construct large complex graphs with interesting structure is to choose the adjacency matrix randomly, which yields random graphs. Such models exhibit many fascinating probabilistic phenomena.

Erdös-Rényi graphs

Let {G=(V,E)} be a random graph on {|V|=n} vertices, such that {(\eta_{ij})_{i<j}} is a collection of i.i.d. Bernoulli random variables with {\mathbf{P}[\eta_{ij}=1]=p}, {\mathbf{P}[\eta_{ij}=0]=1-p}. Such a random graph is called an Erdös-Rényi graph, and we denote this model by {G(n,p)}.
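To make the definition concrete, here is a minimal sketch of how one sample of {G(n,p)} could be drawn by filling in the matrix {(\eta_{ij})} entry by entry. The function names `sample_gnp` and `edge_count` and the `seed` parameter are illustrative choices, not part of the lecture.

```python
import random


def sample_gnp(n, p, seed=None):
    """Return the 0/1 matrix (eta_ij) of one sample of G(n, p) as a list of lists."""
    rng = random.Random(seed)
    eta = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The entries eta_ij with i < j are i.i.d. Bernoulli(p).
            if rng.random() < p:
                eta[i][j] = 1
                eta[j][i] = 1  # undirected edges: eta_ij = eta_ji
    return eta  # the diagonal stays 0: no self-edges


def edge_count(eta):
    """Number of edges |E|, i.e. the sum of eta_ij over i < j."""
    n = len(eta)
    return sum(eta[i][j] for i in range(n) for j in range(i + 1, n))


if __name__ == "__main__":
    eta = sample_gnp(6, 0.5, seed=0)
    for row in eta:
        print(row)
    print("edges:", edge_count(eta))
```

The double loop costs on the order of {n^2} random draws, which is fine for small experiments; for sparse graphs one would probably prefer a neighbor-list representation.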

The Erdös-Rényi graph is the simplest random graph model: for example, in a group of {n} people, each pair are friends, independently, with probability {p}. Unfortunately, this is not a realistic model of most real-world networks. On the other hand, such models are used in engineering applications (for example in coding theory), and the theory of even this simple model is very rich and should be understood before more complicated models can be tackled. This model was first studied systematically, albeit in a slightly different form, by Erdös and Rényi in the late 1950s. More complicated models, such as the small world and preferential attachment models, capture certain features of real-world graphs that are not well described by the Erdös-Rényi model; see the book by Durrett or the lecture notes by van der Hofstad, for example. Unfortunately, we will not have time to cover such models in this seminar.

The general question we will aim to address is: “What does a large Erdös-Rényi graph look like?” This question is rather vague. It turns out that there are two basic regimes in which different questions are of interest. In order to describe these two regimes, let us define the “complexity” of a graph {G=(V,E)} as

\displaystyle \mathrm{complexity}(G)=|E|-|V|+1.

This definition is really intended for connected graphs, where it is a meaningful quantity.

[Figure: lecture1-complexity]

For an Erdös-Rényi random graph {G(n,p)} we have

\displaystyle \mathbf{E}[\mathrm{complexity}(G(n,p))]=\binom{n}{2}p-n+1,

since {|E| = \sum_{i<j} \eta_{ij}}.  Of course, we do not know at this point whether the Erdös-Rényi graph is connected (this will be discussed below), but this formula can serve as an informal guide to motivate the two regimes of interest.

  1. Fixed edge probability. Fix {p\in(0,1)} and let {n} go to infinity. We expect such a graph to be very dense: on average, a fraction {p} of all possible edges is present, and each vertex is connected to {(n-1)p} other vertices. In this regime the expected complexity of the graph blows up, that is, {\mathbf{E}[\mathrm{complexity}(G(n,p))] \sim \frac{p}{2}n^2 \rightarrow \infty}. So, we have to choose questions to study that are interesting for a dense graph where the structure is rich: we need a “rich” way of measuring properties in this regime.
  2. Low complexity. Take {p \sim \frac{c}{n}} for some {c>0} and let {n} go to infinity. This is the largest edge probability that we can take so that the expected complexity of the graph does not blow up, since the number of edges and the number of vertices are of the same order. In this regime we get different behavior of the graphs depending on the value of {c}.

Clearly, the random graphs in regimes 1 and 2 will look quite different.
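As a quick numerical illustration of the two regimes, the sketch below simply evaluates the formula {\binom{n}{2}p-n+1} for a fixed {p} and for {p=c/n}. The values {p=1/2} and {c=3} are arbitrary choices made for this example only.

```python
from math import comb


def expected_complexity(n, p):
    """E[complexity(G(n, p))] = C(n, 2) * p - n + 1, as in the formula above."""
    return comb(n, 2) * p - n + 1


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        dense = expected_complexity(n, 0.5)       # regime 1: fixed p
        sparse = expected_complexity(n, 3.0 / n)  # regime 2: p = c/n with c = 3
        print(f"n={n:6d}  fixed p: {dense:14.1f}   p=c/n: {sparse:10.1f}")
```

For fixed {p} the value grows quadratically in {n}, while for {p=c/n} it grows at most linearly, in line with the remark that edges and vertices are then of the same order.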

Regime 1: Fixed edge probability

In this regime the graph is very rich. What questions are meaningful? Heuristically, a rich graph must contain interesting subgraphs, and must possess a complex combinatorial structure. We now state two results in this spirit that we are going to prove in the following lectures. The proofs of these results are nice examples of the use of the probabilistic method.

Clique number. A clique in a graph {G=(V,E)} is a complete subgraph, that is, a subset {W\subseteq V} such that {(i,j)\in E} for every {i,j\in W}, {i\neq j}. If {G} describes friendships, a clique is a group of mutual friends. Intuitively, a rich graph must contain a large clique. The clique number {\omega(G)} is the cardinality of the largest clique in {G}.

Theorem 1 Let {G(n,\frac{1}{2})} be an Erdös-Rényi graph with edge probability {p=\frac{1}{2}}. Then

\displaystyle \frac{\omega(G(n,\frac{1}{2}))}{2 \log_2 n} \overset{n\rightarrow \infty}{\longrightarrow} 1\quad\text{in probability}.

On a heuristic level, we see that this result makes sense since

\displaystyle \mathbf{E}[\mathrm{number~of~cliques~of~size~}k] =  \binom{n}{k}\left(\frac{1}{2}\right)^{\binom{k}{2}} \sim n^k 2^{-k^2/2} =  2^{k \log_2 n - k^2/2},

from which we see that {k\sim 2 \log_2 n} is the critical case. Of course, one could also study similar questions for other types of subgraphs, but we will not do that.
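The heuristic can be checked numerically. The sketch below evaluates the exact first moment {\binom{n}{k}2^{-\binom{k}{2}}} on a logarithmic scale and looks for the largest {k} at which it is still at least 1. The function names are illustrative, and this is only the first-moment computation, not a proof of Theorem 1.

```python
from math import comb, log2


def log2_expected_cliques(n, k):
    """log2 of E[number of cliques of size k] in G(n, 1/2)."""
    return log2(comb(n, k)) - comb(k, 2)


def first_moment_threshold(n):
    """Largest k such that the expected number of k-cliques is at least 1."""
    k = 1
    while k < n and log2_expected_cliques(n, k + 1) >= 0:
        k += 1
    return k


if __name__ == "__main__":
    for n in (2**10, 2**20, 2**40):
        print(n, first_moment_threshold(n), 2 * log2(n))
```

For moderate {n} the threshold found this way sits noticeably below {2\log_2 n}, because the lower-order terms hidden in the {\sim} above are not yet negligible; the ratio approaches 1 only slowly.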

Chromatic number. A coloring of {(V,E)} is a color assignment to each vertex in {V} such that for each {(i,j)\in E}, {i} and {j} have a different color. The chromatic number {\chi(G)} is the smallest number of colors needed to color {G}. It is clear that the chromatic number of a graph tells us something about the complexity of the graph.

[Figure: lecture1-chromatic]

Theorem 2 Let {G(n,\frac{1}{2})} be an Erdös-Rényi graph with edge probability {p=\frac{1}{2}}. Then

\displaystyle \frac{\chi(G(n,\frac{1}{2}))}{n/ (2 \log_2 n)} \overset{n\rightarrow \infty}{\longrightarrow} 1\quad\text{in probability}.

So, in the Erdös-Rényi graph the chromatic number grows as {n/ (2 \log_2 n)}, smaller than in a complete graph where the chromatic number is {n}, but much larger than in trees which can be colored with only 2 colors regardless of their size. That the same factor {2 \log_2 n} appears in both theorems above is not a coincidence: there is a relation between clique number and chromatic number that forms the basis for the proof.
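To get a feeling for the definition of a coloring, one can color a sample of {G(n,\frac{1}{2})} greedily: visit the vertices in order and give each one the smallest color not used by its already-colored neighbors. This is a swapped-in illustration, not the argument behind Theorem 2: greedy coloring only produces some valid coloring, so the number of colors it uses is an upper bound on {\chi(G)}. All function names below are hypothetical.

```python
import random


def sample_gnp_neighbors(n, p, seed=None):
    """Sample G(n, p) and return it as a list of neighbor sets."""
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def greedy_coloring(neighbors):
    """Color vertices 0, 1, 2, ... in order with the smallest available color."""
    colors = []
    for v in range(len(neighbors)):
        # Colors already taken by neighbors that were colored earlier.
        used = {colors[u] for u in neighbors[v] if u < v}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return colors


if __name__ == "__main__":
    n = 300
    colors = greedy_coloring(sample_gnp_neighbors(n, 0.5, seed=0))
    print("colors used by greedy:", max(colors) + 1)
```

On the complete graph this procedure uses {n} colors and on a graph with no edges it uses one, matching the two extremes mentioned above.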

Regime 2: Low complexity

In this regime we consider the case {p\sim \frac{c}{n}}, so each vertex has {c} neighbors on average:

\displaystyle \mathbf{E}[\mathrm{number~of~neighbors~of~}v] = \frac{c}{n} (n-1) \sim c.

In this case we expect the graph to be much simpler than in the previous regime. If {c} is small, for example, we expect the graph to be very disconnected, with a lot of small connected pieces. It turns out that this can be made precise. The following is a somewhat informal statement of a theorem that we will prove in detail in the following lectures.

“Theorem” 3 For {v\in V}, let {C_v} be the connected component of {G(n,\frac{c}{n})} that contains {v}.

  1. If {c<1}, then {\max_{v\in V} |C_v| \approx \log n}.
  2. If {c>1}, then {\max_{v\in V} |C_v| \approx n} and all other components have size {\lesssim\log n}.
  3. If {c=1}, then {\max_{v\in V} |C_v| \approx n^{2/3}} and there are several large components.

Moreover, all {C_v} with {|C_v| \lesssim \log n} are either trees or have one cycle with high probability.
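The three cases can be explored empirically by sampling {G(n,\frac{c}{n})} and listing the sizes of its connected components. The following is a minimal simulation sketch using breadth-first search; the function names, the seed, and the values of {n} and {c} in the usage section are arbitrary choices for illustration, and a single sample at moderate {n} only hints at the asymptotic statements.

```python
import random
from collections import deque


def sample_gnp_neighbors(n, p, seed=None):
    """Sample G(n, p) and return it as a list of neighbor sets."""
    rng = random.Random(seed)
    neighbors = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def component_sizes(neighbors):
    """Sizes of all connected components, largest first (breadth-first search)."""
    n = len(neighbors)
    seen = [False] * n
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for u in neighbors[v]:
                if not seen[u]:
                    seen[u] = True
                    queue.append(u)
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    n = 1000
    for c in (0.5, 1.0, 2.0):
        sizes = component_sizes(sample_gnp_neighbors(n, c / n, seed=0))
        print(f"c={c}: largest components {sizes[:3]}")
```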

So, {G(n,\frac{c}{n})} decomposes into many disconnected, simple pieces in the subcritical case {c<1}. In the supercritical case {c>1} we get emergence of one giant component that is complex, with some remaining disconnected simple pieces. The critical case {c=1} is the most complicated. In this case there are additional phase transitions near the critical window {p\sim \frac{1}{n} \pm \frac{c}{n^{4/3}}}, with nontrivial limit distributions for the component sizes and complexities. [Note that in all these cases the Erdös-Rényi graph will have multiple connected components; the threshold for the entire graph to be connected turns out to be {p\sim\frac{\log n}{n}}.]

How do we prove such a theorem? We will use a dynamical method: start at {v}, then explore {C_v} step by step much as in a game of Minesweeper. We then analyze the hitting times of this “random walk”. One can also study Brownian scaling limits of such random walks, which provides a mechanism to define “infinite limiting objects” in this theory.
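A minimal sketch of such an exploration, under the assumption that edges are revealed lazily: a vertex is "active" once discovered but not yet explored, and at each step we explore one active vertex and test its edges to the still-undiscovered vertices, each present independently with probability {p}. Every pair of vertices is tested at most once, so this is consistent with sampling {G(n,p)} in advance. The function name and the convention of starting from vertex 0 are illustrative.

```python
import random


def explore_component(n, p, seed=None):
    """Explore the component of vertex 0 in G(n, p), revealing edges on the fly.

    Returns the walk [S_0, S_1, ..., S_T], where S_t is the number of active
    vertices after t exploration steps and T is the first time S_t hits 0.
    The size of the component of vertex 0 is then T = len(walk) - 1.
    """
    rng = random.Random(seed)
    unexplored = set(range(1, n))  # vertices not yet discovered
    active = [0]                   # discovered but not yet explored
    walk = [1]
    while active:
        active.pop()  # the vertex being explored leaves the active set: -1
        # Each undiscovered vertex is a neighbor independently with probability p.
        found = [u for u in unexplored if rng.random() < p]
        unexplored.difference_update(found)
        active.extend(found)       # + (number of newly discovered vertices)
        walk.append(len(active))
    return walk


if __name__ == "__main__":
    n = 2000
    for c in (0.5, 2.0):
        walk = explore_component(n, c / n, seed=1)
        print(f"c={c}: component of vertex 0 has size {len(walk) - 1}")
```

Each step decreases the walk by one and increases it by a binomial number of new vertices whose mean is close to {c} early in the exploration, which suggests why {c=1} separates the two behaviors.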

Many thanks to Patrick Rebeschini for scribing this lecture!

01. March 2013 by Ramon van Handel