*This post was translated from its original French version: **Probabilités – Partie 2**.*

After the previous post about probability theory, here’s the second part, in which I’ll talk about random variables.

The idea of random variables is to have some way of dealing with events for which we do not know exactly what will happen (for instance, we roll a die), but for which we still want to have some idea of what can happen. The die example is pretty simple, so using random variables may be a bit overkill, but let’s keep the examples simple for now.

For a given experiment, we consider a variable, called $X$, and look at all the values it can reach with the associated probability. If my experiment is “rolling a die and looking at its value”, I can define a random variable on the value of a 6-sided die and call it $X$. For a full definition of $X$, I need to provide all the possible values of $X$ (what we call the random variable’s *domain*) and their associated probabilities. For a 6-sided die, the values are the numbers from 1 to 6; for a non-loaded die, the probabilities are all equal to $\frac{1}{6}$. We can write that as follows:

$$\forall x \in \{1, 2, 3, 4, 5, 6\}, \quad P(X = x) = \frac{1}{6}$$

and read “for all $x$ in the set of values $\{1, 2, 3, 4, 5, 6\}$, the probability that $X$ takes the value $x$ equals $\frac{1}{6}$”.
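As a small illustration (the snippet and its names are my own, not from the original post), this distribution can be written down exactly in Python using `fractions.Fraction` to keep the probabilities exact:

```python
from fractions import Fraction

# Distribution of X, the value of a fair 6-sided die:
# every face from 1 to 6 has probability 1/6.
distribution = {x: Fraction(1, 6) for x in range(1, 7)}

# Sanity check: the probabilities over the whole domain sum to 1.
assert sum(distribution.values()) == 1
```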

One of the basic ways to have an idea about the behaviour of a random variable is to look at its *expectation*. The expectation of a random variable can be seen as its average value, or as “suppose I roll my die 10000 times, and I average all the results (summing all the results and dividing by 10000), what result would I typically get?”

This expectation (written $E[X]$) can be computed with the following formula:

$$E[X] = \sum_{x \in D} P(X = x) \cdot x$$

which can be read as “sum, for all elements $x$ in the domain $D$ of $X$, of the probability that $X$ takes the value $x$, times $x$”. In the die example, since the domain is all the integer numbers from 1 to 6, I can write

$$E[X] = \sum_{x = 1}^{6} P(X = x) \cdot x$$

which I can in turn expand as follows:

$$E[X] = P(X = 1) \cdot 1 + P(X = 2) \cdot 2 + P(X = 3) \cdot 3 + P(X = 4) \cdot 4 + P(X = 5) \cdot 5 + P(X = 6) \cdot 6$$

Since, for my die, all the probabilities are equal to $\frac{1}{6}$, I can conclude with

$$E[X] = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{21}{6} = 3.5$$

So the average value of a die over a large number of experiments is 3.5, as most tabletop gamers would know 😉
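That “roll 10000 times and average” description can be checked directly by simulation. Here is a quick Monte Carlo sketch in Python (the function name, roll count, and seed are my own choices):

```python
import random

def average_die_roll(n_rolls: int = 10_000, seed: int = 0) -> float:
    """Roll a fair 6-sided die n_rolls times and average the results."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

print(average_die_roll())  # typically close to 3.5
```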

Now let’s look at a slightly more complicated example. Suppose that I have $n$ dice, and that I want to know how many 6s I can expect among these $n$ dice. From a handwavy point of view, we know that we will not get an exact answer every time we roll the dice, but that we can get a rough one. There’s no reason there should be more or fewer 6s than 1s, 2s, 3s, 4s or 5s, so generally speaking the dice should be distributed approximately equally among the 6 numbers, and there should be approximately $\frac{n}{6}$ 6s over $n$ dice. (The exception to that being me playing Orks in Warhammer 40k, in which case the expected number is approximately 3 6s over 140 dice.) Let us prove that intuition properly.

I define $Y$ as the random variable representing the number of 6s over $n$ dice. The domain of $Y$ is all the numbers from 0 to $n$. It’s possible to compute the probability of having, for example, exactly 3 6s over $n$ dice, and even to get a general formula for $k$ 6s over $n$ dice, but I’m way too lazy to compute all that and sum over everything, and so on. So let’s be clever.

There’s a very neat trick called *linearity of expectation* that says that the expectation of the sum of several random variables is equal to the sum of the expectations of said random variables, which we write

$$E[A + B] = E[A] + E[B]$$

This is true for all random variables $A$ and $B$. Beware, though: it’s only true in general for addition. We cannot say in general that $E[A \times B] = E[A] \times E[B]$: that equality does hold if the variables are independent, but it’s not true in general.
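Both claims can be checked exactly for dice, again with `fractions.Fraction` to avoid floating-point noise (this snippet is my own illustration; for the multiplication counterexample I take $B = A$, which is clearly not independent of $A$):

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die

# Expectations of two fair dice A and B.
e_a = sum(p * x for x in faces)  # 7/2
e_b = e_a

# E[A + B], computed directly over all 36 equally likely pairs.
e_sum = sum(Fraction(1, 36) * (a + b) for a, b in product(faces, faces))
assert e_sum == e_a + e_b  # linearity of expectation: 7 = 7/2 + 7/2

# Counterexample for multiplication: B = A is not independent of A.
e_a_times_a = sum(p * x * x for x in faces)  # E[A*A] = 91/6
assert e_a_times_a != e_a * e_a              # 91/6 != 49/4
```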

Now we’re going to define $n$ variables, called $Y_1, Y_2, \dots, Y_n$, so that $Y$ is the sum of all these variables. We can define, for each variable $Y_i$, the domain $\{0, 1\}$, and we say that $Y_1$ is equal to 1 if and only if die number 1 shows a 6. The other variables are defined similarly, one for each die. Since I have $n$ variables, which take value 1 when their associated die shows a 6, I can write

$$Y = Y_1 + Y_2 + \dots + Y_n$$

This is where I use linearity of expectation:

$$E[Y] = E[Y_1 + Y_2 + \dots + Y_n] = E[Y_1] + E[Y_2] + \dots + E[Y_n]$$

The main trick here is that the variables $Y_i$ are much simpler to deal with than $Y$. With probability $\frac{1}{6}$, they take value 1; with probability $\frac{5}{6}$, they take value 0. Consequently, the expectation of $Y_i$ is also much easier to compute:

$$E[Y_i] = \frac{1}{6} \cdot 1 + \frac{5}{6} \cdot 0 = \frac{1}{6}$$

Plugging that into the previous result, we get the expectation of $Y$:

$$E[Y] = E[Y_1] + E[Y_2] + \dots + E[Y_n] = n \cdot \frac{1}{6} = \frac{n}{6}$$

which is the result we expected.
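We can also sanity-check by simulation that the average number of 6s over $n$ dice comes out around $n/6$ (a sketch of my own; I pick 600 dice so the expected count is a round 100):

```python
import random

def count_sixes(n_dice: int, rng: random.Random) -> int:
    """Roll n_dice fair dice and count how many show a 6."""
    return sum(rng.randint(1, 6) == 6 for _ in range(n_dice))

rng = random.Random(42)
n_dice, trials = 600, 2000
avg_sixes = sum(count_sixes(n_dice, rng) for _ in range(trials)) / trials
print(avg_sixes)  # close to n_dice / 6 = 100
```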

Now, my examples are pretty simple, but we can use these kinds of tools in much more complicated situations. And there’s a fair amount of other tools that allow us to estimate things around random variables, and to get a fairly good idea of what’s happening… even if we involve dice in the process.
