Discrete random variables are those that take a finite or countably infinite number of values. On this page, we study their main characteristics. Two functions are examined: the probability mass function and the cumulative distribution function. Two important parameters, the expected value and the variance, are also covered.
Probability mass function
The probability mass function (pmf for short) of a discrete random variable `X`, denoted `p(x)`, is defined as:
`p(x)=P(X=x)`(1)
Assume that `X` can take the values `x_1,\ x_2,\ ...\ ,\ x_i,\ ...\ ,\ x_n`; then:
- if `x=x_i` : `p(x_i)=P(X=x_i)!=0`
- else `p(x)=P(X=x)=0`
Characteristics of probability mass function
From the discussion above, we can note that:
- `0<=p(x)<=1`
- `sum_(i=1)^n p(x_i)=1`
Example 1 : Rolling a fair (homogeneous) die. Define `X` as the number of dots (pips) on the upper face. Then:
- `p(2)=P(X=2)=1//6`
- `p(5)=P(X=5)=1//6`
- `p(1.25)=P(X=1.25)=0`
- `p(8)=P(X=8)=0`
We can use Table 1 to show the probability mass function.
| `x` | 1 | 2 | 3 | 4 | 5 | 6 |
| `p(x)` | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
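Table 1 can also be written as a small lookup structure. The following is a minimal Python sketch, not part of the original example: the names `die_pmf` and `p` are chosen here for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# pmf of a fair six-sided die, stored as {value: probability} (Table 1)
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def p(x):
    """Return P(X = x); any x that is not a possible value has probability 0."""
    return die_pmf.get(x, Fraction(0))

print(p(2))                    # 1/6
print(p(1.25))                 # 0
print(p(8))                    # 0
print(sum(die_pmf.values()))   # 1
```

The last line checks the second characteristic of a pmf: the probabilities over all possible values sum to 1.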
Example 2 : Rolling two homogeneous dice. Define `X` as the sum of pips on the upper faces.
As we have already discussed, `X` is a random variable defined on a sample space of 36 elements. The values of `X` are the integers from 2 to 12.
To calculate `p(x)`, we find the event `"E"_x` associated with `x`, then calculate `P("E"_x)`. For example, take `x=5`.
Then : `"E"_5 = {(1,4),\ (2,3),\ (3,2),\ (4,1)}`
So : `p(5)=P(X=5)=P("E"_5)=4//36`
Calculating similarly for the other values of `X`, we obtain Table 2.
| `x` | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| `p(x)` | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |
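The counting behind Table 2 can be reproduced by enumerating the sample space. This is a sketch under the assumption that the two dice are fair and independent, so all 36 ordered outcomes are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes give each sum, then divide by the size of the sample space
counts = Counter(a + b for a, b in outcomes)
two_dice_pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

# The event E_5: all outcomes whose sum is 5
event_5 = [o for o in outcomes if sum(o) == 5]
print(event_5)           # [(1, 4), (2, 3), (3, 2), (4, 1)]
print(two_dice_pmf[5])   # 1/9, which is 4/36 in lowest terms
```

Note that `Fraction` reduces automatically, so 4/36 is displayed as 1/9; the value is the same as in the table.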
Cumulative distribution function
In general, the cumulative distribution function (cdf) of a random variable, denoted `F(x)`, is defined as:
`F(x)=P(X<=x)`(2)
That is, `F(x)` is the probability that the random variable `X` is less than or equal to `x`.
For a discrete random variable, (2) becomes:
`F(x)=sum_(x_i<=x) p(x_i)`(3)
Note that:
- `x` is any real number, that is, `x` belongs to the interval (`-oo,\ oo`),
- for a discrete random variable, `F(x)` is a step function: it is discontinuous, with jump discontinuities (discontinuities of the first kind) at the values `x_i`.
Characteristics of cumulative distribution functions of discrete random variables
- `0<=F(x)<=1`
- `F(x)` is a non-decreasing function
- `F(-oo)=0`
- `F(oo)=1`
- `P(a<X<=b)=F(b)-F(a)` for `a<b`
Example 3 : We continue Example 2 with the cumulative distribution function. To calculate `F(x)`, we use formula (3). For example:
`F(5)=p(2)+p(3)+p(4)+p(5)=1/36 + 2/36 + 3/36 + 4/36=10/36`
Calculating similarly for the other values of `X`, we obtain Table 3.
| `x` | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| `p(x)` | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |
| `F(x)` | 1/36 | 3/36 | 6/36 | 10/36 | 15/36 | 21/36 | 26/36 | 30/36 | 33/36 | 35/36 | 36/36 |
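Formula (3) translates directly into a short function. In the sketch below, the pmf of Table 2 is generated with the compact expression `(6 - |x - 7|)/36`, which is a convenience introduced here (it reproduces the table values but does not appear in the text above); the function name `F` mirrors the notation.

```python
from fractions import Fraction

# pmf of the sum of two fair dice (Table 2), written as p(x) = (6 - |x - 7|) / 36
pmf = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

def F(x):
    """Cumulative distribution function: sum of p(x_i) over all x_i <= x."""
    return sum((q for xi, q in pmf.items() if xi <= x), Fraction(0))

print(F(5))          # 5/18, which is 10/36 in lowest terms
print(F(5.5))        # 5/18: F stays constant between consecutive values of X
print(F(1))          # 0
print(F(12))         # 1
print(F(5) - F(3))   # 7/36 = p(4) + p(5) = P(3 < X <= 5)
```

Because `F` accepts any real `x`, it also shows the step behaviour: the value only changes when `x` passes one of the integers from 2 to 12.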
Expected value & Variance
Expected value
For a discrete random variable `X`, the expected value, denoted `mu` or `E(X)`, is defined as:
`mu=E(X)=sum_(x_i) x_i\ p(x_i)`(4)
If we compare (4) with the normalized weighted mean (where the weights satisfy `sum_(i=1)^n w_i=1`):
`bar x=sum_(i=1)^n w_i\ x_i`
we see that the expected value is a normalized weighted mean in which the pmf `p(x)` plays the role of the normalized weights.
Variance
In general, the variance, denoted `sigma^2` or `V(X)`, is defined as:
`sigma^2=V(X)=E[(X-mu)^2]`(5)
For a discrete random variable, the variance is calculated by:
`sigma^2=sum_(x_i) (x_i-mu)^2p(x_i)=sum_(x_i) x_i^2p(x_i)\ -\ mu^2`(6)
Example 4 : We continue Example 2 with the expected value and the variance.
To determine the expected value, we extend Table 2 to obtain Table 4.
| `x` | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| `p(x)` | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |
| `xp(x)` | 2/36 | 6/36 | 12/36 | 20/36 | 30/36 | 42/36 | 40/36 | 36/36 | 30/36 | 22/36 | 12/36 |
From this result, we calculate the expected value:
`mu=sum_(x_i) x_ip(x_i)=(2+6+12+20+30+42+40+36+30+22+12)/36=252/36=7`
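The same sum can be checked numerically. This is a minimal sketch that applies formula (4) to the pmf of Table 2; the compact expression used to build `pmf` is a convenience assumed here, chosen because it reproduces the table values.

```python
from fractions import Fraction

# pmf of the sum of two fair dice (Table 2)
pmf = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

# Expected value, formula (4): each value weighted by its probability
mu = sum(x * q for x, q in pmf.items())
print(mu)  # 7
```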
To determine the variance, we extend Table 2 and obtain Table 5.
| `x` | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| `p(x)` | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |
| `(x-mu)^2` | 25 | 16 | 9 | 4 | 1 | 0 | 1 | 4 | 9 | 16 | 25 |
| `(x-mu)^2p(x)` | 25/36 | 32/36 | 27/36 | 16/36 | 5/36 | 0 | 5/36 | 16/36 | 27/36 | 32/36 | 25/36 |
Therefore :
`sigma^2=sum_(x_i) (x_i-mu)^2p(x_i)=(25+32+27+16+5+0+5+16+27+32+25)/36=210/36~~5.83`
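Formula (6) gives two equivalent ways of computing the variance. The sketch below evaluates both on the pmf of Table 2 and confirms that they agree; the variable names are illustrative, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction

# pmf of the sum of two fair dice (Table 2)
pmf = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

mu = sum(x * q for x, q in pmf.items())

# First form of (6): probability-weighted squared deviations from the mean
var_definition = sum((x - mu) ** 2 * q for x, q in pmf.items())

# Second form of (6): mean of the squares minus the square of the mean
var_shortcut = sum(x ** 2 * q for x, q in pmf.items()) - mu ** 2

print(var_definition)                    # 35/6, which is 210/36 in lowest terms
print(var_shortcut)                      # 35/6
print(round(float(var_definition), 2))   # 5.83
```

The second form is often more convenient by hand, since it avoids computing each deviation `x_i - mu` separately.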