2uData.com

Measures of dispersion

In many cases, a measure of central tendency alone is not enough to characterize a variable. In general, we also need at least one measure of dispersion. On this page, we examine two such quantities, variance and standard deviation, and briefly look at other measures.

Variance & Standard deviation

 

For a population of `N` elements, variance is calculated by:

`sigma^2=(sum_(i=1)^N (x_i-mu)^2)/N`(8)

For a sample of `n` elements, variance is determined by:

`s^2=(sum_(i=1)^n (x_i-bar x)^2)/(n-1)`(9)

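In formulae (8) and (9):

  • `N` and `n` are the numbers of elements in the population and in the sample,
  • `x_i` is the value of the `i`-th element of the population or sample,
  • `mu` is the mean of the population and `bar x` is the mean of the sample,
  • `sigma` is the standard deviation of the population,
  • `s` is the standard deviation of the sample,
  • `n-1` is the number of degrees of freedom.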

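The standard deviation (`sigma` or `s`) is the square root of the variance. Hence, we can interpret the standard deviation as a typical distance between the elements and their mean (`mu` or `bar x`); strictly, it is a root-mean-square deviation rather than a simple average of the differences.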
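As an illustration, formulae (8) and (9) can be sketched in Python as below. The function names and the data values are made up for this example and are not part of the original page; the standard deviation is taken as the square root of the variance.

```python
import math


def population_variance(values):
    """Formula (8): squared deviations from the mean mu, divided by N."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N


def sample_variance(values):
    """Formula (9): squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

sigma2 = population_variance(data)  # data treated as a whole population
s2 = sample_variance(data)          # same data treated as a sample
sigma = math.sqrt(sigma2)
s = math.sqrt(s2)

print(sigma2, sigma)  # 4.0 2.0
print(s2, s)          # slightly larger, because of the n - 1 divisor
```

Python's standard `statistics` module offers the same quantities as `pvariance`, `variance`, `pstdev` and `stdev`, which is usually preferable to hand-written functions.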

To simplify the manual calculation of variance and standard deviation, the sum of squared deviations in formula (9) can be computed with the identity:

`sum_(i=1)^n (x_i-bar x)^2=sum_(i=1)^n x_i^2-1/n(sum_(i=1)^n x_i)^2`(10)
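Formula (10) can be checked numerically with a short Python sketch. The function names and the data are hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
def sum_sq_dev_direct(values):
    """Left-hand side of formula (10): sum of (x_i - x_bar)^2."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values)


def sum_sq_dev_shortcut(values):
    """Right-hand side of formula (10): sum of x_i^2 minus (sum of x_i)^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


data = [3, 5, 8, 10, 14]  # hypothetical values, mean = 8

direct = sum_sq_dev_direct(data)      # 25 + 9 + 0 + 4 + 36
shortcut = sum_sq_dev_shortcut(data)  # 394 - 1600 / 5

print(direct, shortcut)  # 74.0 74.0
```

The shortcut is convenient on paper, but in floating-point arithmetic it can lose precision when the mean is large compared with the spread of the data, so the direct form is generally the safer choice in software.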

Rounding rule for standard deviation

The rounding rule for the standard deviation is the same as that for the mean: the standard deviation should be rounded to one more decimal place than the value with the fewest decimal places in the original data.

For example, the standard deviation of the sample consisting of the numbers
  1.1 ; 2.22 ; 3.333 ; 4.4444 ; and 5.55555
  is 1.76, because the least precise value (1.1) has one decimal place.
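The rounding rule can be sketched in Python as follows, using the example sample written with decimal points. Keeping the raw values as strings is an assumption of this sketch: it is one simple way to count the decimal places as they were originally recorded.

```python
import statistics
from decimal import Decimal

# Raw values kept as strings so the recorded decimal places are preserved.
raw = ["1.1", "2.22", "3.333", "4.4444", "5.55555"]

# Number of decimal places of each value (0 for whole numbers).
places = [max(0, -Decimal(v).as_tuple().exponent) for v in raw]

# Rounding rule: one more decimal place than the least precise value.
digits = min(places) + 1

s = statistics.stdev(float(v) for v in raw)  # sample standard deviation
rounded = round(s, digits)

print(digits, rounded)  # 2 1.76
```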

Symbols for parameters and statistics

To distinguish between parameters (for a population) and statistics (for a sample), we use the symbols in Table 1.

Table 1 Symbols for parameters and statistics
Quantity             For population   For sample
Mean                 `mu`             `bar x`
Proportion           `pi`             `p`
Variance             `sigma^2`        `s^2`
Standard deviation   `sigma`          `s`

Other measures of dispersion

 

Range

Range is the difference between the highest and the lowest values of a variable:

`R=x_max-x_min`(11)
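A minimal Python illustration of formula (11), with made-up data:

```python
data = [12, 7, 3, 18, 9]  # hypothetical values

x_max = max(data)
x_min = min(data)
R = x_max - x_min  # formula (11)

print(R)  # 15
```

Because the range depends only on the two extreme values, a single unusually large or small observation changes it directly.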

Coefficient of variation

This quantity is used to compare the variation of different data sets, especially when their values differ greatly in magnitude or are measured in different units.

Coefficient of variation of a numerical variable is defined as the ratio between its standard deviation and mean.

+ For a population: `CV=sigma/mu`(12)
+ For a sample: `CV=s/(bar x)`(13)
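Formulae (12) and (13) can be sketched in Python with the standard `statistics` module. The function name `coefficient_of_variation`, its `population` flag, and the two data sets are hypothetical, chosen to show a comparison between variables on different scales.

```python
import statistics


def coefficient_of_variation(values, population=False):
    """Formula (12) if population=True, otherwise formula (13)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    if population:
        sd = statistics.pstdev(values)  # sigma
    else:
        sd = statistics.stdev(values)   # s
    return sd / mean


heights_cm = [160, 165, 170, 175, 180]  # hypothetical sample
weights_kg = [50, 60, 70, 80, 90]       # hypothetical sample

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# The CV is a unitless ratio, so the two variables can be compared directly.
print(cv_heights, cv_weights)
```

In this made-up example the weights vary more, relative to their mean, than the heights do, even though the two variables are measured in different units.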




This web page was last updated on 01 December 2018.