
Simple linear regression

Linear regression is widely used because it is simple, easy to apply, and fits many phenomena well. On this page, we consider the simple case with two variables.

Consider two variables `X` and `Y`, and `n` pairs of data (`x_i`, `y_i`). We need to find the two coefficients `a` and `b` of the line `y=ax+b` that satisfies the least squares condition (Fig. 1).

Fig. 1 Illustration of linear regression

Using (2), `SS_E` is calculated as:

`SS_E=sum_(i=1)^n (y_i-ax_i-b)^2`   (6)
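
Equation (6) can be evaluated directly for any candidate line. The following is a minimal Python sketch; the function name `sum_squared_errors` and the three sample points are illustrative assumptions, not part of the original text.

```python
def sum_squared_errors(xs, ys, a, b):
    """Return SS_E = sum((y_i - a*x_i - b)^2), as in equation (6)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return sum((y - a * x - b) ** 2 for x, y in zip(xs, ys))


# Hypothetical data lying exactly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

print(sum_squared_errors(xs, ys, 2.0, 1.0))  # the exact line: 0.0
print(sum_squared_errors(xs, ys, 2.0, 0.0))  # every residual is 1: 3.0
```

The least squares line is the pair (`a`, `b`) for which this value is smallest.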

Coefficients `a` and `b` are determined by setting the partial derivatives of `SS_E` to zero, which gives the system of equations:

`(partial SS_E)/(partial a)=-2sum_(i=1)^n (y_i-ax_i-b)x_i=0`   (7)
`(partial SS_E)/(partial b)=-2sum_(i=1)^n (y_i-ax_i-b)=0`   (8)

Rearranging this system of equations gives:

`nb+asum_(i=1)^n x_i=sum_(i=1)^n y_i`   (9)
`bsum_(i=1)^n x_i+asum_(i=1)^n x_i^2=sum_(i=1)^n x_iy_i`   (10)

Solving the system of equations (9) and (10), we get:

`a=(nsum xy-sum x sum y)/(nsum x^2-(sum x)^2)`   (11)
`b=(sum y sum x^2-sum x sum xy)/(n sum x^2-(sum x)^2)`   (12)
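
Formulas (11) and (12) translate almost line for line into code. The sketch below is one possible Python rendering; the function name `fit_line` and the test points are assumptions chosen for illustration. The guard on the denominator is an addition: when all `x` values are equal, `n sum x^2-(sum x)^2` is zero and the slope is undefined.

```python
def fit_line(xs, ys):
    """Return (a, b) for y = a*x + b using formulas (11) and (12)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Shared denominator of (11) and (12)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    a = (n * sum_xy - sum_x * sum_y) / denom       # formula (11)
    b = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # formula (12)
    return a, b


# Hypothetical points lying exactly on y = 2x + 1
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)  # 2.0 1.0
```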

Example

The heights and weights of 10 students are shown in Table 1.

Table 1 Heights and weights of 10 students
Student 1 2 3 4 5 6 7 8 9 10
Height (m) 1,57 1,62 1,58 1,64 1,74 1,68 1,71 1,60 1,66 1,64
Weight (kg) 55 72 52 65 82 79 78 66 78 71

Let `X` denote height and `Y` denote weight. To find the linear regression equation for `X` and `Y`, we calculate the components of equations (11) and (12). The results are shown in Table 2.

Table 2 Components of equations (11) and (12) calculated from the data
Student 1 2 3 4 5 6 7 8 9 10 `sum`
`x` 1,57 1,62 1,58 1,64 1,74 1,68 1,71 1,60 1,66 1,64 16,44
`y` 55 72 52 65 82 79 78 66 78 71 698
`x^2` 2,4649 2,6244 2,4964 2,6896 3,0276 2,8224 2,9241 2,56 2,7556 2,6896 27,0546
`xy` 86,35 116,64 82,16 106,6 142,68 132,72 133,38 105,6 129,48 116,44 1152,05
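
The sums in the last column of Table 2 can be reproduced with a few lines of Python. This is only a sketch: the variable names are assumptions, and the values are written with decimal points where the tables use decimal commas.

```python
# Data from Table 1 (height in m, weight in kg)
heights = [1.57, 1.62, 1.58, 1.64, 1.74, 1.68, 1.71, 1.60, 1.66, 1.64]
weights = [55, 72, 52, 65, 82, 79, 78, 66, 78, 71]

sum_x = sum(heights)
sum_y = sum(weights)
sum_x2 = sum(x * x for x in heights)
sum_xy = sum(x * y for x, y in zip(heights, weights))

# Rounded for display; the values should agree with the sum column
# of Table 2 up to floating-point rounding.
print(round(sum_x, 2), sum_y, round(sum_x2, 4), round(sum_xy, 2))
```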

Using formulae (11) and (12), we calculate the coefficients `a` and `b`:

  `a=((10xx1152,05)-(16,44xx698))/((10xx27,0546)-(16,44)^2)=166,593`

  `b=((698xx27,0546)-(16,44xx1152,05))/((10xx27,0546)-(16,44)^2)=-204,079`

Therefore, the relationship between the height and weight of the 10 students can be represented by the formula:

  `y=166,593x-204,079`
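
As a cross-check, the whole example can be recomputed from the raw data in Table 1. The Python sketch below applies formulas (11) and (12) and then verifies that the fitted line satisfies conditions (7) and (8), that is, that the residuals and the residuals weighted by `x` both sum to approximately zero. The variable names and the tolerance `1e-6` are assumptions made for this illustration.

```python
# Data from Table 1 (height in m, weight in kg), with decimal points
heights = [1.57, 1.62, 1.58, 1.64, 1.74, 1.68, 1.71, 1.60, 1.66, 1.64]
weights = [55, 72, 52, 65, 82, 79, 78, 66, 78, 71]

n = len(heights)
sum_x = sum(heights)
sum_y = sum(weights)
sum_x2 = sum(x * x for x in heights)
sum_xy = sum(x * y for x, y in zip(heights, weights))

denom = n * sum_x2 - sum_x ** 2
a = (n * sum_xy - sum_x * sum_y) / denom       # formula (11)
b = (sum_y * sum_x2 - sum_x * sum_xy) / denom  # formula (12)

# Should print a line close to y = 166.593x -204.079
print(f"y = {a:.3f}x {b:+.3f}")

# Conditions (7) and (8): both sums vanish at the least squares solution
residuals = [y - (a * x + b) for x, y in zip(heights, weights)]
sum_res = sum(residuals)
sum_res_x = sum(r * x for r, x in zip(residuals, heights))
print(abs(sum_res) < 1e-6, abs(sum_res_x) < 1e-6)
```

Note that the denominator is a small difference of two large, nearly equal numbers (about 0,27 against about 270), so rounding the intermediate sums too early would noticeably change `a` and `b`.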





This web page was last updated on 03 December 2018.