Brane Space: An Introduction To Numerical Analysis (5): Least Squares Approximation

In least squares approximation the task is to fit a straight line:

y = mx + b

to a series of experimentally observed points, e.g.

(x₁, y₁), (x₂, y₂), (x₃, y₃), ..., (x_n, y_n).

For each of the observed values of x there are actually two values of y: the observed value y_obs and the value predicted by the straight line, m x_obs + b. We call the difference y_obs - (m x_obs + b) a deviation. Each such deviation measures the amount by which the predicted value falls short of the observed value. Then the set of all such deviations, e.g.

D₁ = y₁ - (m x₁ + b), D₂ = y₂ - (m x₂ + b),

D₃ = y₃ - (m x₃ + b), ..., D_n = y_n - (m x_n + b)

gives an indication of the closeness of fit of the line y = mx + b to the data. For example, in the case of the graph shown below:

We have a graph in the form of F = mA + b, where F is the given frequency for a solar flare in a sunspot region and A is the area associated with the region. The defined line then represents the best fit to the assorted data: (A₁, F₁), (A₂, F₂), (A₃, F₃), ..., (A_n, F_n).

We say the line shown is a perfect fit if and only if all of the deviations are zero, i.e. D₁ = 0, D₂ = 0, D₃ = 0, ..., D_n = 0.
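The deviations and the perfect-fit test can be sketched in a few lines of Python. This is only an illustration: the function names deviations and is_perfect_fit, the tolerance argument, and the three sample points are assumptions made here, not part of the original discussion.

```python
from typing import List, Sequence


def deviations(xs: Sequence[float], ys: Sequence[float],
               m: float, b: float) -> List[float]:
    """Return D_i = y_i - (m*x_i + b) for every observed point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


def is_perfect_fit(xs: Sequence[float], ys: Sequence[float],
                   m: float, b: float, tol: float = 1e-12) -> bool:
    """True when every deviation is zero (within a small tolerance)."""
    return all(abs(d) <= tol for d in deviations(xs, ys, m, b))


# Hypothetical sample data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

print(deviations(xs, ys, 2.0, 1.0))      # every deviation is zero
print(is_perfect_fit(xs, ys, 2.0, 1.0))  # True
print(deviations(xs, ys, 1.0, 1.0))      # some deviations are non-zero
print(is_perfect_fit(xs, ys, 1.0, 1.0))  # False
```

With real measurements a perfect fit almost never occurs, which is why a criterion for the "best" imperfect line is needed.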

The problem then is to find the line which best fits a given set of data.

In general, for a straight line which comes close to fitting all of the observed points, some of the Ds will be positive and some negative. However, the squares (D²) are never negative, so we form the sum:

f(m,b) = (y₁ - m x₁ - b)² + (y₂ - m x₂ - b)² + ... + (y_n - m x_n - b)²

This sum of the squares of the deviations depends on the choice of m and b, but is never negative and can only be zero if m and b have values which produce a straight line that is a perfect fit. The method of least squares then says in effect: take as the line y = mx + b of best fit that for which the sum of squares of the deviations:

f(m,b) = D₁² + D₂² + D₃² + ... + D_n²

is a minimum. This means solving the equations:

∂f/∂m = 0, ∂f/∂b = 0
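For a general data set, carrying out the two partial derivatives gives the pair of linear equations m Σx² + b Σx = Σxy and m Σx + n b = Σy, which can be solved directly for m and b. The Python sketch below does exactly that. The function name fit_line and the sample points are assumptions chosen for illustration only.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (m, b) minimising the sum of squared deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sx = sum(xs)                                # sum of x
    sy = sum(ys)                                # sum of y
    sxx = sum(x * x for x in xs)                # sum of x^2
    sxy = sum(x * y for x, y in zip(xs, ys))    # sum of x*y

    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = (n * sxy - sx * sy) / denom   # from eliminating b between the two equations
    b = (sy - m * sx) / n             # from the equation df/db = 0
    return m, b


# Hypothetical sample data lying exactly on y = 2x + 1.
m, b = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
print(m, b)
```

The guard on denom reflects the one case where no unique line exists: when every observed x is the same, the two equations are no longer independent.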

Example problem: Find the straight line that best fits the points:

(0, 1), (1, 3), (2, 2), (3, 4), (4, 5)

Using the method of least squares.

Solution:

We proceed by first compiling the table below with relevant inputs:

Then:

f(m,b) = Σ(D²) = 55 - 30b + 5b² - 78m + 20mb + 30m²

∂f/∂m = -78 + 20b + 60m

∂f/∂b = -30 + 10b + 20m
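The coefficients in f(m,b) come from expanding each squared deviation and collecting terms: the constant is Σy², the b term is -2Σy, the b² term is n, the m term is -2Σxy, the mb term is 2Σx and the m² term is Σx². A short Python check on the five example points, with variable names chosen here purely for illustration, reproduces them:

```python
# Data points from the example problem.
points = [(0, 1), (1, 3), (2, 2), (3, 4), (4, 5)]

n = len(points)
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
sum_xx = sum(x * x for x, _ in points)
sum_xy = sum(x * y for x, y in points)
sum_yy = sum(y * y for _, y in points)

# Coefficients of f(m, b) after expanding every (y - m*x - b)^2 term.
coefficients = {
    "constant": sum_yy,   # 55
    "b": -2 * sum_y,      # -30
    "b^2": n,             # 5
    "m": -2 * sum_xy,     # -78
    "mb": 2 * sum_x,      # 20
    "m^2": sum_xx,        # 30
}
print(coefficients)
```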

The values of m and b for which f(m,b) has a minimum must satisfy the simultaneous equations:

∂f/∂m = 0, i.e. 20b + 60m = 78

∂f/∂b = 0, i.e. 10b + 20m = 30

Then solving by subtracting the bottom line from the top:

20b + 60m = 78

10b + 20m = 30

------------------

10b + 40m = 48

Subtracting the bottom line (10b + 20m = 30) once more from this result eliminates b, from which we then obtain: 20m = 18 or m = 18/20 = 9/10 = 0.9

Then: 10b + 20 (9/10) = 30

Or: 10 b = 30 - 18 = 12 or b = 12/10 = 1.2
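As a check on the elimination, the same pair of equations can be solved with exact fractions in Python. This sketch uses Cramer's rule rather than the subtraction steps shown above, and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Simultaneous equations from the worked example:
#   60m + 20b = 78
#   20m + 10b = 30
a11, a12, c1 = Fraction(60), Fraction(20), Fraction(78)
a21, a22, c2 = Fraction(20), Fraction(10), Fraction(30)

det = a11 * a22 - a12 * a21          # 600 - 400 = 200
m = (c1 * a22 - a12 * c2) / det      # (780 - 600) / 200
b = (a11 * c2 - c1 * a21) / det      # (1800 - 1560) / 200

# Sum of squared deviations for the example points at this (m, b).
points = [(0, 1), (1, 3), (2, 2), (3, 4), (4, 5)]
residual = sum((y - (m * x + b)) ** 2 for x, y in points)

print(m, b, residual)  # 9/10 6/5 19/10
```

The remaining sum of squares, 19/10, is not zero, so the line is a best fit rather than a perfect fit.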

This leads to the best fit line: y = 0.9 x + 1.2

The graph of this line, shown below, passes among the data points:

Suggested Problems:

1) Obtain the line: y = mx + b which best fits the following data points:

(0.10, 0.10), (0.20, 0.20), (0.30, 0.30), (0.40, 0.40), (0.50, 0.50)

2) Apply the method of least squares to obtain the line y = mx + b which best fits the points: (0,1), (1,2), (2,3)

3) In examining the frequency F of subflares within regions of sunspot area (A)* the following table of data is obtained:

Apply the method of least squares to obtain the line F = mA + b which best fits the points.

* In millionths of a solar hemisphere.


Wednesday, July 26, 2023
