# Weighted least-squares

Series: Geophysical References Series. Title: Problems in Exploration Seismology and their Solutions. Authors: Lloyd P. Geldart and Robert E. Sheriff. Chapter 9, pages 295-366. DOI: http://dx.doi.org/10.1190/1.9781560801733. ISBN 9781560801153.

## Problem 9.33

Find “best-fit” straight lines to the data in Table 9.33a.

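Table 9.33a. Data for problem 9.33.

| ${\displaystyle x\to }$ | 0.21 | 0.49 | 0.71 | 1.00 | 1.42 | 1.73 | 2.03 | 2.47 |
|---|---|---|---|---|---|---|---|---|
| ${\displaystyle t\to }$ | 0.51 | 1.31 | 1.54 | 2.58 | 1.79 | 2.20 | 2.76 | 2.72 |
| ${\displaystyle x\to }$ | 3.05 | 3.09 | 3.28 | 3.64 | 3.70 | 3.84 | 4.07 | 4.24 |
| ${\displaystyle t\to }$ | 4.42 | 3.25 | 3.07 | 3.50 | 3.73 | 3.63 | 3.87 | 3.88 |
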
1. First, plot the data and determine a best-fit line by eye.
2. Second, find the unweighted best-fit line by least squares (i.e., all weights equal to 1).
3. Third, find the least-squares best-fit line when the points are weighted according to their vertical distances from the line found in step 1.
4. Finally, find the least-squares best-fit line after discarding the three wildest points (weighting them zero).

### Background

To fit a straight line ${\displaystyle t=ax+b}$ to a data set such as that in Table 9.33a, we can find the constants ${\displaystyle a}$ and ${\displaystyle b}$ such that the sum of the squares of the “errors” is minimized (see also problem 9.22). An error is the difference between an observed point and that predicted by the equation. If we wish to give added weight to some data points, usually because we consider them more reliable than other values, we give the error squared the weight ${\displaystyle w_{i}}$ as in equation (9.33a). Then we write the sum of the errors squared ${\displaystyle E}$ as

 {\displaystyle {\begin{aligned}E=\mathop {\sum } \limits _{i}^{}w_{i}[(ax_{i}+b)-t_{i}]^{2},\end{aligned}}} (9.33a)

and minimize ${\displaystyle E}$ by varying ${\displaystyle a}$ and ${\displaystyle b}$. Setting the partial derivatives of ${\displaystyle E}$ with respect to ${\displaystyle a}$ and ${\displaystyle b}$ equal to zero gives

 {\displaystyle {\begin{aligned}{\frac {\partial E}{\partial a}}=2\mathop {\sum } \limits _{i}^{}w_{i}x_{i}\left[(ax_{i}+b)-t_{i}\right]=0,\end{aligned}}} (9.33b)

 {\displaystyle {\begin{aligned}{\frac {\partial E}{\partial b}}=2\mathop {\sum } \limits _{i}^{}w_{i}\left[(ax_{i}+b)-t_{i}\right]=0.\end{aligned}}} (9.33c)

We rewrite these as simultaneous equations to be solved for ${\displaystyle a}$ and ${\displaystyle b}$:

 {\displaystyle {\begin{aligned}a\mathop {\sum } \limits _{i}^{}w_{i}x_{i}^{2}+b\mathop {\sum } \limits _{i}^{}w_{i}x_{i}=\mathop {\sum } \limits _{i}^{}w_{i}x_{i}t_{i},\end{aligned}}} (9.33d)

 {\displaystyle {\begin{aligned}a\mathop {\sum } \limits _{i}^{}w_{i}x_{i}+b\mathop {\sum } \limits _{i}^{}w_{i}\;=\mathop {\sum } \limits _{i}^{}w_{i}t_{i},\end{aligned}}} (9.33e)

where ${\displaystyle \mathop {\sum } \limits {_{i}}w_{i}={\hbox{sum of the weights}}}$.
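The two simultaneous equations can be solved directly once the five weighted sums are known. The following is a minimal sketch, not part of the original solution: the function name `weighted_line_fit` and the demonstration data are made up for illustration, and the 2×2 system is solved by Cramer's rule.

```python
def weighted_line_fit(x, t, w=None):
    """Fit t = a*x + b by weighted least squares; returns (a, b)."""
    if w is None:
        w = [1.0] * len(x)  # equal weighting
    # Sums appearing in the normal equations (9.33d) and (9.33e).
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wt = sum(wi * ti for wi, ti in zip(w, t))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_wxt = sum(wi * xi * ti for wi, xi, ti in zip(w, x, t))
    # Solve the 2x2 system by Cramer's rule.
    det = s_wxx * s_w - s_wx * s_wx
    if det == 0:
        raise ValueError("x values with nonzero weight must not all coincide")
    a = (s_wxt * s_w - s_wx * s_wt) / det
    b = (s_wxx * s_wt - s_wx * s_wxt) / det
    return a, b


# Made-up points on t = 2x + 1, plus one outlier that is given zero weight.
x_demo = [0.0, 1.0, 2.0, 3.0]
t_demo = [1.0, 3.0, 5.0, 20.0]
a_demo, b_demo = weighted_line_fit(x_demo, t_demo, w=[1, 1, 1, 0])
print(a_demo, b_demo)
```

With the outlier weighted zero, the three remaining points lie exactly on the line, so the fit recovers a slope of 2 and an intercept of 1. Passing `w=None` gives the unweighted case.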

Curves other than a straight line can be fit to data sets in a similar manner. Other definitions of “best fit” can also be used. Additional constraints, for example, that the curve should pass through the origin, can also be added.
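As an illustration of such a constraint, consider a line forced through the origin (${\displaystyle b=0}$). The short derivation below is a sketch that uses the same weights and error definition as equation (9.33a); it is not part of the original solution.

```latex
\begin{aligned}
E &= \sum_{i} w_{i}\,(a x_{i} - t_{i})^{2},\\
\frac{\mathrm{d}E}{\mathrm{d}a} &= 2\sum_{i} w_{i} x_{i}\,(a x_{i} - t_{i}) = 0
\quad\Longrightarrow\quad
a = \frac{\sum_{i} w_{i} x_{i} t_{i}}{\sum_{i} w_{i} x_{i}^{2}}.
\end{aligned}
```

Only one unknown remains, so a single equation replaces the pair (9.33d) and (9.33e).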

Figure 9.33a.  Straight-line fits to the data.

### Solution

The data are plotted in Figure 9.33a and the calculations are given in Table 9.33c. The best-fit line determined by eye is shown by the dashed line; its equation is

{\displaystyle {\begin{aligned}t=1.00+0.71x.\quad \mathrm {Eye-ball\ fit} .\end{aligned}}}

The line for equal weighting, shown by the solid line, has the equation

{\displaystyle {\begin{aligned}t=1.040+0.721x.\quad \mathrm {Equal\ weighting} \quad w_{b}.\end{aligned}}}

The line giving increased weighting to data that lie closer to the eye-ball line is shown by short dashes; its equation is

{\displaystyle {\begin{aligned}t=0.989+0.700x.\quad \mathrm {Weighting\ by\ proximity\ to\ eye-ball\ line} \ w_{c}.\end{aligned}}}

If we simply throw away the three points that lie farthest from the line (${\displaystyle w_{d}=0}$ in Table 9.33b), we get the equation (not plotted)

{\displaystyle {\begin{aligned}t=1.041+0.683x.\quad \mathrm {Discarding\ three\ wild\ points,\ weights} \quad w_{d}.\end{aligned}}}

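Table 9.33b. Data and weights.

| | ${\displaystyle x_{i}}$ | ${\displaystyle t_{i}}$ | ${\displaystyle x_{i}^{2}}$ | ${\displaystyle x_{i}t_{i}}$ | ${\displaystyle w_{b}}$ | ${\displaystyle w_{c}}$ | ${\displaystyle w_{d}}$ |
|---|---|---|---|---|---|---|---|
| | 0.21 | 0.51 | 0.04 | 0.11 | 1 | 1 | 0 |
| | 0.49 | 1.31 | 0.24 | 0.64 | 1 | 5 | 1 |
| | 0.71 | 1.54 | 0.50 | 1.09 | 1 | 5 | 1 |
| | 1.00 | 2.58 | 1.00 | 2.58 | 1 | 1 | 0 |
| | 1.42 | 1.79 | 2.02 | 2.54 | 1 | 3 | 1 |
| | 1.73 | 2.20 | 2.99 | 3.81 | 1 | 5 | 1 |
| | 2.03 | 2.76 | 4.12 | 5.60 | 1 | 2 | 1 |
| | 2.47 | 2.72 | 6.10 | 6.72 | 1 | 5 | 1 |
| | 3.05 | 4.42 | 9.30 | 13.48 | 1 | 1 | 0 |
| | 3.09 | 3.25 | 9.55 | 10.04 | 1 | 4 | 1 |
| | 3.28 | 3.07 | 10.76 | 10.07 | 1 | 3 | 1 |
| | 3.64 | 3.50 | 13.25 | 12.74 | 1 | 4 | 1 |
| | 3.70 | 3.73 | 13.69 | 13.80 | 1 | 3 | 1 |
| | 3.84 | 3.63 | 14.75 | 13.94 | 1 | 4 | 1 |
| | 4.07 | 3.87 | 16.56 | 15.75 | 1 | 5 | 1 |
| | 4.24 | 3.88 | 17.98 | 16.45 | 1 | 4 | 1 |
| ${\displaystyle \mathrm {Sums} _{b}}$ | 38.97 | 44.76 | 123.13 | 129.37 | 16 | | 13 |
| ${\displaystyle \mathrm {Sums} _{c}}$ | 140.11 | 152.41 | 452.21 | 454.91 | | 55 | |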

The changes in the values of ${\displaystyle a}$ and ${\displaystyle b}$ are ${\displaystyle <5\%}$ (standard deviation 3%), so the different weighting schemes make relatively little difference in this example.
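One way to check the size of these changes is sketched below. It assumes that each change is measured relative to the eye-ball value and that the spread is the population standard deviation of the six percentage changes; the text states neither choice, so both are assumptions made for illustration.

```python
from statistics import pstdev

# Fitted (a, b) values quoted in the solution.
eye = (0.71, 1.00)
fits = {
    "equal weighting": (0.721, 1.040),
    "weighted (w_c)": (0.700, 0.989),
    "three points discarded (w_d)": (0.683, 1.041),
}

# Percentage change of each coefficient relative to the eye-ball value
# (the choice of reference is an assumption, not stated in the text).
changes = []
for name, (a, b) in fits.items():
    da = 100.0 * (a - eye[0]) / eye[0]
    db = 100.0 * (b - eye[1]) / eye[1]
    changes += [da, db]
    print(f"{name}: a {da:+.1f}%, b {db:+.1f}%")

largest = max(abs(c) for c in changes)
spread = pstdev(changes)
print(f"largest change {largest:.1f}%, spread {spread:.1f}%")
```

Under these assumptions every change stays below 5% and the spread comes out close to 3%, in line with the figures quoted above.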

Table 9.33c. Calculations.

| Fit | Equations | Result |
|---|---|---|
| Eye-ball fit | | ${\displaystyle b=1.00,\;a=0.71}$ |
| Equal weighting | ${\displaystyle 123.13a_{b}+38.97b_{b}=129.37;}$ ${\displaystyle 38.97a_{b}+16b_{b}=44.76}$ | ${\displaystyle b_{b}=1.040,\quad a_{b}=0.721}$ |
| Weighting by proximity to eye-ball line | ${\displaystyle 452.21a_{c}+140.11b_{c}=454.91;}$ ${\displaystyle 140.11a_{c}+55b_{c}=152.41}$ | ${\displaystyle b_{c}=0.989,\quad a_{c}=0.700}$ |
| Throwing away 3 wild points | ${\displaystyle 112.79a_{d}+34.71b_{d}=113.20;}$ ${\displaystyle 34.71a_{d}+13b_{d}=37.25}$ | ${\displaystyle b_{d}=1.041,\quad a_{d}=0.683}$ |
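The three pairs of equations above can be checked numerically. The sketch below takes the coefficients exactly as tabulated, rather than recomputing the sums from the data points, and solves each pair by Cramer's rule; the helper name `solve_2x2` and the dictionary keys are made up for illustration.

```python
def solve_2x2(a11, a12, a21, a22, r1, r2):
    """Solve a11*a + a12*b = r1 and a21*a + a22*b = r2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    a = (r1 * a22 - a12 * r2) / det
    b = (a11 * r2 - a21 * r1) / det
    return a, b


# Coefficients copied from the tabulated equations:
# (sum w x^2, sum w x, sum w x, sum w, sum w x t, sum w t).
systems = {
    "b": (123.13, 38.97, 38.97, 16.0, 129.37, 44.76),    # equal weighting
    "c": (452.21, 140.11, 140.11, 55.0, 454.91, 152.41),  # weights w_c
    "d": (112.79, 34.71, 34.71, 13.0, 113.20, 37.25),     # three points discarded
}

results = {key: solve_2x2(*coeffs) for key, coeffs in systems.items()}
for key, (a, b) in results.items():
    print(f"{key}: a = {a:.3f}, b = {b:.3f}")
```

The slopes and intercepts obtained this way agree with the quoted results to within about 0.001, which is the level expected from rounding in the third decimal place.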
