**The Linear Least Squares Minimization Problem**

When we conduct an experiment we usually end up with measured data from which we would like to extract some information. Frequently the task is to find whether a particular model fits the data, or which combination of model components describes the experimental data set best. This may be possible in a single minimization step or may require several cycles of successive refinement (as is often the case in crystallography).

**Example of a CD Spectrum Fit**

Let us discuss the simple case of a CD spectrum fit. In a first approximation, a CD spectrum of a protein or polypeptide can be treated as a sum of three components: α-helical, β-sheet, and random coil contributions to the spectrum. At each wavelength, the ellipticity θ of the spectrum will contain a linear combination of these components:

$$\theta_i = x_1\,\theta_{h,i} + x_2\,\theta_{s,i} + x_3\,\theta_{c,i} \qquad (1)$$

where $\theta_i$ is the total measured ellipticity at data point $i$, $\theta_{h,i}$ the contribution from helix, $\theta_{s,i}$ from sheet, $\theta_{c,i}$ from coil, and the corresponding $x$ the fraction of this contribution.
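To make eqn. (1) concrete: at a single wavelength the model is just a weighted sum of three numbers. The basis values below are invented for illustration, not real CD reference data:

```python
# Hypothetical basis ellipticities of pure helix, sheet and coil at one wavelength
theta_h, theta_s, theta_c = -30.0, -12.0, -2.0

# Assumed fractions of each secondary-structure type (they should sum to 1)
x1, x2, x3 = 0.5, 0.3, 0.2

# Equation (1): the total ellipticity is the linear combination of the components
theta = x1 * theta_h + x2 * theta_s + x3 * theta_c
print(theta)  # -15.0 - 3.6 - 0.4 = -19.0
```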

As we have three unknowns in this equation, a measurement at 3
points (different wavelengths) would suffice to solve the problem
for *x*, the fraction of each contribution to the total
measured signal. However, due to experimental error and imperfect
model data the choice of points can mightily influence the result
(a common source of subtle abuse in data interpretation). We
usually have many more data points available from our measurement
(e.g., a whole CD spectrum, sampled at 1 nm intervals from 190 to
250 nm). In this case, we can try to minimize the total deviation
between all data points and calculated model values. This is done
by minimization of the sum of residuals squared (s.r.s.), which in our case looks as follows:

$$\mathrm{s.r.s.} = \sum_{i}\left[\theta_i - \left(x_1\,\theta_{h,i} + x_2\,\theta_{s,i} + x_3\,\theta_{c,i}\right)\right]^2 \qquad (2)$$
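In code, the sum of residuals squared of eqn. (2) is essentially a one-liner. The basis spectra below are random stand-ins, not real CD data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 61  # e.g. 190-250 nm sampled at 1 nm intervals

# Made-up basis spectra for helix, sheet and coil
theta_h, theta_s, theta_c = rng.normal(size=(3, n))
# A synthetic "measured" spectrum built from known fractions plus a little noise
theta_obs = 0.5*theta_h + 0.3*theta_s + 0.2*theta_c + 0.01*rng.normal(size=n)

def srs(x):
    """Equation (2): sum over all data points of (observed - calculated)^2."""
    theta_calc = x[0]*theta_h + x[1]*theta_s + x[2]*theta_c
    return np.sum((theta_obs - theta_calc)**2)

print(srs([0.5, 0.3, 0.2]))   # small: close to the true fractions
print(srs([1.0, 0.0, 0.0]))   # much larger: a poor model
```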

A function has an extremum where its derivative is zero, i.e. in the one-dimensional case where the tangent to the graph is horizontal (zero slope, *dy/dx = 0*). The little icon on top shows a two-dimensional example, with a horizontal tangential plane defining the minimum of *f(x,y)*. A minimum of our error function (i.e. the *least* error) can be found by setting the 3 partial derivatives to zero:

$$\frac{\partial\,\mathrm{s.r.s.}}{\partial x_1} = 0, \qquad \frac{\partial\,\mathrm{s.r.s.}}{\partial x_2} = 0, \qquad \frac{\partial\,\mathrm{s.r.s.}}{\partial x_3} = 0 \qquad (3)$$

Equation (2) is easy to differentiate by following the chain rule (or you can multiply eqn. (2) out, or factor it and use the product rule). The following shows the derivative with respect to $x_1$:

$$\frac{\partial\,\mathrm{s.r.s.}}{\partial x_1} = -2\sum_{i}\theta_{h,i}\left[\theta_i - \left(x_1\,\theta_{h,i} + x_2\,\theta_{s,i} + x_3\,\theta_{c,i}\right)\right] = 0 \qquad (4)$$

With equation (4) and its analogues for $x_2$ and $x_3$, we now have 3 equations in 3 unknowns, which can be solved. We re-write the derivatives as normal equations (given here for $x_1$ only):

$$x_1\sum_i\theta_{h,i}^2 + x_2\sum_i\theta_{h,i}\,\theta_{s,i} + x_3\sum_i\theta_{h,i}\,\theta_{c,i} = \sum_i\theta_{h,i}\,\theta_i$$

which set up the matrix $A$ of the normal equations with its elements

$$A_{11} = \sum_i\theta_{h,i}^2, \qquad A_{12} = \sum_i\theta_{h,i}\,\theta_{s,i},$$

etc., and vector $b$ with

$$b_1 = \sum_i\theta_{h,i}\,\theta_i,$$

etc.
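As a concrete sketch, the normal matrix and right-hand-side vector can be assembled directly from the basis spectra. The spectra here are made-up numbers, and the design matrix $M$ (one column per basis spectrum) is just a convenient way of writing the sums:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 61  # e.g. 190-250 nm sampled at 1 nm intervals

# Hypothetical basis spectra (random stand-ins for real reference data)
theta_h, theta_s, theta_c = rng.normal(size=(3, n))
theta_obs = 0.5*theta_h + 0.3*theta_s + 0.2*theta_c  # synthetic "measurement"

# Design matrix M: one column per basis spectrum
M = np.column_stack([theta_h, theta_s, theta_c])

# Normal matrix A and vector b, element by element:
# A[j,k] = sum_i basis_j(i)*basis_k(i),  b[j] = sum_i basis_j(i)*theta_obs(i)
A = M.T @ M
b = M.T @ theta_obs

print(A[0, 0], np.sum(theta_h**2))  # A_11 is the sum of squared helix values
```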

We can solve the resulting general simultaneous equation problem

$$A\,\mathbf{x} = \mathbf{b} \qquad (8)$$

easily by a standard method such as Gauss-Jordan elimination [1]. The computer routine I selected returns the solution vector in b (i.e., the returned values of b are the minimized values for $x_1$, $x_2$, and $x_3$) and the inverse of A, A', in place of A. A' is also known as the correlation matrix C. The diagonal elements C(i,i) are the variances (squared standard deviations) of the fitting variables x, and the off-diagonal elements the correlation coefficients between them. However, when no weights are applied (in our example we do not know the standard deviation of the individual measurements), these values are not really meaningful. Sometimes eqn. (8) is written as

$$M^{T}M\,\mathbf{x} = M^{T}\boldsymbol{\theta}$$

where $M$ is the design matrix whose columns hold the three basis spectra, so that $A = M^{T}M$ and $\mathbf{b} = M^{T}\boldsymbol{\theta}$, which of course is equivalent (check!).
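The whole solution step can be sketched with standard linear-algebra routines; here `numpy.linalg` stands in for the hand-written Gauss-Jordan routine of [1], and the basis spectra are again invented:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 61

# Made-up basis spectra as columns, and a noise-free "measurement"
M = rng.normal(size=(n, 3))        # columns: helix, sheet, coil
x_true = np.array([0.5, 0.3, 0.2])
theta_obs = M @ x_true

A = M.T @ M                        # normal matrix of eqn. (8)
b = M.T @ theta_obs                # right-hand side

x = np.linalg.solve(A, b)          # solution vector: fitted fractions x1, x2, x3
C = np.linalg.inv(A)               # inverse of A (the matrix called A' above)
print(x)
```

For a noise-free synthetic measurement the recovered fractions match the ones used to build it; with real, noisy data they would only approximate them.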

In some instances one might want to impose restraints and constraints on the problem. For example, nothing prevents our method from returning a negative value for one of the $x$ values. This is mathematically perfectly ok, but it does not make physical sense: there cannot be a negative amount of, for example, helix in the sample. There are several texts that deal with these extensions to the LSQ method; a classic one is available from the ACA [2]. A good description of the general LSQ problem, including the application of weights, can be found in [1]. Of course, there is also the option to solve eqn. (8) by linear programming [1], which naturally imposes the constraint of positivity on the solutions x (set up one equation as the target function and the other 2 as constraints).
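One simple numerical way to impose the positivity constraint is projected gradient descent: step downhill on the s.r.s., then clip any negative fraction to zero. This is a minimal sketch with invented data, not the linear-programming route of [1]:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 61
M = rng.normal(size=(n, 3))   # made-up basis spectra as columns
# Synthetic measurement whose true sheet fraction is exactly zero
theta_obs = M @ np.array([0.7, 0.0, 0.3]) + 0.02*rng.normal(size=n)

# Safe step size: inverse of the largest eigenvalue of the normal matrix
step = 1.0 / np.linalg.eigvalsh(M.T @ M).max()

x = np.zeros(3)
for _ in range(2000):
    grad = M.T @ (M @ x - theta_obs)   # gradient of the s.r.s. (up to a factor 2)
    x = np.maximum(0.0, x - step * grad)  # clip: no negative fractions allowed

print(x)  # all components are >= 0 by construction
```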

In the implementation section we shall deal with the problem of negative fitting coefficients differently by using an iterative numerical method. As we do not have standard deviations for the data points, and thus no meaningful variance for our fitting parameters, we evaluate the r.m.s. deviation and an R-value as quality indicators for our fit.
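The two quality indicators can be sketched as follows; the specific R-value formula (sum of absolute deviations over sum of absolute observations, the usual crystallographic convention) is an assumption here, since the text does not define it:

```python
import numpy as np

def rms_deviation(obs, calc):
    """Root-mean-square deviation between observed and calculated values."""
    return np.sqrt(np.mean((obs - calc) ** 2))

def r_value(obs, calc):
    """R-value, assumed crystallographic-style: sum|obs-calc| / sum|obs|."""
    return np.sum(np.abs(obs - calc)) / np.sum(np.abs(obs))

# Tiny made-up example: every point is off by 0.5
obs = np.array([10.0, -8.0, 6.0, -4.0])
calc = np.array([9.5, -8.5, 6.5, -3.5])
print(rms_deviation(obs, calc))  # 0.5
print(r_value(obs, calc))        # 2.0 / 28.0, about 0.071
```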

[1] W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery,
Numerical Recipes, 2nd edition, Cambridge University Press (1992)

[2] *Least Squares Tutorial*, ACA lecture note #1,
American Crystallographic Association (1974)


**This World Wide Web site conceived and maintained by
Bernhard Rupp**

Last revised December 27, 2009 01:40