Curve fitting is the process of finding a curve that matches a series of data points and possibly other constraints. This section is an introduction to both interpolation (where an exact fit to the constraints is expected) and regression analysis, which allows for an approximate fit by minimizing the difference between the data points and the curve. Both are sometimes used for extrapolation.
Let's start with a first degree polynomial equation:

$y = ax + b$
This is a line with slope a. A line can be drawn through any two points, so a first degree polynomial equation is an exact fit through any two points with distinct x coordinates.
If we increase the order of the equation to a second degree polynomial, we get:

$y = ax^2 + bx + c$
This will exactly fit three points.
If we increase the order of the equation to a third degree polynomial, we get:

$y = ax^3 + bx^2 + cx + d$
This will exactly fit four points.
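The exact fits described above can be sketched numerically. The following is a minimal illustration, assuming NumPy is available and that the points have distinct x coordinates; the helper name `exact_polynomial` and the sample points are hypothetical and chosen only for this example.

```python
import numpy as np

def exact_polynomial(points):
    """Return coefficients (highest power first) of the degree-n polynomial
    through n + 1 points with distinct x values."""
    xs = np.array([p[0] for p in points], dtype=float)
    ys = np.array([p[1] for p in points], dtype=float)
    # Each row of the Vandermonde matrix is [x**n, ..., x, 1].
    vandermonde = np.vander(xs, len(points))
    # One equation per point, one unknown per coefficient.
    return np.linalg.solve(vandermonde, ys)

# Two points determine a line, three a parabola, four a cubic.
line = exact_polynomial([(0, 1), (2, 5)])
parabola = exact_polynomial([(0, 1), (1, 3), (2, 9)])
cubic = exact_polynomial([(-1, -1), (0, 0), (1, 1), (2, 8)])
```

Solving the Vandermonde system directly keeps the link between "number of points" and "number of coefficients" visible, although for many points or high degrees this system tends to become ill-conditioned.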
A more general statement would be to say it will exactly fit four constraints. Each constraint can be a point, angle, or curvature (which is the reciprocal of the radius of an osculating circle). Angle and curvature constraints are most often added to the ends of a curve, and in such cases are called end conditions. Identical end conditions are frequently used to ensure a smooth transition between polynomial curves contained within a single spline. Higher-order constraints, such as "the change in the rate of curvature", could also be added. This, for example, would be useful in highway cloverleaf design to understand the forces applied to a car, as it follows the cloverleaf, and to set reasonable speed limits, accordingly.
Bearing this in mind, the first degree polynomial equation could also be an exact fit for a single point and an angle while the third degree polynomial equation could also be an exact fit for two points, an angle constraint, and a curvature constraint. Many other combinations of constraints are possible for these and for higher order polynomial equations.
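As a rough sketch of mixing constraint types, the example below fits a third degree polynomial to two points, an angle constraint, and a curvature constraint. It assumes NumPy, assumes that the angle and curvature are both imposed at the first point, and uses the standard relation between curvature and derivatives, $\kappa = y'' / (1 + y'^2)^{3/2}$, to turn the curvature into a condition on the second derivative. The function name `cubic_from_constraints` and its parameters are hypothetical.

```python
import numpy as np

def cubic_from_constraints(p0, p1, angle_deg, curvature):
    """Cubic y = a x^3 + b x^2 + c x + d through p0 and p1, with the given
    tangent angle and signed curvature at p0 (assumed placement)."""
    x0, y0 = p0
    x1, y1 = p1
    slope = np.tan(np.radians(angle_deg))
    # Curvature k = y'' / (1 + y'^2)^(3/2), so y'' = k * (1 + y'^2)^(3/2).
    second = curvature * (1.0 + slope**2) ** 1.5
    A = np.array([
        [x0**3, x0**2, x0, 1.0],        # y(x0)   = y0
        [x1**3, x1**2, x1, 1.0],        # y(x1)   = y1
        [3 * x0**2, 2 * x0, 1.0, 0.0],  # y'(x0)  = slope
        [6 * x0, 2.0, 0.0, 0.0],        # y''(x0) = second
    ])
    rhs = np.array([y0, y1, slope, second])
    return np.linalg.solve(A, rhs)

# A 45 degree angle and zero curvature at the origin, ending at (1, 1),
# are satisfied by the straight line y = x (cubic and quadratic terms vanish).
coeffs = cubic_from_constraints((0.0, 0.0), (1.0, 1.0), angle_deg=45.0, curvature=0.0)
```

Four constraints still give four linear equations in the four coefficients; only the rows of the system change with the kind of constraint.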
If we have more than n + 1 constraints (n being the degree of the polynomial), we can still fit the polynomial curve to those constraints. An exact fit to all the constraints is not certain (but might happen, for example, in the case of a first degree polynomial exactly fitting three collinear points). In general, however, some method is then needed to evaluate each approximation. The least squares method, which minimizes the sum of the squared deviations between the curve and the constraints, is one way to compare the deviations.
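The over-constrained case can be illustrated with a line fitted to three points. This is a minimal sketch assuming NumPy; the helper name `fit_line_least_squares` and the sample points are hypothetical.

```python
import numpy as np

def fit_line_least_squares(xs, ys):
    """Fit y = a x + b by least squares; return (a, b, sum of squared residuals)."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    # One row [x, 1] per data point: more rows than unknowns.
    A = np.column_stack([xs, np.ones_like(xs)])
    (a, b), *_ = np.linalg.lstsq(A, ys, rcond=None)
    residuals = ys - (a * xs + b)
    return a, b, float(np.sum(residuals**2))

# Three collinear points: the line fits exactly, so the squared error is zero
# up to rounding.
a1, b1, sse1 = fit_line_least_squares([0, 1, 2], [1, 3, 5])

# Three non-collinear points: no line passes through all of them, so the
# squared error is strictly positive.
a2, b2, sse2 = fit_line_least_squares([0, 1, 2], [0, 1, 0])
```

The sum of squared residuals gives a single number by which competing approximations can be compared.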
Now, you might wonder why we would ever want to get an approximate fit when we could just increase the degree of the polynomial equation and get an exact match. There are several reasons:
Now that we have talked about using a degree too low for an exact fit, let's also discuss what happens if the degree of the polynomial curve is higher than needed for an exact fit. This is bad for all the reasons listed previously for high order polynomials, but also leads to a case where there are an infinite number of solutions. For example, a first degree polynomial (a line) constrained by only a single point, instead of the usual two, would give us an infinite number of solutions. This brings up the problem of how to compare and choose just one solution, which can be a problem for software and for humans, as well. For this reason, it is usually best to choose as low a degree as possible for an exact match on all constraints, and perhaps an even lower degree, if an approximate fit is acceptable.
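The under-constrained case can be sketched in the same way. The example below, assuming NumPy and a hypothetical single constraint point (1, 2), shows that the system for a line has fewer independent equations than unknowns, and that a solver has to apply some extra rule to pick one answer; here `np.linalg.lstsq` returns the solution with the smallest coefficient norm, which is only one of many possible choices.

```python
import numpy as np

# A line y = a x + b constrained only by the point (1, 2): a * 1 + b = 2.
A = np.array([[1.0, 1.0]])
rhs = np.array([2.0])

# One independent equation for two unknowns, so the solution is not unique.
rank = np.linalg.matrix_rank(A)

# lstsq resolves the ambiguity by returning the minimum-norm solution.
min_norm_solution, *_ = np.linalg.lstsq(A, rhs, rcond=None)

# Any slope a with intercept b = 2 - a passes through the point equally well.
candidates = [(a, 2.0 - a) for a in (-1.0, 0.0, 3.0)]
```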
For more details, see the polynomial interpolation article.
Other types of curves, such as conic sections (circular, elliptical, parabolic, and hyperbolic arcs) or trigonometric functions (such as sine and cosine), may also be used in certain cases. For example, trajectories of objects under the influence of gravity follow a parabolic path when air resistance is ignored. Hence, matching trajectory data points to a parabolic curve would make sense. Tides follow sinusoidal patterns, hence tidal data points should be matched to a sine wave, or to the sum of two sine waves of different periods if the effects of the Moon and Sun are both considered.
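A sum of sine waves with known periods can be fitted by linear least squares, because each wave $R \sin(\omega t + \varphi)$ can be rewritten as $A \sin(\omega t) + B \cos(\omega t)$. The sketch below assumes NumPy and assumes the periods are known in advance; the periods of 12.42 and 12.00 hours are used as approximate lunar and solar semidiurnal values, and the data are synthetic, generated inside the example rather than measured. The function name `fit_sinusoids` is hypothetical.

```python
import numpy as np

def fit_sinusoids(t, y, periods):
    """Least-squares fit of y = c0 + sum_i (A_i sin(w_i t) + B_i cos(w_i t))
    for known periods. Returns [c0, A_1, B_1, A_2, B_2, ...]."""
    t = np.asarray(t, dtype=float)
    columns = [np.ones_like(t)]
    for period in periods:
        omega = 2.0 * np.pi / period
        columns += [np.sin(omega * t), np.cos(omega * t)]
    A = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coeffs

# Synthetic "tide" over 30 days, sampled every half hour (times in hours).
periods = (12.42, 12.00)  # assumed lunar and solar periods, in hours
t = np.linspace(0.0, 720.0, 1441)
true_coeffs = np.array([0.5, 1.2, -0.3, 0.4, 0.1])  # made-up values
w1, w2 = (2.0 * np.pi / p for p in periods)
y = (true_coeffs[0]
     + true_coeffs[1] * np.sin(w1 * t) + true_coeffs[2] * np.cos(w1 * t)
     + true_coeffs[3] * np.sin(w2 * t) + true_coeffs[4] * np.cos(w2 * t))

fitted = fit_sinusoids(t, y, periods)
```

With noise-free synthetic data the fitted coefficients should reproduce the ones used to generate the signal; with real measurements they would only approximate the underlying components.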
Note that while this discussion was in terms of 2D curves, much of this logic also extends to 3D surfaces, each patch of which is defined by a net of curves in two parametric directions, typically called u and v. A surface may be composed of one or more surface patches in each direction.
For more details, see the computer representation of surfaces article.