Dr. Ernst Hairer accepted the invitation on 3 October 2008 (self-imposed deadline: 3 April 2009).
Linear multistep methods constitute an important class of numerical integrators for ordinary differential equations, and particular methods are well suited for solving non-stiff and stiff equations as well as Hamiltonian systems over long time intervals. In contrast to one-step methods, multistep methods increase efficiency by using the information from several previous solution approximations.
Contents |
Consider an initial value problem \[\tag{1} \dot y=f(t,y), \quad y(t_0)=y_0 , \]
where the solution \(y(t)\) can be scalar- or vector-valued. Let \(t_0,t_1,t_2,\ldots\) be a sequence of grid points, which for simplicity is supposed to be equidistant with step size \(h=t_{n+1}-t_n\ .\) A numerical method is an algorithm that yields approximations \(y_n\) to the solution \(y(t_n)\) at the grid points.
These methods are introduced by J.C. Adams (1883) for solving practical problems of capillary action. They are based on the following idea: suppose that \(k\) values \( y_{n-k+1},\,y_{n-k+2},\ldots, y_{n}\) approximating \(y(t_{n-k+1}),\, y(t_{n-k+2}),\ldots,y(t_{n}) \) are known. Compute the derivatives \[\tag{2} f_{n+j}=f(t_{n+j},y_{n+j}), \quad j=-k+1,\ldots ,0 \]
and replace in the integrated form of (1) \[\tag{3} y(t_{n+1})=y(t_{n})+\int _{t_{n}}^{t_{n+1}}f(t,y(t))\,dt \]
the integrand \(f(t,y(t))\) by the polynomial \(p(t)\) interpolating the values (2). Then evaluate the integral analytically and obtain the next approximation to the solution, \(y_{n+1}\ .\) After advancing the scheme by one step, this procedure can be repeated to obtain \(y_{n+2},\, y_{n+3}\ ,\) and so on. See Figure 1 for an illustration using the logistic growth equation \(\dot y=a y (1-y)\ .\)
Newton's formula for polynomial interpolation can be written as \(\tag{4} p(t) =f_n + \frac{1}{h}(t-t_n)\nabla f_n + \frac{1}{2 h^2}(t-t_n)(t-t_{n-1})\nabla^2f_n + \ldots \)
where \(\nabla\) is the backward difference operator \(\nabla f_n = f_n - f_{n-1}\ ,\) \(\nabla^2 f_n = \nabla f_n - \nabla f_{n-1} = f_n -2f_{n-1}+f_{n-2}\ ,\) etc. This leads to the following equations (for more details see Hairer, Nørsett & Wanner (1993), p. 357-358): \[\tag{5} \begin{array}{ll} k=1:\quad & y_{n+1}=y_n+hf_n\\ k=2: \quad & y_{n+1}=y_n+h\bigl( {3\over 2}f_n- {1\over 2}f_{n-1}\bigr)\\[1mm] k=3: \quad & y_{n+1}=y_n+h\bigl( {23\over 12}f_n-{16\over 12}f_{n-1}+{5\over 12}f_{n-2} \bigr)\\[1mm] k=4: \quad & y_{n+1}=y_n+h\bigl( {55\over 24}f_n-{59\over 24}f_{n-1}+{37\over 24}f_{n-2} -{9\over 24}f_{n-3}\bigr) . \end{array} \]
If \(f_{n+1}=f(t_{n+1},y_{n+1})\) were known, the interpolation polynomial could be stretched over the entire interval \(t_{n-k+1},\ldots , t_{n+1}\) and thereby an improved precision would be obtained. The corresponding formulas have the form \[\tag{6} \begin{array}{ll} k=0:\quad & y_{n+1}=y_n+hf_{n+1} \\[1mm] k=1: \quad & y_{n+1}=y_n+h\bigl( {1\over 2}f_{n+1}+ {1\over 2}f_n\bigr)\\[1mm] k=2: \quad & y_{n+1}=y_n+h\bigl( {5\over 12}f_{n+1}+{8\over 12}f_n-{1\over 12}f_{n-1} \bigr)\\[1mm] k=3: \quad & y_{n+1}=y_n+h\bigl( {9\over 24}f_{n+1}+{19\over 24}f_n -{5\over 24}f_{n-1} +{1\over 24}f_{n-2}\bigr) . \end{array} \]
and represent an implicit equation for computing \(y_{n+1}\ .\) The special cases \(k=0\) and \(k=1\) are the implicit Euler method and the trapezoidal rule, respectively. They are actually one-step methods.
An elegant way of solving the implicit equations in (6) is to predict a value \(\widehat y_{n+1}\) by the explicit method (5) with the same value of \(k\ ,\) and replace in the corrector (6) the missing value \(f_{n+1}\) by \(\widehat f_{n+1}=f(t_{n+1},\widehat y_{n+1})\ .\) This is the commonly used implementation of the implicit Adams methods for non-stiff problems. The complete algorithm created by this idea is illustrated in Figure 2.
For theoretical investigations it is convenient to use a general framework that comprises all explicit and implicit Adams methods as well as the BDF schemes considered below. Following the seminal work of Dahlquist (1956), a linear multistep method (linear \(k\)-step method) is written in the form \[\tag{7} \sum_{j=0}^k \alpha_j \,y_{n+j} =h \sum_{j=0}^k \beta _j \, f_{n+j} \]
(notice that the indices are shifted when compared to the formulas (5) and (6)). The generating polynomials for its coefficients, \[\tag{8} \rho (\zeta )=\sum_{j=0}^k \alpha_j \,\zeta^j , \qquad \sigma (\zeta ) = \sum_{j=0}^k \beta _j \,\zeta^j \]
play an important role in the theory for these methods. Adams methods are characterized by the fact that \(\rho (\zeta ) = (\zeta -1)\,\zeta^{k-1}\ .\) A linear multistep method is called consistent if it reproduces the exact solution for the differential equation \(\dot y =1\ ,\) when exact starting approximations are used. This is equivalent to \(\rho (1)=0\) and \(\rho ' (1)=\sigma (1)\ .\) If it is exact for the differential equations \(\dot y =m x^{m-1}\) with \(m=1,2,\ldots ,r\ ,\) the method is said to have order \(r\ .\) In terms of the generating polynomials, this means that \[ \rho (e^h ) - h \sigma (e^h ) = C_{r+1} h^{r+1} + O (h^{r+2}) \qquad \mbox{for}\quad h\to 0 . \] The accuracy of methods having the same order can be distinguished by their error constant \[ C = \frac{C_{r+1}}{\sigma (1)}. \]
The basic theory for linear multistep methods has been developed by Dahlquist (1959) and is presented in the classical book of Henrici (1962). The main difference to one-step methods is that consistency alone does not guarantee convergence. A linear multistep method is called stable (or "zero-stable"), if for the differential equation \(\dot y =0\) all solutions of the difference equation (7) remain bounded. This is equivalent to the property that all zeros of \(\rho (\zeta )\) satisfy \(|\zeta |\le 1\) with those lying on the unit circle being simple. The highest attainable order of linear \(k\)-step methods is \(2k\ .\) However, for \(k\ge 3\) these methods are of no practical use because of the following result.
First Dahlquist barrier. The order \(r\) of a stable linear \(k\)-step method satisfies
Convergence theorem. If a linear multistep method (7) is stable and of order \(r\ ,\) then it is convergent of order \(r\ .\)
Convergence of order \(r\) means that for sufficiently accurate starting approximations \(y_0,\ldots ,y_{k-1}\) the global error satisfies \(y_{n}-y(t_n) = O (h^r)\) on compact intervals \([0,nh]\ ,\) where \(nh \le T\ .\) The constant symbolized by the \(O (h^r)\) notation is bounded by \(\exp (TL)\ ,\) where \(L\) is a Lipschitz constant of the vector field \(f(t,y)\ .\)
Continuing the computation of the above logistic growth equation \(\dot y=a y (1-y)\ ,\) using \(ha= 5/7\ ,\) with the explicit Adams process of order 3 leads to a bad surprise. Instead of approaching the steady-state solution \(y\approx 1\ ,\) the numerical solution exhibits a violent instability phenomenon (see Figure 3).
Stiff differential equations are characterized by the fact that the use of explicit methods requires very small step sizes, and certain implicit methods (like the implicit Euler method) perform much better. Stiff differential equations arise frequently as singularly perturbed problems in chemical reaction systems and in electrical circuitry, and as space discretizations of parabolic partial differential equations.
The foregoing stability analysis for nonstiff problems, which is uniquely based on the polynomial \(\rho(\zeta)\ ,\) is not sufficient for stiff differential equations. There are many attempts for studying the stability of methods for stiff problems. The most popular approach, initiated by Dahlquist (1963), is motivated by linearization of \(\dot y = f(y)\) in the neighborhood of a stationary solution, translation and, in the case of systems, diagonalization. It is based on \[\tag{9} \dot y=\lambda y , \]
where \(\lambda\) represents an eigenvalue of the Jacobian of \(f(y)\) and can be a complex number. In the above case, we have \(\lambda =-a=-5/7\ .\) Method (7) applied to this equation gives \(\sum_{j=0}^k (\alpha_j-h\lambda\beta _j) y_{n+j} =0\ ,\) whose stability depends on the roots of
\[\tag{10} \rho(\zeta) -\mu\sigma(\zeta)=0 ,\qquad \mu=h\lambda . \]
The stability domain \(S\) is the set of complex numbers \(\mu\ ,\) for which all roots of (10) satisfy \(|\zeta |\le 1\) with those lying on the unit circle being simple. For the explicit Adams method with \(k=3\ ,\) equation (10) becomes \[\tag{11} { \zeta ^3-\zeta ^2-\mu\Bigl( \frac{23}{12}\zeta ^2-\frac{16}{12}\zeta+\frac{5}{12}\Bigr) =0}, \]
or \[\tag{12} \mu=\frac{\zeta ^3-\zeta ^2}{\frac{23}{12}\zeta ^2-\frac{16}{12}\zeta +\frac{5}{12}} . \]
An elegant way of avoiding the high degree equation in (11) is to insert the limiting values \(\zeta=e^{i\phi}\ ,\) for \(0\le\phi\le 2\pi\) real, into (12). This leads to the root locus curve for \(\mu\ ,\) parts of which must form the boundary of the stability domain \(S\ .\) These domains are drawn for four explicit Adams methods in Figure 4 and shrink rapidly with increasing \(k\ .\) In the example from the beginning of this section, the value of \(\mu\) was \(h\lambda=-5/7\ ,\) far beyond the stability bound \(-6/11\ .\)
Implicit Adams methods for \(k\ge 2\) have finite stability domains as well,ƒ but they are considerably larger than those for explicit Adams methods. However, if they are solved by a predictor-corrector approach, a good deal of the stability is lost, as seen in Figure 5 for the case \(k=3\ .\)
This class of methods was discovered, together with the phenomenon of stiffness, by Curtiss & Hirschfelder (1952) in calculations for chemistry, and their extreme importance for stiff problems has been recognized since the work of Gear (1971). In contrast to the Adams methods, where polynomial interpolations were used for the slopes \(f_i\ ,\) we here interpolate the \(y\)-values \(y_{n-k+1},\ldots ,y_n\) and require for the interpolating polynomial \(q(t)\) the collocation condition \(\dot q (t_{n+1})= f(t_{n+1},q(t_{n+1}))\ .\) The differentiation of Newton's interpolation formula (4) then leads to the methods \[\tag{13} \sum_{j=1} ^k {1 \over j}\nabla^j y_{n+1} = hf_{n+1} , \]
(\(\nabla\) denotes again the backward difference operator) or, when expanded, \[\tag{14} \begin{array}{ll} k =1:\quad & y_{n+1} - y_n = hf_{n+1} \\[1mm] k =2:\quad & {3 \over 2}y_{n+1} - 2y_n + {1 \over 2}y_{n-1} = hf_{n+1} \\[1mm] k =3:\quad & {11 \over 6}y_{n+1} - 3y_n + {3 \over 2}y_{n-1} - {1 \over 3}y_{n-2} = hf_{n+1} \\[1mm] k =4:\quad & {25 \over 12}y_{n+1} - 4y_n + 3y_{n-1} - {4 \over 3}y_{n-2} + {1 \over 4}y_{n-3} = hf_{n+1} \\[1mm] k =5:\quad & {137 \over 60}y_{n+1} - 5y_n + 5y_{n-1} - {10 \over 3}y_{n-2} + {5 \over 4}y_{n-3} - {1 \over 5}y_{n-4} = hf_{n+1} \\[1mm] k =6:\quad & {147 \over 60}y_{n+1} - 6y_n + {15 \over 2}y_{n-1} - {20 \over 3}y_{n-2} + {15 \over 4}y_{n-3} - {6 \over 5}y_{n-4} + {1 \over 6}y_{n-5} = hf_{n+1} . \end{array} \]
These are implicit methods and work, for \(k\le 6\ ,\) significantly better for stiff problems, as illustrated in the movie of Figure 6. For \(k\ge7\) they become unstable. Their stability domains, always the outside of the root locus curve, extend to \(-\infty \) and can be seen in Figure 7. For stiff problems, their implementation requires some Newton-like procedures for the implicit equations, because a predictor-corrector approach would destroy this good stability property. For more details we refer to the entry on BDF methods. These methods are furthermore a very important tool for the treatment of differential-algebraic equations.
The analytical solution of Dahlquist's test equation (9) is stable if and only if \(\hbox{Re}\lambda\le 0\ .\) A very desirable property of a numerical method is that, under this condition, the numerical solution remains stable as well, i.e., that \[\tag{15} S\supset {\Bbb C}^- . \]
This property is called A-stability (Dahlquist 1963). Inspecting the above stability domains, we observe that the implicit Adams methods for \(k=1,2\) (which are the implicit Euler and trapezoidal method) as well as the BDF methods for \(k=1,2\) (the first one is implicit Euler again) have this property. A close inspection of the stability domains for the higher order BDF methods shows that some parts close to the imaginary axis, which correspond to oscillatory problems, do not produce stable solutions. This observation is part of the famous second Dahlquist barrier (Dahlquist 1963):
Second Dahlquist barrier. An A-stable multistep method must be of order \(r\le 2\ .\) Among all multistep methods of order 2, the trapezoidal rule has the smallest error constant.
This severe order bound was the origin of numerous efforts for breaking it by generalizing multistep methods in various directions: Enright's second derivative methods, second derivative BDF methods, blended multistep methods (Skeel & Kong), super future points (Cash 1980), multistep collocation methods, multistep methods of Radau type, which all fall into the class of general linear methods. For more details, consult Section V.3 of Hairer & Wanner (1996). Also these methods, if A-stable, have their order barrier, which was for many years an open conjecture (Daniel-Moore 1968): An A-stable generalized multistep method, with \(s\) implicit stages per step, must be of order \(r\le 2s\ .\) The smallest error constant for such methods is that of the corresponding Gauss-Runge-Kutta method.
Order stars. An elegant idea for proving the second Dahlquist barrier and the Daniel-Moore conjecture, together with many other stability results, is the consideration of the corresponding order star \(A\ .\) Instead of the stability domain \(S\ ,\) where the complex roots \(\zeta (\mu)\) of (10) were compared, in absolute values, to 1, we now compare them to \(|e^{\mu}|\ ,\) the absolute value of the exact solution, which they tend to approximate, i.e., \(A=\{ \mu\, ;\, |\zeta (\mu)| >|e^{\mu}| \} \ .\) Since \(\zeta (\mu)\) is multi-valued, this set has to be considered on the associated Riemann surface. The order star gives much insight into the structure of the function \(\zeta(\mu)\ ,\) because it transforms analytic properties into geometric properties as follows: the star-shaped structure with \(r+1\) sectors at the origin indicates that the method is of order \(r\ ,\) and A-stability is characterized by the fact that poles are in the right half-plane and the intersection of \(A\) with the imaginary axis is empty. The order star for the BDF3 method is drawn in Figure 8, and explains that such a method of order 3 cannot be A-stable. Details can be found in Hairer & Wanner (1996). The use of order arrows (Butcher 2008) is another way for observing that BDF3 is not A-stable.
A\((\alpha )\)-stability. This is a concept, weaker than A-stability, but nevertheless very useful. It requires that the stability domain contains a sector with angle \(2 \alpha \) in the negative half plane; more precisely, a linear multistep method (7) is called A\((\alpha )\)-stable if \(S\supset \{ z\, ; \, |\arg (-z)|<\alpha, \, z\ne 0\}\ .\) BDF methods are A\((\alpha )\)-stable with angle given by \[ \begin{array}{c|cccccc} k~&1&2&3&4&5&6\\[3pt] \alpha~&~90^\circ~&~~90^\circ&86.03^\circ&73.35^\circ&51.84^\circ&17.84^\circ \end{array} \]
The interest of applying BDF methods (or other A-stable or A\((\alpha )\)-stable multistep methods) relies in the fact that they can be applied to stiff differential equations with step sizes that are much larger than the inverse of a Lipschitz constant of \(f(y)\ .\) In this situation, classical convergence theory (for non-stiff problems) does not provide any information. It is nevertheless possible to prove estimates \(y_n-y(t_n) =O(h^r)\) of the global error (for sufficiently accurate starting approximations and for A\((\alpha )\)-stable methods of order \(r\)) in the following situations:
Precise statements and proofs can be found, for example, in Sections V.7, V.8, and VI.2 of the monograph Hairer & Wanner (1996).
For long-time integration of conservative differential equations special integrators are usually more appropriate. The study of such integrators is the main topic of geometric numerical integration (Hairer, Lubich & Wanner 2006), and there are interesting geometric integrators among multistep methods.
An important class of conservative systems are second order differential equations \[\tag{16} \ddot q = f(q), \qquad f(q)=-M^{-1}\nabla U(q) \]
with a constant, symmetric, positive definite mass matrix \(M\ ,\) and a smooth potential \(U(q)\ .\) This equation is Hamiltonian with energy \(\textstyle H(p,q)={1\over 2}\, p^{\mathsf T} M^{-1} p +U(q)\ ,\) where \(p=M\dot q\) is the momentum. The exact flow is known to be symplectic, and the energy is preserved along solutions. Typical examples are \(N\)-body systems, e.g., planetary motion in astronomy or simulations in molecular dynamics.
It is possible to write (16) as a first order system and to apply any multistep method considered above. Eliminating the velocity leads to a formula \[\tag{17} \sum_{j=0}^k \alpha_{j}\,q_{n+j} = h^2 \sum_{j=0}^k \beta_{j}\, f(q_{n+j}) , \]
with coefficients \(\alpha_{j},\beta_{j}\) that are different from those above. We again introduce generating polynomials \[ \rho (\zeta ) = \sum_{j=0}^k \alpha_{j}\, \zeta^j , \qquad \sigma (\zeta ) = \sum_{j=0}^k \beta_{j}\, \zeta^j . \] A linear multistep method (17) has order \(r\) for differential equations (16) if \[ \rho (e^h ) - h^2 \sigma (e^h ) = O (h^{r+2}) \qquad \hbox{for}\quad h\to 0 , \] and it is stable if all zeros of \(\rho (\zeta )\) satisfy \(|\zeta |\le 1\) with those lying on the unit circle being at most double. Order \(r\) and stability guarantee convergence, and the global error satisfies \(q_{n}-q(t_{0}+nh) = O (h^r)\) on compact intervals \(nh \le T\ .\)
For the integration of Hamiltonian systems, we are mainly interested in long-time integration. For example, the solution of the harmonic oscillator \(\ddot q = -\omega^2 q\) satisfies \(\textstyle\frac 12 (\dot q(t) ^2 + \omega^2 q(t)^2)= const\) and remains bounded for all times. Can we have a similar behavior for the numerical solution, when \(h\) is fixed and \(n\to\infty\ ?\) When applied to the harmonic oscillator, method (17) becomes a linear difference relation with characteristic equation \(\rho (\zeta ) - (h\omega)^2 \sigma (\zeta ) =0\ .\) It turns out that near energy conservation for the harmonic oscillator can be achieved only if all roots have modulus one. This in turn implies that the method has to be symmetric, i.e., \[\tag{18} \alpha_{j} = \alpha_{k-j}, \qquad \beta_{j}=\beta_{k-j}. \]
Lambert & Watson (1976) initiated the study of the long-time behavior of symmetric multistep methods. Quinlan & Tremaine (1990) constructed high order methods (17) and applied them successfully to long-time integrations of the outer solar system. One of their methods of order \(8\) with \(k=8\) is given by \[\tag{19} \begin{array}{rcl} \rho (\zeta ) &\!\! = \!\! & (\zeta -1)^2 (\zeta^6 + 2\zeta^5 +3\zeta^4 + 3.5 \zeta^3 + 3\zeta^2 +2\zeta +1) ,\\[2mm] 120960\,\sigma (\zeta ) &\!\! = \!\! & 192481 (\zeta^7 + \zeta ) + 6582 (\zeta^6 +\zeta^2) +816783 (\zeta^5 + \zeta^3) - 156812 \zeta^4 . \end{array} \]
Notice that the method is explicit (i.e., \(\beta_{k}=0\)) and therefore its implementation with constant steps is straight-forward. Due to possible numerical resonances, symmetric multistep methods are recommended only for computations with high accuracy.
For the study of qualitative and quantitative features of the numerical solution on intervals that are far beyond the applicability of classical convergence theory, backward error analysis is the appropriate tool. It turns out that a symmetric method has to be \(s\)-stable (i.e., apart from the double root at \(1\ ,\) all zeros of \(\rho (\zeta )\) are simple and of modulus one). It has been shown in Hairer & Lubich (2004) that symmetric, \(s\)-stable multistep methods (17) of order \(r\) have a long-time behavior that it very similar to that of symplectic one-step methods. In particular, their numerical solutions satisfy:
The efficient implementation of linear multistep methods is a nontrivial task. A code has to be able to automatically select the step size (which can be done either by formulas with grid-dependent coefficients or by using a Nordsieck vector formulation) and to choose an optimal order. Among the available software (most of it can be downloaded from http://www.netlib.org/ode/) let us mention the following:
Internal references