A boundary value problem is a system of ordinary differential equations with solution and derivative values specified at more than one point. Most commonly, the solution and derivatives are specified at just two points (the boundaries), defining a two-point boundary value problem.
A two-point boundary value problem (BVP) of total order \(n\) on a finite interval \([a,b]\) may be written as an explicit first order system of ordinary differential equations (ODEs) with boundary values evaluated at two points as \[\tag{1} y'(x)=f(x, y(x)), \,\, x\in(a,b), \quad g(y(a),y(b))=0\]
Here, \(y,f,g \in R^n\) and the system is called explicit because the derivative \(y^\prime\) appears explicitly. The \(n\) boundary conditions defined by \(g\) must be independent; that is, they cannot be expressed in terms of each other (if \(g\) is linear the boundary conditions must be linearly independent).
In practice, most BVPs do not arise directly in the form (1) but instead as a combination of equations defining various orders of derivatives of the variables, the orders summing to \(n\ .\) In an explicit BVP system, the boundary conditions and the right hand sides of the ODEs can involve the derivatives of each solution variable up to an order one less than the highest derivative of that variable appearing on the left hand side of the ODE defining the variable. To write a general system of ODEs of different orders in the form (1), we can define \(y\) as a vector made up of all the solution variables and their derivatives up to one less than the highest derivative of each variable, then add trivial ODEs to define these derivatives. See the section on initial value problems for an example of how this is achieved. See also Ascher et al. (1995), who show techniques for rewriting boundary value problems of various orders as first order systems. Such rewritten systems may not be unique and do not necessarily provide the most efficient approach for computational solution.
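For example, a scalar second order equation \(u^{\prime\prime}=F(x,u,u^\prime)\) with boundary conditions \(u(a)=\alpha\) and \(u(b)=\beta\) is put in the form (1) by setting \(y_1=u\) and \(y_2=u^\prime\ :\) \[ y_1^\prime=y_2, \quad y_2^\prime=F(x,y_1,y_2), \quad g(y(a),y(b))=\left(\begin{array}{c} y_1(a)-\alpha\\ y_1(b)-\beta \end{array}\right)=0, \] a first order system of total order \(n=2\ .\)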
The words two-point refer to the fact that the boundary condition function \(g\) is evaluated at the solution at the two interval endpoints \(a\) and \(b\) unlike for initial value problems (IVPs) where the \(n\) initial conditions are all evaluated at a single point. Occasionally, problems arise where the function \(g\) is also evaluated at the solution at other points in \((a,b)\ .\) In these cases, we have a multipoint BVP. As shown in Ascher et al. (1995), a multipoint problem may be converted to a two-point problem by defining separate sets of variables for each subinterval between the points and adding boundary conditions which ensure continuity of the variables across the whole interval. Like rewriting the original BVP in the compact form (1), rewriting a multipoint problem as a two-point problem may not lead to a problem with the most efficient computational solution.
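To sketch the standard construction for a condition involving one interior point \(c\in(a,b)\ ,\) define \(u_1(t)=y(a+t(c-a))\) and \(u_2(t)=y(c+t(b-c))\) for \(t\in[0,1]\ .\) These satisfy \[ u_1^\prime(t)=(c-a)\,f\big(a+t(c-a),u_1(t)\big), \quad u_2^\prime(t)=(b-c)\,f\big(c+t(b-c),u_2(t)\big), \] and the continuity condition \(u_1(1)=u_2(0)\) is appended to the original boundary conditions, which now involve only the values \(u_1(0)=y(a)\ ,\) \(u_1(1)=y(c)\) and \(u_2(1)=y(b)\) at the two endpoints \(t=0\) and \(t=1\ .\)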
Most practically arising two-point BVPs have separated boundary conditions, where the function \(g\) splits into two parts (one for each endpoint): \[g_a(y(a))=0,\quad g_b(y(b))=0.\] Here, \(g_a\in R^s\) and \(g_b\in R^{n-s}\) for some value \(s\) with \(1\le s\le n-1\ ,\) and the components of each of the vector functions \(g_a\) and \(g_b\) are independent. However, there are well-known, commonly arising boundary conditions which are not separated; for example, periodic boundary conditions which, for a problem written in the form of equation (1), are \[y(a)-y(b)=0.\]
Questions of existence and uniqueness for BVPs are much more difficult than for IVPs. Indeed, there is no general theory. However, there is a vast literature on individual cases; see Bernfeld and Lakshmikantham (1974) for a survey of a variety of techniques that may be used. Consider the IVP \[\tag{2} y'(x)=f(x, y(x)), \,\, y(a)=s \]
corresponding to the ODE in (1). If this IVP has a solution for all choices of initial vectors \(s\) then the existence of a solution to (1) hinges on the solvability of the nonlinear system of equations \[\tag{3} g(s, y(b;s))=0\]
where \(y(b;s)\) is the solution of the IVP (2) evaluated at \(x=b\) for the initial value \(y(a)=s\ .\) If (3) has a solution \(s\ ,\) then the BVP (1) has a solution; it is the unique solution (among solutions obtained this way) precisely when the nonlinear system (3) has just one solution \(s\ .\)
For linear BVPs, where the ODEs and boundary conditions are both linear, equation (3) is a linear system of algebraic equations in \(s\ .\) Hence, in general there will be no solution, exactly one solution, or infinitely many solutions, just as for systems of linear algebraic equations.
In addition to the possibilities for linear problems, nonlinear problems can also have a finite number of solutions. Consider the following simple model of the motion of a projectile with air resistance: \[\tag{4} \begin{array}{rcl} y^\prime&=&\tan(\phi),\\ v^\prime&=&-\frac{g}{v}\tan(\phi) - \nu v\sec(\phi),\\ \phi^\prime&=&-\frac{g}{v^2}. \end{array} \]
These equations may be viewed as describing the planar motion of a projectile fired from a cannon. Here, \(y\) is the height of the projectile above the level of the cannon, \(v\) is the velocity of the projectile, and \(\phi\) is the angle of the trajectory of the projectile with the horizontal. The independent variable \(x\) measures the horizontal distance from the cannon. The constant \(\nu\) represents air resistance (friction) and \(g\) is the appropriately scaled gravitational constant. This model neglects three-dimensional effects such as cross winds and the rotation of the projectile. The initial height is \(y(0)=0\) and the muzzle velocity \(v(0)\) for the cannon is fixed. The standard projectile problem is to choose the initial angle of the cannon and hence of the projectile, \(\phi(0)\ ,\) so that the projectile will hit a target at the same height as the cannon at a distance \(x=x_{end}\ ;\) that is, we require \(y(x_{end})=0\ .\) Altogether the boundary conditions are \[ y(0)=y(x_{end})=0, \quad v(0)\,\, {\rm given.} \] Does this BVP have a solution? Physical intuition suggests that it certainly does not for \(x_{end}\) beyond the range of the cannon for the fixed muzzle velocity \(v(0)\ .\) On the other hand, if \(x_{end}\) is small enough, we do expect a solution, but is there only one? To see that there is not, consider the case when the target is very close to the cannon. We can hit the target by shooting with an almost flat trajectory or by shooting high and dropping the projectile mortar-like on the target. That is, there are (at least) two solutions that correspond to initial angles \(\phi(0)= \phi_{low} > 0\) and \(\phi(0)= \phi_{high} < \pi/2\ .\) It turns out that there are exactly two solutions.
Now, let \(x_{end}\) increase. There are still two solutions, but the larger the value of \(x_{end}\ ,\) the smaller the angle \(\phi_{high}\) and the larger the angle \(\phi_{low}\ .\) If we keep increasing \(x_{end}\ ,\) eventually we reach the maximum range with the given muzzle velocity. At this distance there is just one solution, that is, \(\phi_{low} = \phi_{high}\ .\) In summary, there is a critical value of \(x_{end}\) for which there is exactly one solution. If \(x_{end}\) is smaller than this critical value, there are exactly two solutions; if it is larger, there is no solution.
The approach to proving existence exemplified by the projectile model suggests a computational method of solution, known as the shooting method. The idea is to compute the unknown initial value \(\phi(0)\) to satisfy the nonlinear equation \(y(x_{end};\phi(0))=0\ .\) This approach requires the (computational) solution of an IVP for the ODEs for each value of the angle \(\phi(0)\) attempted, and the nonlinear equation may be solved by any suitable method. Since there are quality codes for both tasks, this suggests an approach that can be useful in practice. Physical intuition suggests exploiting the relationship between the angle chosen and the range achieved in a bisection-like algorithm but, in more complex cases, such simple physical relationships are usually not available and a general purpose method such as a Newton iteration is often used. The shooting method can be very successful on simple problems such as the projectile problem. It extends easily to almost any boundary value problem via equation (3) and has been automated in many pieces of mathematical software. However, its success depends on a number of factors, the most important of which is the stability of the initial value problem that must be solved at each iteration. (An ODE problem is stable if a small change to the ODE and/or the initial or boundary conditions leads to a small change in the solution.) Unfortunately, for many stable boundary value problems the corresponding initial value problems (beginning from either endpoint and integrating towards the other endpoint) are insufficiently stable for shooting to succeed. So, shooting methods are not computationally suitable for the whole range of practical boundary value problems, particularly those on very long or infinite intervals. A second difficulty, sometimes interconnected with the aforementioned stability problem, is that methods such as Newton iteration for solving equation (3) may require a far more accurate initial estimate of \(s\) than is readily available.
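To make this concrete, here is a minimal shooting sketch for the projectile problem (4) in Python with SciPy. The values of \(g\ ,\) \(\nu\ ,\) \(v(0)\) and \(x_{end}\) are illustrative assumptions, not taken from the text, and the angle grid used to bracket roots may need adjusting for other parameter choices; brentq plays the role of the general nonlinear equation solver.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Illustrative parameter values (assumptions, not from the article).
g, nu = 0.032, 0.02        # scaled gravity and air resistance
v0, x_end = 0.5, 5.0       # muzzle velocity and target distance

def rhs(x, u):
    # u = [y, v, phi]: the projectile ODEs (4).
    y, v, phi = u
    return [np.tan(phi),
            -(g / v) * np.tan(phi) - nu * v / np.cos(phi),
            -g / v**2]

def miss(phi0):
    # Integrate the IVP for launch angle phi0 and return y(x_end),
    # the residual of the boundary condition y(x_end) = 0.
    sol = solve_ivp(rhs, [0.0, x_end], [0.0, v0, phi0],
                    rtol=1e-8, atol=1e-10)
    return sol.y[0, -1]

# Scan launch angles and refine each sign change with a root solver;
# for a target within range this yields the two trajectories.
angles = np.linspace(0.05, 1.5, 60)
values = [miss(p) for p in angles]
roots = [brentq(miss, lo, hi)
         for lo, hi, flo, fhi in zip(angles, angles[1:], values, values[1:])
         if flo * fhi < 0]
print(roots)   # expect two angles: phi_low and phi_high
```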
Many ODE BVPs arise from the analysis of partial differential equations through the computation of similarity solutions or via perturbation methods. These problems are often defined on semi-infinite ranges. For example, the Blasius problem \[\tag{5} f'''=-\frac{f\,f^{\prime\prime}}{2}, \,\,\,f(0)=f'(0)=0, \, f'(\infty)=1 \]
arises from a similarity solution of partial differential equations describing fluid flow over a flat plate. Of course, the boundary condition at infinity is asymptotic. It should be read as \(f'(x)\rightarrow 1\) as \(x\rightarrow\infty\ ,\) and it implies that \(f(x)\sim x+C\) as \(x\rightarrow\infty\) where the constant \(C\) is a priori unknown.
This problem is easy to solve computationally: shooting from the origin and using a standard nonlinear equation solver works without difficulty. Of course, we cannot integrate the equations to infinity, but we can replace the boundary condition at infinity by a corresponding one at a finite point \(L\ ,\) and that point \(L\) need not be chosen very large because the solution approaches its asymptotic behavior \(f(x)\sim x+C\) exponentially fast as \(x\rightarrow\infty\ .\) So, for example, using the boundary condition \(f'(L)=1\) with \(L=10\) provides a quite accurate solution. There are no rapidly growing solutions of the equation near the desired solution, so there is no unstable growth of computed solutions even on quite long ranges of integration, as long as the guess for the unknown initial value \(f''(0)\) is not chosen too far away from the correct value.
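As a sketch of this recipe (again Python with SciPy; the truncation point \(L=10\) follows the text, while the root bracket is an assumption that happens to contain the classical value \(f''(0)\approx 0.332\)):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

L = 10.0   # finite point replacing infinity in the boundary condition

def blasius(x, u):
    # u = [f, f', f'']; the Blasius equation (5): f''' = -f f''/2.
    return [u[1], u[2], -0.5 * u[0] * u[2]]

def residual(s):
    # Shoot from the origin with guessed curvature f''(0) = s and
    # return the defect in the truncated condition f'(L) = 1.
    sol = solve_ivp(blasius, [0.0, L], [0.0, 0.0, s],
                    rtol=1e-10, atol=1e-12)
    return sol.y[1, -1] - 1.0

s_star = brentq(residual, 0.1, 1.0)
print(s_star)   # about 0.332, the classical Blasius value
```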
In the Blasius problem, the location and type of boundary conditions are determined physically and give us a stable (well-conditioned) problem. In general, matters are more complicated though physical principles remain an essential guide. For simplicity of exposition (and understanding) consider the linear problem \[\tag{6} y^{\prime\prime\prime} + 2y^{\prime\prime}-y^\prime -2y=0. \]
Its general solution is \[y(x)=Ae^{x}+Be^{-x}+Ce^{-2x}.\] Note that there are three components of the solution, two that decay as \(x\) increases from the origin towards infinity and one that grows. Suppose that we solve this equation on the interval \([0,\infty)\) with boundary conditions \[y(0)=1, \; y^\prime(0) =1, \; y(\infty)= 0. \] The last boundary condition implies that \(A = 0\ .\) Then, the other boundary conditions imply that \(B=3\) and \(C=-2\ .\) So, there is a unique solution of this BVP. On the other hand, if the boundary conditions are \[\tag{7} y(0)=1, \; y(\infty) = 0, \; y^\prime(\infty)= 0,\]
the boundary condition \(y(\infty) = 0\) again implies that \(A=0\ ,\) but now the third condition places no constraint on the coefficients, and the remaining condition tells us only that \(C = 1 - B\ ,\) so any value of \(B\) results in a solution; that is, this BVP has infinitely many solutions. This problem provides an example of the requirements of exponential dichotomy; Ascher et al. (1995) and Mattheij and Molenaar (2002) discuss these requirements in detail. For a problem to be well-posed the boundary conditions must be set appropriately. For the simple equation (6), if the boundary conditions are separated, essentially we must have two boundary conditions at the origin and one at infinity matching the two decaying and one increasing (towards infinity) basis functions in the solution.
If a BVP with boundary conditions at infinity is not well–posed, it is natural to expect numerical difficulties when those boundary conditions are imposed at a large but finite point \(L\) even though, in this case, a solution may always be defined. Suppose then that we solve the equation (6) with boundary conditions \[ y(0)=1, \; y(L) = 0, \; y^\prime(L)= 0\] replacing (7). For large values of \(L\ ,\) the system of linear equations for the coefficients \(A\ ,\) \(B\ ,\) and \(C\) in the general solution is extremely ill–conditioned reflecting the poor stability (conditioning) of equation (6) with boundary conditions (7); see Shampine et al. (2003) for more details.
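The ill-conditioning is easy to observe numerically. A quick check in Python with NumPy: assemble the \(3\times3\) system for \(A\ ,\) \(B\) and \(C\) from the general solution and the truncated boundary conditions, and watch the condition number grow with \(L\ :\)

```python
import numpy as np

# Rows impose y(0) = 1, y(L) = 0, y'(L) = 0 on
# y(x) = A e^x + B e^{-x} + C e^{-2x}.
for L in [5.0, 10.0, 20.0, 30.0]:
    M = np.array([[1.0,        1.0,         1.0],
                  [np.exp(L),  np.exp(-L),  np.exp(-2 * L)],
                  [np.exp(L), -np.exp(-L), -2 * np.exp(-2 * L)]])
    print(L, np.linalg.cond(M))   # grows roughly like e^{2L}
```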
We described shooting methods above and we explained there that there are inherent problems in this approach. These problems may be overcome, at least partially, using variants on the shooting method which broadly come under the heading of multiple shooting; see Ascher and Petzold (1998).
Most general purpose software packages for BVPs are based on global methods which fall into two related categories. The first is finite differences where a mesh is defined on the interval \([a,b]\) and the derivative in (1) is replaced by a difference approximation at each mesh point; see Ascher et al. (1995) and Keller (1992). The resulting difference equations plus the boundary conditions give a set of algebraic equations for the solution on the mesh. These equations are generally nonlinear but are linear when the differential equations and boundary conditions are both linear. To achieve a user-specified error the software generally adjusts the mesh placement using local error estimates based on higher order differencing involving techniques such as deferred correction; see Ascher and Petzold (1998) and Shampine et al. (2003).
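As a minimal illustration of the finite difference approach (a sketch, not a production method: the model problem \(y''+y=0\ ,\) \(y(0)=0\ ,\) \(y(\pi/2)=1\) with exact solution \(\sin x\) is an assumption chosen so the error can be checked, and a dense matrix is used where real software exploits the banded structure):

```python
import numpy as np

# Model linear BVP: y'' + y = 0 on [0, pi/2], y(0) = 0, y(pi/2) = 1,
# with exact solution y = sin(x).
a, b, N = 0.0, np.pi / 2, 50
x = np.linspace(a, b, N + 1)
h = x[1] - x[0]

# One equation per mesh point: boundary conditions at the ends and the
# central-difference approximation of y'' + y = 0 at interior points.
A = np.zeros((N + 1, N + 1))
rhs = np.zeros(N + 1)
A[0, 0] = 1.0              # y(a) = 0
A[N, N] = 1.0              # y(b) = 1
rhs[N] = 1.0
for i in range(1, N):
    A[i, i - 1] = 1.0 / h**2
    A[i, i] = -2.0 / h**2 + 1.0
    A[i, i + 1] = 1.0 / h**2

y = np.linalg.solve(A, rhs)
print(np.max(np.abs(y - np.sin(x))))   # error is O(h^2)
```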
A second global approach is to approximate the solution by an element of a linear space of functions, usually defined piecewise on a mesh, and to collocate this approximate solution. (In collocation we substitute the approximate solution into the system of ODEs and require the ODE system to be satisfied exactly at each collocation point. The number of collocation points plus the number of boundary conditions must equal the number of unknown coefficients in the approximate solution; that is, they must equal the dimension of the linear space.) The most common choice of approximating space is a space of splines. For a given linear space, the collocation points must be placed judiciously to achieve optimal accuracy. The error is again controlled by adjusting the mesh spacing using local error estimates involving approximate solutions of varying orders of accuracy; see Ascher et al. (1995), Ascher and Petzold (1998) and Mattheij and Molenaar (2002).
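In practice one reaches for an existing solver. For instance, SciPy's solve_bvp implements a collocation method with an adaptive mesh and residual control; the sketch below applies it to the truncated Blasius problem, where the coarse initial mesh and crude initial guess are assumptions the solver then refines:

```python
import numpy as np
from scipy.integrate import solve_bvp

L = 10.0

def fun(x, u):
    # u[0] = f, u[1] = f', u[2] = f''; the Blasius equation (5).
    return np.vstack([u[1], u[2], -0.5 * u[0] * u[2]])

def bc(ua, ub):
    # f(0) = 0, f'(0) = 0 and the truncated condition f'(L) = 1.
    return np.array([ua[0], ua[1], ub[1] - 1.0])

x = np.linspace(0.0, L, 11)
u0 = np.vstack([x**2 / (2 * L),         # crude initial guess with
                x / L,                  # f' ramping linearly to 1
                np.full_like(x, 1 / L)])
sol = solve_bvp(fun, bc, x, u0)
print(sol.status, sol.y[2, 0])   # status 0 on success; f''(0) near 0.332
```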
Choosing a spline basis for collocation (or more or less equivalently using certain types of Runge-Kutta formulas on the mesh) leads to a nonlinear system which must be solved iteratively. At each iteration we must solve a structured linear system of equations. When the boundary conditions are separated, the system is almost block diagonal. Similarly structured systems arise from finite difference approximations and also from multiple shooting techniques. Because of the great practical importance of this type of linear algebra problem, significant effort has been devoted to developing stable algorithms which minimize storage and maximize efficiency; see Amodio et al. (2000). The case of nonseparated boundary conditions leads to a similarly structured system whose solution poses potentially greater stability difficulties.
Another type of BVP that arises in the analytical solution of certain linear partial differential equations is the Sturm–Liouville eigenproblem. In its simplest form this is a scalar self-adjoint linear second order ODE BVP \[\tag{8} -(p(x)y^\prime(x))^\prime+q(x)y(x)=\lambda r(x)y(x),\quad x\in(a,b), \quad y(a)=y(b)=0.\]
Here, the parameter \(\lambda\ ,\) an eigenvalue, is to be determined such that the BVP (8) has a nontrivial (not identically zero) solution. There are broad analogies here with the generalized algebraic eigenproblem \(Ax=\lambda Bx\) where, depending on the properties of the matrices \(A\) and \(B\ ,\) various distributions of the finite number of eigenvalues \(\lambda\) are possible. For the BVP (8), in simple cases there are a countable number of eigenvalues, each with a corresponding solution \(y(x)\) (an eigenfunction). So, for example, as shown in Zettl (2005), if \(p(x), q(x)\) and \(r(x)\) are sufficiently smooth and \(p(x),\, r(x)>0\) on \([a,b]\ ,\) then the eigenvalues are real and distinct, and may be ordered \(\lambda_0<\lambda_1<\lambda_2<\ldots\) defining a discrete spectrum (when additionally \(q(x)\ge 0\ ,\) all the eigenvalues are positive). The eigenfunction \(y_n(x)\) corresponding to \(\lambda_n\) has \(n\) zeros in \((a,b)\) and the set of eigenfunctions \(\{y_i(x)\}_{i=0}^\infty\) is linearly independent. If we relax the smoothness conditions on the coefficients \(p,q\) and \(r\ ,\) and/or permit these functions to take on a wider range of values, many different phenomena are observed, from doubling of the eigenvalues to the occurrence of continuous spectra; see Zettl (2005) for details.
ODE eigenvalue problems can be solved using a general-purpose shooting code that treats the eigenvalue as an unknown parameter. However, with such a code one can only hope to compute an eigenvalue close to a guess. Specialized codes are much more efficient and make it possible to be sure of computing a specific eigenvalue; see Pryce (1993) for a survey. Numerical methods for Sturm–Liouville eigenproblems that have been implemented in software include finite difference and finite element discretizations, each of which leads to a generalized algebraic eigenproblem from which approximations to a number of the lower eigenvalues are available simultaneously. Other methods, popularized by Pruess, approximate the ODE eigenproblem by another in which the coefficients \(p, q\) and \(r\) are replaced by piecewise constants; this results in a set of problems which may each be solved analytically, again producing approximations to a number of the lower eigenvalues. Finally, shooting methods are usually implemented using a scaled Prüfer transformation, \(pu^\prime=\sqrt{S}\,\rho\cos(\theta), \,\, u=\frac{\rho\sin(\theta)}{\sqrt{S}}\ ,\) where \(S\) is a scaling function; see Pryce (1993). Taking \(S=1\) gives the standard Prüfer transformation. The transformation leads to a pair of nonlinear ODEs for \(\rho\) and \(\theta\) where the ODE for \(\theta\) does not depend on \(\rho\) and so may be solved alone. More importantly for computation, the boundary conditions in problem (8) are replaced by \(\theta(a,\lambda_k)=0,\,\, \theta(b,\lambda_k)=(k+1)\pi\ ,\) which provide the basis for a shooting method in which each eigenvalue may be determined by solving a single nonlinear algebraic equation.
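To illustrate the finite difference route in the simplest possible setting (an assumption for demonstration: \(p=r=1\) and \(q=0\) on \([0,\pi]\ ,\) so the exact eigenvalues are \((k+1)^2\)), the sketch below assembles the standard tridiagonal discretization of \(-y''=\lambda y\) with \(y(0)=y(\pi)=0\ :\)

```python
import numpy as np

# -y'' = lambda*y on [0, pi], y(0) = y(pi) = 0: exact eigenvalues
# are (k+1)^2 for k = 0, 1, 2, ...
N = 200                     # number of mesh intervals
h = np.pi / N
# Central differences at the N-1 interior points give a symmetric
# tridiagonal matrix whose eigenvalues approximate the lowest lambdas.
main = (2.0 / h**2) * np.ones(N - 1)
off = (-1.0 / h**2) * np.ones(N - 2)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
lam = np.linalg.eigvalsh(A)
print(lam[:4])   # close to 1, 4, 9, 16
```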
References
H.B. Keller, Numerical Methods for Two-Point Boundary-Value Problems, Dover, New York, NY, 1992.