Bayesian numerical analysis

The comparison of algorithms and the analysis of numerical problems in a Bayesian setting, cf. also Bayesian approach.

A numerical algorithm is usually developed and applied for inputs sharing some common properties. In a theoretical comparison of algorithms one either selects a class $P$ of inputs and defines the cost and the error of an algorithm in a worst-case sense over the class $P$. Alternatively, in Bayesian numerical analysis, one puts an a priori distribution $\mu$ on the inputs and defines the cost and the error of an algorithm in an average-case sense, i.e., via expectations with respect to $\mu$. For instance, if $\epsilon ( p , m )$ denotes the error of a method $m$, applied to an input $p$, then

\begin{equation*} \mathcal{E} _ { \operatorname{wor} } ( P , m ) = \operatorname { sup } _ { p \in P } | \epsilon ( p , m ) | \end{equation*}

is called the worst-case (maximal) error of $m$ on $P$ while

\begin{equation*} \mathcal{E} _ { \text{avg} } ( \mu , m ) = \int | \epsilon ( p , m ) | d \mu ( p ) \end{equation*}

is called the average error of $m$ with respect to $\mu$. The notions of optimality of algorithms and complexity of numerical problems are defined analogously in the worst-case setting and the Bayesian setting, see Optimization of computational algorithms and Information-based complexity.

Properties of the inputs, e.g., smoothness of integrands for numerical integration, are expressed by being a member of $P$ or being distributed according to $\mu$. For problems with inputs from finite-dimensional spaces, the Lebesgue measure is often used in the definition of $\mu$, see [a1] for linear programming and [a7] for solving systems of polynomial equations. For problems with inputs from infinite-dimensional spaces, often a Gaussian measure $\mu$ is used. If the inputs are real-valued functions on some domain, imposing an a priori distribution is essentially equivalent to treating inputs as random functions (cf. also Random function).

The Bayesian approach leads to new algorithms and to new insight into numerical problems. A few examples of this are as follows.

Multivariate integration.[edit]

Under very mild assumptions on the measure $\mu$, the cost of optimal methods does not depend exponentially on the number of variables, see [a8]. On the other hand, often a curse of dimension occurs in the worst-case setting.

Linear programming.[edit]

Under natural symmetry properties of $\mu$, the average cost of the simplex method is polynomial, while the worst-case (maximal) cost is exponential, see [a1].

Global optimization.[edit]

Suitable adaptive (active) methods are superior to all non-adaptive (passive) ones in a Bayesian setting, see [a2]. In contrast, there is no superiority of adaptive methods in the worst-case setting for any convex and symmetric class $P$. See also Optimization of computational algorithms; Adaptive quadrature; see [a6] for similar results.

Zero finding.[edit]

Bisection is worst-case optimal for many classes $P$ of (smooth) functions. Thus, the worst-case -complexity is of order $\operatorname { log } ( 1 / \epsilon )$. For measures that are derived from the Wiener measure and concentrated on smooth functions, the average-case -complexity is of order $\operatorname { log } \operatorname { log } ( 1 / \epsilon )$. The upper bound is achieved by a hybrid secant-bisection method, see [a5].

For references, see also Information-based complexity.

References[edit]

[a1]	K.H. Borgwardt, "The simplex method: A probabilistic analysis" , Springer (1987) Zbl 0604.90092
[a2]	J.M. Calvin, "Average performance of a class of adaptive algorithms for global optimization" Ann. Appl. Probab. , 7 (1997) pp. 711–730
[a3]	P. Diaconis, "Bayesian numerical analysis" S.S. Gupta (ed.) J.O. Berger (ed.) , Statistical Decision Theory and Related Topics IV , 1 , Springer (1988) pp. 163–175
[a4]	J.B. Kadane, G.W. Wasilkowski, "Average case -complexity in computer science: a Bayesian view" J.M. Bernardo (ed.) , Bayesian Statistics , North-Holland (1988) pp. 361–374
[a5]	E. Novak, K. Ritter, H. Woźniakowski, "Average case optimality of a hybrid secant-bisection method" Math. Comp. , 64 (1995) pp. 1517–1539
[a6]	K. Ritter, "Average case analysis of numerical problems" , Lecture Notes Math. , Springer (2000)
[a7]	M. Shub, S. Smale, "Complexity of Bezout's theorem V: polynomial time" Theoret. Comput. Sci. , 133 (1994) pp. 141–164
[a8]	G.W. Wasilkowski, "Average case complexity of multivariate integration and function approximation: an overview" J. Complexity , 12 (1996) pp. 257–272