Statistical problems in the theory of stochastic processes

From Encyclopedia of Mathematics


A branch of mathematical statistics devoted to statistical inference from observations represented as a random process. In the most common formulation, the values of a random function $ x( t) $, $ t \in T $, are observed, and on the basis of these observations statistical inferences must be made about certain characteristics of $ x( t) $. Such a broad definition formally includes the whole of the classical statistics of independent observations. In practice, however, the statistics of random processes is taken to mean only the statistics of dependent observations, excluding the statistical analysis of a large number of independent realizations of a random process. The foundations of the statistical theory, the basic formulations of problems (cf. Statistical estimation; Statistical hypotheses, verification of) and the basic concepts (sufficiency, unbiasedness, consistency, etc.) are the same as in the classical theory. However, when solving concrete problems, substantial difficulties and phenomena of a new type frequently arise. These are caused partly by the dependence and more complex structure of the process under consideration, and partly, in the case of observations in continuous time, by the need to examine distributions in infinite-dimensional spaces.

When solving statistical problems in the theory of random processes, the structure of the process under consideration plays an essential role; accordingly, following the classification of random processes, one studies statistical problems of Gaussian, Markov, stationary, branching, diffusion and other processes. Of these, the most fully developed is the statistical theory of stationary processes (time-series analysis).

The need for a statistical analysis of random processes arose in the 19th century in the form of the analysis of meteorological and economic series and of studies on cyclic processes (price fluctuations, sun spots). In modern times, the range of problems covered by the statistical analysis of random processes has become extraordinarily wide. Examples include the statistical analysis of random noise, vibrations, turbulence, wave motion in a sea, cardiograms and encephalograms. The theory of extracting a signal from a background of noise can to a significant degree be seen as a statistical problem in the theory of random processes.

In what follows it is assumed that a segment $ x( t) $, $ 0 \leq t \leq T $, of the random process $ x( t) $ is observed, where the parameter $ t $ runs either through the whole interval $ [ 0, T] $ or through the integers in this interval. In statistical problems, the distribution $ P ^ {T} $ of the process $ \{ {x( t) } : {0 \leq t \leq T } \} $ is usually known only to belong to some family of distributions $ \{ P ^ {T} \} $. This family can always be written in parametric form.

Example 1.

The process $ x( t) $ is either the sum of a non-random function $ s( t) $ (a "signal") and a random function $ \xi ( t) $ (the "noise"), or coincides with the noise $ \xi ( t) $ alone. The hypothesis $ H _ {0} $: $ x( t) = s( t) + \xi ( t) $ must be tested against the alternative $ H _ {1} $: $ x( t) = \xi ( t) $ (the problem of detecting a signal in noise). This is an example of testing a statistical hypothesis.

Example 2.

The process $ x( t) = s( t) + \xi ( t) $, where $ s( t) $ is an unknown non-random function (the signal), while $ \xi ( t) $ is a random process (the noise). The function $ s $, or its value $ s( t _ {0} ) $ at a given point $ t _ {0} $, has to be estimated. Similarly, one can consider $ x( t) = s( t; \theta ) + \xi ( t) $, where $ s $ is a known function depending on an unknown parameter $ \theta $, which must be estimated from the observation of $ x( t) $ (the problem of extracting a signal from a background of noise). These are examples of estimation problems.
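For illustration, a least-squares sketch of the amplitude version of this problem, $ x( t) = \theta s( t) + \xi ( t) $ with $ s $ known and $ \theta $ unknown; the sinusoidal signal shape, the noise level and the sampling grid below are hypothetical choices, not part of the example itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: x(t) = theta * s(t) + xi(t) on a discrete grid,
# with s a known signal shape and xi white Gaussian noise.
t = np.linspace(0.0, 1.0, 1000)
s = np.sin(2 * np.pi * 5 * t)                      # assumed signal shape
theta_true = 2.5
x = theta_true * s + rng.normal(0.0, 1.0, t.size)

# Least-squares estimate of theta: minimize sum((x - theta*s)**2),
# which gives theta_hat = <x, s> / <s, s>.
theta_hat = float(np.dot(x, s) / np.dot(s, s))
print(theta_hat)
```

With 1000 samples the standard deviation of this estimator is about $ 1/\sqrt{\langle s, s\rangle} \approx 0.045 $, so the estimate lands close to the true amplitude.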

The likelihood ratio for random processes.

In statistical problems, likelihood ratios and likelihood functions play an important role (see Neyman–Pearson lemma; Statistical hypotheses, verification of; Statistical estimation). The likelihood ratio of two distributions $ P _ {u} ^ {T} $ and $ P _ {v} ^ {T} $ is the density

$$ p( x( \cdot ); u , v) = p( x( \cdot )) = \ \frac{dP _ {u} ^ {T} }{dP _ {v} ^ {T} } ( x( \cdot )) . $$

The likelihood function is the function

$$ L( \theta ) = \frac{dP _ \theta ^ {T} }{d \mu } ( x( \cdot )), $$

where $ \mu $ is a $ \sigma $-finite measure relative to which all measures $ P _ \theta ^ {T} $ are absolutely continuous. In the discrete case, where $ t $ runs through the integers of $ [ 0, T] $ and $ T < \infty $, the likelihood ratio always exists if the distributions $ P _ {u} $ and $ P _ {v} $ have positive densities, and it coincides with the ratio of these two densities.

If $ t $ runs through the entire interval $ [ 0, T] $, then cases may arise in which the measures $ P _ {u} ^ {T} $ and $ P _ {v} ^ {T} $ are not absolutely continuous with respect to each other; moreover, situations can arise in which $ P _ {u} ^ {T} $ and $ P _ {v} ^ {T} $ are mutually singular, i.e. where for a set $ A $ in the space of realizations of $ x( t) $,

$$ P _ {u} ^ {T} \{ x \in A \} = 0,\ \ P _ {v} ^ {T} \{ x \in A \} = 1. $$

In this case $ p( x; u , v) $ does not exist. The singularity of the measures $ P _ \theta ^ {T} $ leads to important and somewhat paradoxical statistical results, allowing for error-free inferences concerning the parameter $ \theta $. For example, let $ \Theta = \{ 0, 1 \} $; the singularity of the measures $ P _ {0} ^ {T} $ and $ P _ {1} ^ {T} $ means that, using the test "accept $ H _ {0} $ if $ x \notin A $, reject $ H _ {0} $ if $ x \in A $", the hypotheses $ H _ {0} $: $ \theta = 0 $ and $ H _ {1} $: $ \theta = 1 $ are distinguished without error. The presence of such perfect tests often indicates that the statistical problem is not posed entirely satisfactorily and that certain essential random disturbances have been excluded from it.

Example 3.

Let $ x( t) = \theta + \xi ( t) $, where $ \xi ( t) $ is a stationary ergodic process with zero average and $ \theta $ is a real parameter. Let the realizations of $ \xi ( t) $ with probability 1 be analytic in a strip containing the real axis. According to the ergodic theorem,

$$ \lim\limits _ {T \rightarrow \infty } \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt = \theta , $$

and all measures $ P _ \theta ^ \infty $ are mutually singular. Since an analytic function $ x( t) $ is completely determined by its values in a neighbourhood of zero, the parameter $ \theta $ can be estimated without error from the observations $ \{ {x( t) } : {0 \leq t \leq T } \} $ for any $ T > 0 $.
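The ergodic-average part of this example is easy to check numerically. In the sketch below a stationary AR(1) sequence stands in for the continuous-time noise $ \xi ( t) $ (an assumption made for simulation; the analyticity phenomenon itself cannot be reproduced on a finite grid), and the time average approaches $ \theta $ as $ T $ grows:

```python
import numpy as np

rng = np.random.default_rng(1)

def ar1_noise(n, phi=0.8):
    """Stationary zero-mean AR(1) sequence, a stand-in for xi(t)."""
    xi = np.empty(n)
    xi[0] = rng.normal(0.0, 1.0 / np.sqrt(1.0 - phi**2))  # stationary start
    for k in range(1, n):
        xi[k] = phi * xi[k - 1] + rng.normal()
    return xi

theta = 1.7
# Time average of x = theta + xi over horizons of increasing length T.
estimates = {T: theta + ar1_noise(T).mean() for T in (100, 10000)}
print(estimates)
```

By the ergodic theorem the estimate for the longer horizon is much closer to $ \theta = 1.7 $; its standard deviation here is roughly $ 0.05 $.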

The calculation of the likelihood ratio in those cases where it exists is a difficult problem. Calculations are often based on the limit relation

$$ p( x( \cdot ); u , v) = \ \lim\limits _ {n \rightarrow \infty } \ \frac{p _ {u} ( x( t _ {1} ) \dots x( t _ {n} )) }{p _ {v} ( x( t _ {1} ) \dots x( t _ {n} )) } , $$

where $ p _ {u} , p _ {v} $ are the densities of the vector $ ( x( t _ {1} ) \dots x( t _ {n} )) $, while $ \{ t _ {1} , t _ {2} ,\dots \} $ is a dense set in $ [ 0, T] $. Study of the right-hand side of this equality is also useful in investigating the possible singularity of $ P _ {u} $ and $ P _ {v} $.

Example 4.

Suppose one has either the observation $ x( t) = w( t) $, where $ w( t) $ is a Wiener process (hypothesis $ H _ {0} $), or $ x( t) = m( t) + w( t) $, where $ m $ is a non-random function (hypothesis $ H _ {1} $). The measures $ P _ {0} , P _ {1} $ are mutually absolutely continuous if $ m ^ \prime \in L _ {2} ( 0, T) $, and mutually singular if $ m ^ \prime \notin L _ {2} ( 0, T) $. The likelihood ratio equals

$$ \frac{dP _ {1} ^ {T} }{dP _ {0} ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ - \frac{1}{2} \int\limits _ { 0 } ^ { T } [ m ^ \prime ( t)] ^ {2} dt + \int\limits _ { 0 } ^ { T } m ^ \prime ( t) dx( t) \right \} . $$
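This formula can be checked by discretization. In the sketch below the drift $ m( t) = t ^ {2} $ (so that $ m ^ \prime ( t) = 2t \in L _ {2} ( 0, 1) $) is a hypothetical choice; under $ H _ {0} $ the average log-likelihood ratio should be near $ - \frac{1}{2} \int _ {0} ^ {1} [ m ^ \prime ( t)] ^ {2} dt = - 2/3 $:

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte-Carlo check of the likelihood ratio for x = m + w versus x = w,
# with the hypothetical drift m(t) = t**2 on [0, 1].
T, n, reps = 1.0, 1000, 2000
dt = T / n
t = np.arange(n) * dt
m_prime = 2.0 * t                                  # m'(t), in L_2(0, T)
half_energy = 0.5 * np.sum(m_prime**2) * dt        # (1/2) int m'^2 dt

# Under H_0 the observation increments are pure Wiener increments.
dW = rng.normal(0.0, np.sqrt(dt), size=(reps, n))
log_lr_h0 = -half_energy + dW @ m_prime            # log of dP_1/dP_0
mean_h0 = float(log_lr_h0.mean())
print(mean_h0)   # should be near -2/3
```

The stochastic integral $ \int m ^ \prime dx $ is approximated by the sum of $ m ^ \prime ( t _ {k} ) $ times the observation increments.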

Example 5.

Let $ x( t) = \theta + \xi ( t) $, where $ \theta $ is a real parameter and $ \xi ( t) $ is a stationary Gaussian Markov process with mean zero and known correlation function $ r( t) = e ^ {- \alpha | t | } $, $ \alpha > 0 $. The measures $ P _ \theta ^ {T} $ are mutually absolutely continuous with likelihood function

$$ \frac{dP _ \theta ^ {T} }{dP _ {0} ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ \frac{1}{2} \theta x( 0) + \frac{1}{2} \theta x( T) + \frac{1}{2} \theta \alpha \int\limits _ { 0 } ^ { T } x( t) dt - \frac{1}{2} \theta ^ {2} - \frac{1}{4} \theta ^ {2} \alpha T \right \} . $$

In particular, $ x( 0) + x( T) + \alpha \int _ {0} ^ {T} x( t) dt $ is a sufficient statistic for the family $ P _ \theta ^ {T} $.

Linear problems in the statistics of random processes.

Let the function

$$ \tag{* } x( t) = \sum _ { 1 } ^ { k } \theta _ {j} \phi _ {j} ( t) + \xi ( t) $$

be observed, where $ \xi ( t) $ is a random process with mean zero and known correlation function $ r( t, s) $, the $ \phi _ {j} $ are known non-random functions, and $ \theta = ( \theta _ {1} \dots \theta _ {k} ) $ is an unknown parameter (the $ \theta _ {j} $ are regression coefficients); the parameter set $ \Theta $ is a subset of $ \mathbf R ^ {k} $. Linear estimators for $ \theta _ {j} $ are estimators of the form $ \sum c _ {j} x( t _ {j} ) $, or limits of such sums in the mean square. The problem of finding optimal (minimum mean-square error) unbiased linear estimators reduces to the solution of linear algebraic or linear integral equations in $ r $. Indeed, an optimal estimator $ \widehat \theta $ is defined by the equations $ {\mathsf E} _ \theta ( \widehat \theta _ {j} \xi ) = 0 $ for any $ \xi $ of the form $ \xi = \sum b _ {j} x( t _ {j} ) $ with $ \sum b _ {j} \phi _ {l} ( t _ {j} ) = 0 $. In a number of cases, estimators of $ \theta $ obtained by the method of least squares are, as $ T \rightarrow \infty $, asymptotically no worse than the optimal linear estimators. Least-squares estimators are simpler to compute and do not depend on $ r $.

Example 6.

Under the conditions of Example 5, let $ k= 1 $, $ \phi _ {1} ( t) \equiv 1 $. The optimal unbiased linear estimator takes the form

$$ \widehat \theta = \frac{1}{2 + \alpha T } \left ( x( 0) + x( T) + \alpha \int\limits _ { 0 } ^ { T } x( t) dt \right ) . $$

The estimator

$$ \theta ^ \star = \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt $$

has asymptotically the same variance.
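A Monte-Carlo sketch of this comparison (the settings $ \alpha = 1 $, $ T = 10 $ and the discretization are illustrative assumptions): the mean-squared error of $ \widehat \theta $ should be near the theoretical value $ 2/( 2 + \alpha T) $, and that of $ \theta ^ \star $ near $ 2/( \alpha T) $ for large $ T $ (about $ 0.18 $ exactly for these settings).

```python
import numpy as np

rng = np.random.default_rng(3)

# Monte-Carlo comparison for x(t) = theta + xi(t), where xi is a
# stationary Ornstein-Uhlenbeck process with r(t) = exp(-alpha*|t|),
# simulated exactly on a grid via its AR(1) transition.
alpha, T, n, reps, theta = 1.0, 10.0, 1000, 3000, 0.0
dt = T / n
rho = np.exp(-alpha * dt)

xi = np.empty((reps, n + 1))
xi[:, 0] = rng.normal(0.0, 1.0, reps)              # stationary start
for k in range(1, n + 1):
    xi[:, k] = rho * xi[:, k - 1] + np.sqrt(1.0 - rho**2) * rng.normal(size=reps)
x = theta + xi

# Trapezoidal approximation of int_0^T x(t) dt for each replication.
integral = dt * (x.sum(axis=1) - 0.5 * x[:, 0] - 0.5 * x[:, -1])
theta_hat = (x[:, 0] + x[:, -1] + alpha * integral) / (2.0 + alpha * T)
theta_star = integral / T

mse_hat = float(np.mean((theta_hat - theta) ** 2))    # theory: 2/(2+alpha*T) = 1/6
mse_star = float(np.mean((theta_star - theta) ** 2))  # theory: about 0.18
print(mse_hat, mse_star)
```

The two errors are already close for $ \alpha T = 10 $, illustrating the remark that the simple time average is asymptotically not much worse than the optimal linear estimator.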

Statistical problems of Gaussian processes.

Let $ \{ {x( t) } : {0 \leq t \leq T, P _ \theta ^ {T} } \} $ be a Gaussian process for all $ \theta \in \Theta $. For Gaussian processes the following alternative holds: any two measures $ P _ {u} ^ {T} , P _ {v} ^ {T} $ are either mutually absolutely continuous or mutually singular. Since the Gaussian distribution $ P _ \theta ^ {T} $ is completely determined by the mean value $ m _ \theta ( t) = {\mathsf E} _ \theta x( t) $ and the correlation function $ r _ \theta ( s, t) = {\mathsf E} _ \theta x( s) x( t) $, the likelihood ratio $ dP _ {u} ^ {T} /dP _ {v} ^ {T} $ is expressed in terms of $ m _ {u} $, $ m _ {v} $, $ r _ {u} $, $ r _ {v} $, in general in a complicated way. The case where $ r _ {u} = r _ {v} = r $ with $ r $ a continuous function is relatively simple. Let $ \Theta = \{ 0, 1 \} $, $ r _ {0} = r _ {1} = r $; let $ \lambda _ {j} $ and $ \phi _ {j} ( t) $ be the eigenvalues and the corresponding normalized eigenfunctions in $ L _ {2} ( 0, T) $ of the integral equation

$$ \lambda \phi ( s) = \int\limits _ { 0 } ^ { T } r ( s, t) \phi ( t) dt; $$

let the means $ m _ {0} ( t) $ and $ m _ {1} ( t) $ be continuous functions; and let

$$ m _ {ij} = \int\limits _ { 0 } ^ { T } m _ {i} ( t) \phi _ {j} ( t) dt. $$

The measures $ P _ {0} , P _ {1} $ are mutually absolutely continuous if and only if

$$ \sum _ {j = 1} ^ \infty ( m _ {0j} - m _ {1j} ) ^ {2} \lambda _ {j} ^ {-1} < \infty . $$

Here,

$$ \frac{dP _ {1} ^ {T} }{dP _ {0} ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ \sum _ {j = 1} ^ \infty \frac{m _ {1j} - m _ {0j} }{\lambda _ {j} } \left ( \int\limits _ { 0 } ^ { T } x( t) \phi _ {j} ( t) dt - \frac{m _ {1j} + m _ {0j} }{2} \right ) \right \} . $$

This equality can be used to devise a test for the hypothesis $ H _ {0} $: $ m = m _ {0} $ against the alternative $ H _ {1} $: $ m = m _ {1} $ under the assumption that the function $ r $ is known to the observer.

Statistical problems of stationary processes.

Let the observation $ x( t) $ be a stationary process with mean $ m $ and correlation function $ r( t) $; let $ f( \lambda ) $ and $ F( \lambda ) $ be its spectral density and spectral function, respectively. The basic problems of the statistics of stationary processes relate to hypotheses testing and to estimating the characteristics $ m $, $ r $, $ f $, $ F $. In the case of an ergodic process $ x( t) $, consistent estimators (when $ T \rightarrow \infty $) for $ m $ and $ r( t) $, respectively, are provided by

$$ m ^ \star = \frac{1}{T} \int\limits _ { 0 } ^ { T } x( t) dt, $$

$$ r ^ \star ( t) = \frac{1}{T} \int\limits _ { 0 } ^ { T-t } x( t+ s) x( s) ds. $$

The problem of estimating $ m $ when $ r $ is known is often treated as a linear problem. This group of problems also includes the more general problems of estimating regression coefficients through observations of the form (*) with stationary $ \xi ( t) $.
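A discrete-time sketch of the estimators $ m ^ \star $ and $ r ^ \star $ (the AR(1) model below, with correlation function $ \phi ^ {| t | } /( 1 - \phi ^ {2} ) $, is an assumed stand-in for a stationary ergodic process):

```python
import numpy as np

rng = np.random.default_rng(5)

# Empirical mean and correlation function of a stationary AR(1)
# sequence x(k) = phi*x(k-1) + eps(k); true mean 0 and
# true correlation r(t) = phi**t / (1 - phi**2).
phi, T = 0.5, 200000
eps = rng.normal(size=T)
x = np.empty(T)
x[0] = eps[0] / np.sqrt(1.0 - phi**2)              # stationary start
for k in range(1, T):
    x[k] = phi * x[k - 1] + eps[k]

m_star = float(x.mean())

def r_star(t):
    """Empirical correlation at lag t (compare with phi**t/(1-phi**2))."""
    return float(np.dot(x[t:], x[:T - t]) / T)

print(m_star, r_star(0), r_star(1))
```

For these settings the true values are $ r( 0) = 4/3 $ and $ r( 1) = 2/3 $, and both empirical quantities are close to them, illustrating consistency as $ T \rightarrow \infty $.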

Let $ x( t) $ have zero mean and spectral density $ f( \lambda ; \theta ) $ depending on a finite-dimensional parameter $ \theta \in \Theta $. If the process $ x( t) $ is Gaussian, formulas can be derived for the likelihood ratio $ dP _ \theta /dP _ {\theta ^ {0} } $ (if the ratio exists), which in a number of cases make it possible to find maximum-likelihood estimators or "good" approximations of them (for large $ T $). Under sufficiently broad assumptions these estimators are asymptotically normal with parameters $ ( \theta , c( \theta )/ \sqrt T ) $ and asymptotically efficient.

Example 7.

Let $ x( t) $ be a stationary Gaussian process in continuous time with rational spectral density $ f( \lambda ) = | Q( \lambda )/P( \lambda ) | ^ {2} $, where $ P $ and $ Q $ are polynomials. The measures $ P _ {0} ^ {T} , P _ {1} ^ {T} $ corresponding to the rational spectral densities $ f _ {0} , f _ {1} $ are absolutely continuous if and only if

$$ \lim\limits _ {\lambda \rightarrow \infty } \ \frac{f _ {0} ( \lambda ) }{f _ {1} ( \lambda ) } = 1. $$

Here the parameter $ \theta $ is the set of all coefficients of the polynomials $ P, Q $.

Example 8.

An important class of stationary Gaussian processes consists of the auto-regressive processes (cf. Auto-regressive process) $ x( t) $:

$$ x ^ {( n)} ( t) + \theta _ {n} x ^ {( n- 1)} ( t) + \dots + \theta _ {1} x( t) = \xi ( t), $$

where $ \xi ( t) $ is a Gaussian white noise of unit intensity and $ \theta = ( \theta _ {1} \dots \theta _ {n} ) $ is an unknown parameter. In this case the spectral density is

$$ f( \lambda ; \theta ) = ( 2 \pi ) ^ {-1} | P( i \lambda ) | ^ {-2} , $$

where

$$ P( z) = \theta _ {1} + \theta _ {2} z + \dots + \theta _ {n} z ^ {n-1} + z ^ {n} . $$

The likelihood function is

$$ \frac{dP _ \theta ^ {T} }{dP _ {\theta ^ {0} } ^ {T} } = \ \sqrt { \frac{K( \theta ) }{K( \theta ^ {0} ) } } \mathop{\rm exp} \left \{ \frac{( \theta _ {n} - \theta _ {n} ^ {0} ) }{2} T - \frac{1}{2} \sum _ {j = 0} ^ { n-1 } [ \lambda _ {j} ( \theta ) - \lambda _ {j} ( \theta ^ {0} )] \int\limits _ { 0 } ^ { T } [ x ^ {( j)} ( t)] ^ {2} dt - \frac{1}{2} ( \lambda ( \theta ) - \lambda ( \theta ^ {0} )) \right \} . $$

Here, $ \lambda _ {j} ( \theta ) $ and $ \lambda ( \theta ) $ are quadratic forms in $ \theta $, depending on the values $ x ^ {( j)} ( t) $, $ j = 0 \dots n- 1 $, at the points $ t = 0, T $, and $ K( \theta ) $ is the determinant of the correlation matrix of the vector $ ( x( 0) \dots x ^ {( n- 1)} ( 0)) $.

Maximum-likelihood estimators for the auto-regression parameter $ \theta $ are asymptotically normal and asymptotically efficient. These properties are shared by the solution $ \theta _ {T} ^ \star $ of the approximate likelihood equation

$$ \frac{1}{2T} \sum _ {j = 0} ^ { n-1 } \frac{d \lambda _ {j} ( \theta ) }{d \theta _ {i} } \int\limits _ { 0 } ^ { T } [ x ^ {( j)} ( t)] ^ {2} dt = \ \left \{ \begin{array}{ll} 0, & 1 \leq i \leq n- 1, \\ \frac{1}{2} , & i= n. \\ \end{array} \right .$$

An important role in statistical studies on the spectrum of a stationary process is played by the periodogram $ I _ {T} ( \lambda ) $. This statistic is defined as

$$ I _ {T} ( \lambda ) = \ \frac{1}{2 \pi T } \left | \sum _ {t = 0} ^ { T } e ^ {- it \lambda } x( t) \right | ^ {2} \ \textrm{ (discrete time) } , $$

$$ I _ {T} ( \lambda ) = \frac{1}{2 \pi T } \left | \int\limits _ { 0 } ^ { T } e ^ {- it \lambda } x( t) dt \right | ^ {2} \ \textrm{ (continuous time) } . $$

The periodogram is widely used in constructing estimators of various kinds for $ f( \lambda ) $ and $ F( \lambda ) $, and tests of hypotheses about these characteristics. Under broad assumptions, the statistics $ \int I _ {T} ( \lambda ) \phi ( \lambda ) d \lambda $ are consistent estimators for $ \int f( \lambda ) \phi ( \lambda ) d \lambda $. In particular, $ \int _ \alpha ^ \beta I _ {T} ( \lambda ) d \lambda $ may serve as an estimator for $ F( \beta ) - F( \alpha ) $. If the sequence $ \phi _ {T} ( \lambda ; \lambda _ {0} ) $ converges in an appropriate way to the $ \delta $-function $ \delta ( \lambda - \lambda _ {0} ) $, then the integrals $ \int \phi _ {T} ( \lambda ; \lambda _ {0} ) I _ {T} ( \lambda ) d \lambda $ are consistent estimators for $ f( \lambda _ {0} ) $. Functions of the form $ a _ {T} \psi ( a _ {T} ( \lambda - \lambda _ {0} )) $, $ a _ {T} \rightarrow \infty $, are often used as the functions $ \phi _ {T} ( \lambda ; \lambda _ {0} ) $. If $ x( t) $ is a process in discrete time, these estimators can be written in the form

$$ \frac{1}{2 \pi } \sum _ {t = - T+ 1} ^ { T-1 } e ^ {- it \lambda } r ^ \star ( t) c _ {T} ( t), $$

where the empirical correlation function is

$$ r ^ \star ( t) = \frac{1}{T} \sum _ {u = 0} ^ { T-t } x( u+ t) x( u), $$

while the non-random coefficients $ c _ {T} ( t) $ are defined by the choice of $ \psi $ and $ a _ {T} $. This choice, in turn, depends on a priori information on $ f( \lambda ) $. A similar representation also holds for processes in continuous time.
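A discrete-time sketch of spectral estimation by smoothing the periodogram (the AR(1) data and the plain moving-average kernel are illustrative assumptions; the kernel corresponds to one particular choice of $ \psi $ and $ a _ {T} $):

```python
import numpy as np

rng = np.random.default_rng(6)

# Periodogram of an AR(1) sample and a kernel-smoothed estimate of its
# spectral density f(lambda) = (2*pi)**-1 / |1 - phi*exp(-i*lambda)|**2.
phi, T = 0.5, 8192
eps = rng.normal(size=T)
x = np.empty(T)
x[0] = eps[0] / np.sqrt(1.0 - phi**2)              # stationary start
for k in range(1, T):
    x[k] = phi * x[k - 1] + eps[k]

I_T = np.abs(np.fft.fft(x)) ** 2 / (2.0 * np.pi * T)   # periodogram

# Smooth over 2h+1 neighbouring Fourier frequencies (periodic wrap),
# a simple moving-average version of the kernel estimators in the text.
h = 64
kernel = np.ones(2 * h + 1) / (2 * h + 1)
f_hat = np.convolve(np.concatenate([I_T[-h:], I_T, I_T[:h]]), kernel, mode="valid")

lam0 = np.pi / 2.0
j0 = int(round(lam0 * T / (2.0 * np.pi)))
f_true = 1.0 / (2.0 * np.pi * abs(1.0 - phi * np.exp(-1j * lam0)) ** 2)
print(float(f_hat[j0]), f_true)
```

The raw periodogram is not a consistent estimator at a fixed frequency; averaging over a slowly growing band of neighbouring frequencies is what makes the smoothed estimate consistent.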

Problems in the statistical analysis of stationary processes sometimes also include problems of extrapolation, interpolation and filtration of stationary processes.

Statistical problems of Markov processes.

Let the observations $ X _ {0} \dots X _ {T} $ form a homogeneous Markov chain. Under sufficiently broad assumptions the likelihood function is

$$ \frac{dP _ \theta ^ {T} }{d \mu ^ {T} } = \ p _ {0} ( X _ {0} ; \theta ) p( X _ {1} | X _ {0} ; \theta ) \dots p( X _ {T} | X _ {T-1} ; \theta ), $$

where $ p _ {0} $, $ p $ are the initial and transition densities of the distribution. This expression is similar to the likelihood function for a sequence of independent observations, and under regularity conditions (smoothness in $ \theta \in \Theta \subset \mathbf R ^ {k} $) a theory of hypothesis testing and estimation can be constructed analogous to the corresponding theory for independent observations.

A more complex situation arises if $ x( t) $ is a Markov process in continuous time. Let $ x( t) $ be a homogeneous Markov process with a finite number of states $ N $ and differentiable transition probabilities $ P _ {ij} ( t) $. The transition probabilities are determined by the matrix of transition intensities $ Q = \| q _ {ij} \| $, where $ q _ {ij} = P _ {ij} ^ { \prime } ( 0) $ and $ q _ {i} = - q _ {ii} $. Let the initial state $ x( 0) = i _ {0} $ be non-random and independent of $ Q $. Choosing any matrix $ Q _ {0} = \| q _ {ij} ^ {0} \| $, one finds

$$ \frac{dP _ {Q} ^ {T} }{dP _ {Q _ {0} } ^ {T} } ( x) = \ \mathop{\rm exp} \{ ( q _ {j _ {n} } ^ {0} - q _ {j _ {n} } ) T \} \prod _ {\nu = 0 } ^ { n-1 } \frac{q _ {j _ \nu j _ {\nu + 1 } } }{q _ {j _ \nu j _ {\nu + 1 } } ^ {0} } \mathop{\rm exp} \{ t _ \nu ( q _ {j _ {n} } - q _ {j _ \nu } - q _ {j _ {n} } ^ {0} + q _ {j _ \nu } ^ {0} ) \} . $$

Here the statistics $ n( x) $, $ t _ \nu ( x) $, $ j _ \nu ( x) $ are defined as follows: $ n $ is the number of jumps of $ x( t) $ on the interval $ [ 0, T) $; $ \tau _ \nu $ is the moment of the $ \nu $-th jump; $ t _ \nu = \tau _ {\nu + 1 } - \tau _ \nu $; and $ j _ \nu = x( \tau _ \nu ) $. It follows that the maximum-likelihood estimators for the parameters $ q _ {ij} $ are $ q _ {ij} ^ \star = m _ {ij} / \mu _ {i} $, where $ m _ {ij} $ is the number of transitions from $ i $ to $ j $ on $ [ 0, T) $, while $ \mu _ {i} $ is the time spent by the process $ x( t) $ in the state $ i $.
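A simulation sketch of these estimators for a two-state chain (the intensities $ q _ {01} = 1 $, $ q _ {10} = 2 $ and the horizon are assumed values):

```python
import numpy as np

rng = np.random.default_rng(7)

# Two-state continuous-time Markov chain; recover the transition
# intensities via q*_ij = (number of i->j jumps) / (time spent in i).
q = np.array([1.0, 2.0])          # exit intensity q_i of each state
T_horizon = 5000.0
state, now = 0, 0.0
m = np.zeros((2, 2))              # m[i, j]: observed i -> j transitions
mu = np.zeros(2)                  # mu[i]: occupation time of state i
while True:
    hold = rng.exponential(1.0 / q[state])   # exponential holding time
    if now + hold > T_horizon:
        mu[state] += T_horizon - now
        break
    mu[state] += hold
    nxt = 1 - state               # only one other state to jump to
    m[state, nxt] += 1
    now += hold
    state = nxt

q01_hat = float(m[0, 1] / mu[0])
q10_hat = float(m[1, 0] / mu[1])
print(q01_hat, q10_hat)
```

With a few thousand observed transitions, the relative error of each estimator is of the order of a few percent.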

Example 9.

Let $ x( t) $ be a birth-and-death process with constant intensities of birth $ \lambda $ and death $ \mu $. This means that $ q _ {i,i+ 1} = i \lambda $, $ q _ {i,i- 1} = i \mu $, $ q _ {ii} = - i( \lambda + \mu ) $, and $ q _ {ij} = 0 $ if $ | i- j | > 1 $. In this example the number of states is infinite. Let $ x( 0) \equiv 1 $. The likelihood ratio is

$$ \frac{dP _ {\lambda \mu } ^ {T} }{dP _ {\lambda _ {0} , \mu _ {0} } ^ {T} } ( x) = $$

$$ = \ \left ( \frac \lambda {\lambda _ {0} } \right ) ^ {B} \left ( \frac \mu {\mu _ {0} } \right ) ^ {D} \mathop{\rm exp} \left \{ -( \lambda + \mu - \lambda _ {0} - \mu _ {0} ) \int\limits _ { 0 } ^ { T } x( s) ds \right \} . $$

Here $ B $ is the total number of births (jumps of size $ + 1 $) and $ D $ is the total number of deaths (jumps of size $ - 1 $). Maximum-likelihood estimators for $ \lambda $ and $ \mu $ are

$$ \lambda _ {T} ^ \star = \frac{B}{\int\limits _ { 0 } ^ { T } x( s) ds } ,\ \ \mu _ {T} ^ \star = \frac{D}{\int\limits _ { 0 } ^ { T } x( s) ds } . $$
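A simulation sketch of these estimators (the rates $ \lambda = 1 $, $ \mu = 0.8 $, a starting population of $ 50 $ instead of $ 1 $ to make early extinction negligible, and the horizon are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)

# Linear birth-and-death process: birth rate lam*x, death rate mu*x.
# Estimators: lam* = B / int x ds, mu* = D / int x ds.
lam, mu = 1.0, 0.8
x, now, T_horizon = 50, 0.0, 20.0
B = D = 0
area = 0.0                            # running value of int_0^t x(s) ds
while x > 0:
    rate = x * (lam + mu)             # total jump intensity in state x
    hold = rng.exponential(1.0 / rate)
    if now + hold > T_horizon:
        area += (T_horizon - now) * x
        break
    area += hold * x
    now += hold
    if rng.random() < lam / (lam + mu):
        B += 1; x += 1                # birth
    else:
        D += 1; x -= 1                # death

lam_hat = B / area
mu_hat = D / area
print(lam_hat, mu_hat)
```

Since the process is supercritical here ($ \lambda > \mu $), the population and hence the observed information grow exponentially, and both estimates are quite precise already at $ T = 20 $.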

Let $ x( t) $ be a diffusion process with drift coefficient $ a $ and diffusion coefficient $ b $, such that $ x( t) $ satisfies the stochastic differential equation

$$ dx( t) = a( t, x( t)) dt + b( t, x( t)) dw( t),\ \ x( 0) = x _ {0} , $$

where $ w $ is a Wiener process. Then, under specific restrictions,

$$ \frac{dP _ {a,b} ^ {T} }{dP _ {a _ {0} ,b } ^ {T} } ( x) = \ \mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T } \frac{a( t, x( t)) - a _ {0} ( t, x( t)) }{b( t, x( t)) } dx( t) - \frac{1}{2} \int\limits _ { 0 } ^ { T } \frac{a ^ {2} ( t, x( t)) - a _ {0} ^ {2} ( t, x( t)) }{b( t, x( t)) } dt \right \} $$

(here $ a _ {0} $ is a fixed coefficient).

Example 10.

Let

$$ dx( t) = a( t, x( t); \theta ) dt + dw, $$

where $ a $ is a known function and $ \theta $ is an unknown real parameter. If Wiener measure is denoted by $ \mu $, then the likelihood function is

$$ \frac{dP _ \theta ^ {T} }{d \mu } = \ \mathop{\rm exp} \left \{ \int\limits _ { 0 } ^ { T } a( t, x( t); \theta ) dx - \frac{1}{2} \int\limits _ { 0 } ^ { T } a ^ {2} ( t, x( t); \theta ) dt \right \} , $$

and, under regularity conditions, the Cramér–Rao inequality is satisfied: For an estimator $ \tau $ with bias $ \Delta ( \theta ) = {\mathsf E} _ \theta \tau - \theta $,

$$ {\mathsf E} _ \theta | \tau - \theta | ^ {2} \geq \frac{( 1 + {d \Delta } / {d \theta } ) ^ {2} }{ {\mathsf E} _ \theta \int\limits _ { 0 } ^ { T } [ ( \partial / {\partial \theta } ) a( t, x( t); \theta )] ^ {2} dt } + \Delta ^ {2} ( \theta ). $$

If the dependence of $ a $ on $ \theta $ is linear, i.e. $ a( t, x; \theta ) = \theta a( t, x) $, the maximum-likelihood estimator is

$$ \theta _ {T} ^ \star = \ \frac{\int\limits _ { 0 } ^ { T } a( t, x( t)) dx( t) }{\int\limits _ { 0 } ^ { T } a ^ {2} ( t, x( t)) dt } . $$
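A sketch of this estimator for an assumed linear drift $ a( t, x; \theta ) = - \theta x $ (giving an Ornstein-Uhlenbeck process), simulated by the Euler-Maruyama scheme:

```python
import numpy as np

rng = np.random.default_rng(9)

# dx = theta * a(t, x) dt + dw with a(t, x) = -x (a hypothetical choice);
# the estimator is theta* = int a dx / int a**2 dt, discretized below.
theta_true = 0.7
T, n = 200.0, 200000
dt = T / n
dw = rng.normal(0.0, np.sqrt(dt), n)
x = np.empty(n + 1)
x[0] = 0.0
for k in range(n):                     # Euler-Maruyama path
    x[k + 1] = x[k] + theta_true * (-x[k]) * dt + dw[k]

a_path = -x[:-1]                       # a(t, x(t)) along the path
theta_star = float(np.sum(a_path * np.diff(x)) / np.sum(a_path**2 * dt))
print(theta_star)
```

The standard deviation of $ \theta _ {T} ^ \star $ is roughly $ 1/\sqrt{ {\mathsf E} \int _ {0} ^ {T} a ^ {2} dt } $, here about $ 0.08 $, in line with the Cramér-Rao bound above.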


How to Cite This Entry: Statistical problems in the theory of stochastic processes (Encyclopedia of Mathematics) | Licensed under CC BY-SA 3.0. Source: https://encyclopediaofmath.org/wiki/Statistical_problems_in_the_theory_of_stochastic_processes