A numerical characteristic of a probability distribution. The moment of order $ k $ ($ k > 0 $ an integer) of a random variable $ X $ is defined as the mathematical expectation $ {\mathsf E} X ^ {k} $, if it exists. If $ F $ is the distribution function of the random variable $ X $, then
$$ \tag{* } {\mathsf E} X ^ {k} = \int\limits _ {- \infty } ^ \infty x ^ {k} d F ( x ) . $$
The definition of a moment in probability theory is a direct analogue of the corresponding notion in mechanics, where it plays a major role: in mechanics, formula (*) defines the moment of a mass distribution. The first-order moment (the static moment in mechanics) of a random variable $ X $ is the mathematical expectation $ {\mathsf E} X $. The value $ {\mathsf E} ( X - a ) ^ {k} $ is called the moment of order $ k $ relative to $ a $, and $ {\mathsf E} ( X - {\mathsf E} X ) ^ {k} $ is the central moment of order $ k $. The second-order central moment $ {\mathsf E} ( X - {\mathsf E} X ) ^ {2} $ is called the dispersion (or variance) $ {\mathsf D} X $ (the moment of inertia in mechanics). The value $ {\mathsf E} | X | ^ {k} $ is called the absolute moment of order $ k $ (absolute moments are also defined for non-integral $ k $).
The moments of the joint distribution of random variables $ X _ {1} , \dots , X _ {n} $ (see Multi-dimensional distribution) are defined similarly: for any integers $ k _ {i} \geq 0 $, $ k _ {1} + \dots + k _ {n} = k $, the mathematical expectation $ {\mathsf E} ( X _ {1} ^ {k _ {1} } \dots X _ {n} ^ {k _ {n} } ) $ is called a mixed moment of order $ k $, and $ {\mathsf E} ( X _ {1} - {\mathsf E} X _ {1} ) ^ {k _ {1} } \dots ( X _ {n} - {\mathsf E} X _ {n} ) ^ {k _ {n} } $ is called a central mixed moment of order $ k $. The central mixed moment $ {\mathsf E} ( X _ {1} - {\mathsf E} X _ {1} ) ( X _ {2} - {\mathsf E} X _ {2} ) $ of order 2 is called the covariance and is one of the basic characteristics of dependency between random variables (see Correlation (in statistics)).
Many properties of moments (in particular, the inequality for moments) are consequences of the fact that for any random variable $ X $ the function $ g ( k ) = \mathop{\rm log} {\mathsf E} | X | ^ {k} $ is convex with respect to $ k $ in each finite interval on which the function is defined; in particular, $ ( {\mathsf E} | X | ^ {k} ) ^ {1 / k } $ is a non-decreasing function of $ k $.
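The definitions above can be checked on a small discrete distribution. The following is a minimal sketch, assuming a fair six-sided die as the example distribution; the helper names `moment` and `absolute_moment` are illustrative and not part of any library.

```python
from fractions import Fraction

# Assumed example distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

def moment(k, a=0):
    """Moment of integer order k relative to a: E (X - a)^k, computed exactly."""
    return sum(p * (x - a) ** k for x, p in zip(values, probs))

def absolute_moment(k):
    """Absolute moment E |X|^k in floating point; k may be non-integral."""
    return sum(float(p) * abs(x) ** k for x, p in zip(values, probs))

mean = moment(1)                   # first-order moment E X
variance = moment(2, a=mean)       # second-order central moment D X
third_central = moment(3, a=mean)  # vanishes here because the die is symmetric

# (E |X|^k)^(1/k) should be non-decreasing in k.
norms = [absolute_moment(k) ** (1.0 / k) for k in (0.5, 1, 2, 3, 4)]
is_non_decreasing = all(s <= t for s, t in zip(norms, norms[1:]))
```

Exact rational arithmetic is used for the integer-order moments so that the mean and the variance come out as exact fractions; the non-integral orders fall back to floating point.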
The moments $ {\mathsf E} X ^ {k} $ and $ {\mathsf E} ( X - a ) ^ {k} $ exist if and only if $ {\mathsf E} | X | ^ {k} < \infty $. The existence of $ {\mathsf E} | X | ^ {k _ {0} } $ implies the existence of all moments of orders $ k \leq k _ {0} $. If $ {\mathsf E} | X _ {i} | ^ {k} < \infty $ for all $ i = 1 , \dots , n $, then the mixed moments $ {\mathsf E} X _ {1} ^ {k _ {1} } \dots X _ {n} ^ {k _ {n} } $ exist for all $ k _ {i} \geq 0 $, $ k _ {1} + \dots + k _ {n} \leq k $.
In some cases the so-called moment generating function is useful for computing moments: this is the function $ M ( t ) $ in whose power-series expansion the moment of order $ k $ is the coefficient of $ t ^ {k} / k ! $. For integer-valued random variables this function is related to the generating function $ P ( s ) $ by the relation $ M ( t ) = P ( e ^ {t} ) $. If $ {\mathsf E} | X | ^ {k} < \infty $, then the characteristic function $ f ( t ) $ of the random variable $ X $ has continuous derivatives up to order $ k $ inclusive, and the moment of order $ k $ is the coefficient of $ ( it ) ^ {k} / k ! $ in the expansion of $ f ( t ) $ in powers of $ t $,
$$ {\mathsf E} X ^ {k} = \left . ( - i ) ^ {k} \frac{d ^ {k} }{d t ^ {k} } f ( t) \right | _{t=0} . $$
If the characteristic function has a derivative of order $ 2 k $ at zero, then $ {\mathsf E} | X | ^ {2k} < \infty $.
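As an illustration of the derivative formula, take the standard normal distribution, whose characteristic function is $ f ( t ) = e ^ {- t ^ {2} / 2 } $ (the choice of distribution is made here only for the sake of example). Differentiating twice gives
$$ f ^ { \prime } ( t ) = - t e ^ {- t ^ {2} / 2 } , \qquad f ^ { \prime\prime } ( t ) = ( t ^ {2} - 1 ) e ^ {- t ^ {2} / 2 } , $$
so that
$$ {\mathsf E} X = ( - i ) f ^ { \prime } ( 0 ) = 0 , \qquad {\mathsf E} X ^ {2} = ( - i ) ^ {2} f ^ { \prime\prime } ( 0 ) = ( - 1 ) ( - 1 ) = 1 . $$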
For the connection of moments with semi-invariants see Semi-invariant. If the moments of a distribution are known, then it is possible to make some assertions about the probabilities of deviation of a random variable from its mathematical expectation in terms of inequalities; the best known are the Chebyshev inequality in probability theory
$$ {\mathsf P} \{ | X - {\mathsf E} X | \geq \epsilon \} \leq \ \frac{ {\mathsf D} X }{\epsilon ^ {2} } ,\ \epsilon > 0 , $$
and its generalizations.
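The Chebyshev inequality can be compared with an exact tail probability on a small example. The sketch below again assumes a fair six-sided die; the function names `tail_probability` and `chebyshev_bound` are chosen for illustration only.

```python
from fractions import Fraction

# Assumed example distribution: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = sum(p * x for x, p in zip(values, probs))
variance = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

def tail_probability(eps):
    """Exact value of P{|X - E X| >= eps}."""
    return sum(p for x, p in zip(values, probs) if abs(x - mean) >= eps)

def chebyshev_bound(eps):
    """Right-hand side D X / eps^2 of the Chebyshev inequality."""
    return variance / Fraction(eps) ** 2

eps = 2
exact = tail_probability(eps)  # only the outcomes 1 and 6 deviate by at least 2
bound = chebyshev_bound(eps)
```

For this distribution the bound is far from sharp, which is typical: the inequality uses only the variance and holds for every distribution with that variance.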
Problems of determining a probability distribution from its sequence of moments are called moment problems (cf. Moment problem). Such problems were first discussed by P.L. Chebyshev (1874) in connection with research on limit theorems. In order that the probability distribution of a random variable $ X $ be uniquely defined by its moments $ \alpha _ {k} = {\mathsf E} X ^ {k} $ it is sufficient, for example, that Carleman's condition be satisfied:
$$ \sum_{k=1}^ \infty \frac{1}{\alpha _ {2k} ^ {1 / 2 k } } = \infty . $$
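As an illustration (the distribution is chosen here only as an example), the standard normal distribution satisfies Carleman's condition. Its even moments are
$$ \alpha _ {2k} = 1 \cdot 3 \cdots ( 2 k - 1 ) \leq ( 2 k ) ^ {k} , $$
so that $ \alpha _ {2k} ^ {1 / 2 k } \leq ( 2 k ) ^ {1 / 2 } $ and
$$ \sum_{k=1}^ \infty \frac{1}{\alpha _ {2k} ^ {1 / 2 k } } \geq \sum_{k=1}^ \infty \frac{1}{( 2 k ) ^ {1 / 2 } } = \infty . $$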
A similar result even holds for moments of random vectors.
The use of moments in the proof of limit theorems is based on the following fact. Let $ F _ {n} $, $ n = 1 , 2 , \dots $, be a sequence of distribution functions, all moments $ \alpha _ {k} ( n ) $ of which are finite, and for each integer $ k \geq 1 $ let
$$ \alpha _ {k} ( n ) \rightarrow \alpha _ {k} \ \textrm{ as } n \rightarrow \infty , $$
where $ \alpha _ {k} $ is finite. Then there is a subsequence $ F _ {n _ {i} } $ that converges weakly to a distribution function $ F $ having the $ \alpha _ {k} $ as its moments. If the moments determine $ F $ uniquely, then the whole sequence $ F _ {n} $ converges weakly to $ F $. This fact underlies the so-called method of moments (cf. Moments, method of (in probability theory)), which is used, in particular, in mathematical statistics for studying the deviation of empirical distributions from theoretical ones and for the statistical estimation of parameters of a distribution (on sample moments as estimators of the moments of a distribution, see Empirical distribution).
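The statistical use of the method of moments for parameter estimation can be sketched as follows. The example assumes a gamma distribution with shape $ a $ and scale $ \theta $, for which $ {\mathsf E} X = a \theta $ and $ {\mathsf D} X = a \theta ^ {2} $; the data set and the function names are hypothetical and serve only as an illustration.

```python
def sample_moment(data, k):
    """Sample moment of order k: the average of x^k over the observations."""
    return sum(x ** k for x in data) / len(data)

def gamma_method_of_moments(data):
    """Estimate (shape, scale) of a gamma distribution by equating the first
    two theoretical moments with the corresponding sample moments."""
    m1 = sample_moment(data, 1)
    m2 = sample_moment(data, 2)
    sample_variance = m2 - m1 ** 2   # second central sample moment
    scale = sample_variance / m1     # theta = D X / E X
    shape = m1 ** 2 / sample_variance  # a = (E X)^2 / D X
    return shape, scale

# Hypothetical observations, made up for this illustration.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 9.0]
shape, scale = gamma_method_of_moments(data)
```

By construction, the fitted distribution reproduces the first two sample moments: `shape * scale` equals the sample mean and `shape * scale ** 2` equals the second central sample moment, up to rounding.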
The moment generating function is defined by $ M( t) = {\mathsf E} e ^ {tX} $.
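As a worked example (the Poisson distribution is chosen here only for illustration), let $ X $ have a Poisson distribution with parameter $ \lambda $. Its generating function is $ P ( s ) = {\mathsf E} s ^ {X} = e ^ {\lambda ( s - 1 ) } $, so that
$$ M ( t ) = P ( e ^ {t} ) = \mathop{\rm exp} ( \lambda ( e ^ {t} - 1 ) ) . $$
Differentiating at zero gives
$$ {\mathsf E} X = M ^ { \prime } ( 0 ) = \lambda , \qquad {\mathsf E} X ^ {2} = M ^ { \prime\prime } ( 0 ) = \lambda + \lambda ^ {2} , $$
and hence $ {\mathsf D} X = {\mathsf E} X ^ {2} - ( {\mathsf E} X ) ^ {2} = \lambda $.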
[a1] W. Feller, "An introduction to probability theory and its applications", 2, Wiley (1971), Chapt. 1