2020 Mathematics Subject Classification: Primary: 60-XX
A mathematical science in which the probabilities (cf. Probability) of certain random events are used to deduce the probabilities of other random events which are connected with the former events in some manner.
A statement to the effect that the probability of occurrence of a certain event is, say, 1/2, is not in itself valuable, since one is interested in reliable knowledge. Only results which state that the probability of occurrence of a certain event
In order to describe a regular connection between certain conditions
1) The occurrence of event
2) Under the conditions
The frequency of occurrence of event
Statistical relationships, i.e. relationships which may be described by a scheme of type 2) above, were first noted for games of chance such as throwing a die. Statistical relationships concerning births and deaths have also been known for a very long time (e.g. the probability that a newborn baby is a boy is 0.515). The end of the 19th century and the first half of the 20th century witnessed the discovery of a large number of statistical laws in physics, chemistry, biology, and other sciences. It should be noted that statistical laws are also found in schemes not directly related to the concept of randomness, e.g. in the distribution of digits in tables of functions (cf. Random and pseudo-random numbers). This fact is utilized, in particular, in the "simulation" of random phenomena (see Statistical experiments, method of).
The methods of probability theory can be used to study relationships in many apparently unrelated sciences because the probabilities of occurrence of events invariably satisfy certain simple laws, which are discussed below (cf. the section: Fundamental concepts in probability theory). The study of the properties of the probabilities of events, based on these simple laws, forms the subject matter of probability theory.
The fundamental concepts in probability theory, as a mathematical discipline, are most simply exemplified within the framework of so-called elementary probability theory. Each trial
"either $\omega_i$, or $\omega_j$, $\dots$, or $\omega_k$ occurs."
The outcomes
If there are
Formula (2) expresses the so-called classical concept of probability, according to which the probability of some event
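Example. Each one of the 36 possible outcomes of throwing a pair of dice may be denoted by an ordered pair $(i, j)$, where $i$ is the number of dots on the first die and $j$ the number on the second.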
The problem of determining the numerical values of the probabilities
A more detailed and thorough explanation for the causes of equal probabilities of individual outcomes in some special cases may be given by the so-called method of arbitrary functions. The method is explained below by taking again dice throwing as an example. Let the conditions of the trials be such that accidental effects of air on the die are negligible. In such a case, if the initial position, the initial velocity and the mechanical properties of the die are known exactly, the motion of the die may be calculated by the methods of classical mechanics, and the result of the trial may be reliably predicted. In practice, the initial conditions can never be determined with absolute accuracy and even very small changes in the initial velocity will produce a different result, provided the period of time
A second
Both these cases can be seen as part of general ergodic theory.
Given a certain number of events, two new events may be defined: their union (sum) and combination (product, intersection). The event
The event
The symbols for union and intersection of events are $\cup$ and $\cap$, respectively.
Two events
Two fundamental theorems in probability theory — theorems on addition and multiplication of probabilities — are connected with the operations just introduced.
The theorem on addition of probabilities. If the events
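Thus, in the example mentioned above (throwing a pair of dice), "the sum of the dots is 4 or less" is the sum of the three mutually exclusive events "the sum of the dots is 2", "the sum of the dots is 3" and "the sum of the dots is 4". These events are favoured by 1, 2 and 3 of the 36 equally probable outcomes respectively, so that by the addition theorem the probability in question is $1/36 + 2/36 + 3/36 = 6/36 = 1/6$.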
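The addition theorem in the dice example can be checked by direct enumeration. The following is a minimal sketch, assuming the classical model with 36 equally probable outcomes; the helper name `prob` and the representation of an outcome as a pair `(i, j)` are illustrative choices, not notation taken from the article.

```python
from fractions import Fraction
from itertools import product

# All 36 equally probable outcomes (i, j) of throwing a pair of dice
# (the pair notation is an assumption made for this sketch).
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: number of favourable outcomes / number of outcomes."""
    favourable = sum(1 for outcome in OUTCOMES if event(outcome))
    return Fraction(favourable, len(OUTCOMES))

# Mutually exclusive events "the sum of the dots equals s" for s = 2, 3, 4.
parts = [prob(lambda o, s=s: o[0] + o[1] == s) for s in (2, 3, 4)]

# Their union: "the sum of the dots is 4 or less".
union = prob(lambda o: o[0] + o[1] <= 4)

print(parts)  # probabilities of the three exclusive events
print(union)  # agrees with sum(parts), as the addition theorem requires
```

Exact fractions are used so that the equality of the two sides is not blurred by rounding.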
The conditional probability of event
which may be shown to be in complete agreement with the properties of the frequencies of occurrence. Events
The theorem on multiplication of probabilities. The probability of joint occurrence of events
i.e. the probability of joint occurrence of independent events is equal to the product of the probabilities of these events. Formula (3) remains valid if some of the events are replaced in both its parts by the complementary events.
Example. Four shots are fired at a target, the probability of hitting the target being 0.2 with each shot. The hits scored in different shots are considered to be independent events. What will be the probability of hitting the target exactly three times?
Each outcome of a trial can be symbolized by a sequence of four letters (e.g.
where
so that the probability of the event is $4 \cdot 0.2^{3} \cdot 0.8 = 0.0256$.
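The target-shooting example can be verified by enumerating all sequences of hits and misses and applying the multiplication theorem for independent events to each one. This is only a sketch: encoding a hit as `True` and a miss as `False` is an assumption of the sketch, and the binomial expression is included as a cross-check.

```python
from itertools import product
from math import comb

P_HIT = 0.2   # probability of a hit with each shot (from the example)
N_SHOTS = 4   # number of independent shots (from the example)

def outcome_probability(outcome):
    """Probability of one sequence of hits (True) and misses (False),
    computed as a product because the shots are independent."""
    p = 1.0
    for hit in outcome:
        p *= P_HIT if hit else 1.0 - P_HIT
    return p

# Enumerate all 2**4 = 16 outcomes and add up those with exactly three hits.
p_three_hits = sum(
    outcome_probability(o)
    for o in product([True, False], repeat=N_SHOTS)
    if sum(o) == 3
)

# Cross-check: C(4, 3) * p**3 * (1 - p).
p_binomial = comb(N_SHOTS, 3) * P_HIT**3 * (1 - P_HIT)

print(p_three_hits, p_binomial)
```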
A generalization of the above reasoning leads to one of the fundamental formulas in probability theory: If the events
where
An approximate value of the probability
the error not exceeding 0.0009. This result shows that the occurrence of the event
Another fundamental formula in elementary probability theory is the so-called formula of total probability: If events
The theorem on multiplication of probabilities is particularly useful when compound trials are considered. One says that a trial
are, for some reason, known. The data in (5) together with the multiplication theorem may then be used to determine the probabilities
Random variables. If each outcome of a trial
In a joint study of several random variables one introduces the concept of their joint distribution, which is defined by indicating the possible values of each one, and the probabilities of joint occurrence of the events
where
etc.
Often, instead of giving the distribution of a random variable completely, one uses a small collection of numerical characteristics. The ones most often used are the mathematical expectation and the dispersion (variance). (See also Moment; Semi-invariant.)
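As an illustration of these two characteristics, the sketch below computes the mathematical expectation and the variance of a discrete random variable directly from its distribution. The random variable chosen here (the sum of the dots on two fair dice) and the function names are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Distribution of X = sum of the dots on two fair dice: value -> probability.
dist = defaultdict(Fraction)
for i, j in product(range(1, 7), repeat=2):
    dist[i + j] += Fraction(1, 36)

def expectation(distribution):
    """Mathematical expectation: sum of value * probability."""
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    """Variance: expectation of the squared deviation from the mean."""
    m = expectation(distribution)
    return sum((x - m) ** 2 * p for x, p in distribution.items())

mean_x = expectation(dist)
var_x = variance(dist)
print(mean_x, var_x)
```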
The fundamental characteristics of a joint distribution of several random variables include — in addition to the mathematical expectations and the variances of these variables — also the correlation coefficients (cf. Correlation coefficient), etc. The meaning of these characteristics can be made clear, to a considerable extent, by limit theorems (see the section: Limit theorems).
The scheme of trials with a finite number of outcomes proves inadequate even in the simplest applications of probability theory. Thus, in the study of the random dispersion of the hitting sites of projectiles around the centre of a target, or in the study of random errors in the determination of some value, etc., it is not possible to limit the model to trials with a finite number of outcomes. Moreover, such outcomes may, in some cases, be expressed by a number or a set of numbers, while in other cases the outcome of a trial may be a function (e.g. a record of the variation of atmospheric pressure at a given location over a certain period of time), a set of functions, etc. It should be noted that many definitions and theorems given above, after suitable modifications, are also applicable in these more general cases, although the forms in which the probability distribution is presented are different (cf. Density of a probability distribution; Probability distribution). Here, the classical "equal probability of each outcome" is replaced by a uniform distribution of the objects under consideration in some area (this is exactly what is meant when speaking of a point randomly selected in a given area, a randomly selected tangent to some figure, etc.).
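The notion of a point randomly selected in a given area can be illustrated numerically. The sketch below is a hypothetical example, not taken from the article: a point is drawn uniformly from the unit square, and the probability that it falls in the quarter disc $x^2 + y^2 \le 1$ is estimated by the observed frequency. Under the uniform distribution this probability is the ratio of the areas, $\pi/4$.

```python
import random
from math import pi

def estimate_quarter_disc(n_points, seed=0):
    """Frequency with which a uniform point of the unit square
    falls inside the quarter disc x**2 + y**2 <= 1."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits / n_points

estimate = estimate_quarter_disc(200_000)
print(estimate, pi / 4)  # observed frequency vs. ratio of areas
```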
Major changes are introduced in the definition of a probability which, in the elementary case, is given by formula (2). In the more general schemes now discussed, events are unions of infinitely many elementary events, each of which may have probability zero. Thus, the property described by the addition theorem is no longer a consequence of the definition of probability, but is part of it.
The logical scheme of constructing the fundamentals of probability theory which is most often employed was developed in 1933 by A.N. Kolmogorov. The fundamental characteristics of this scheme are the following. In studying a real problem by the methods of probability theory, the first step is to isolate a set
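1) $\mathsf{P}(A) \geq 0$ for every event $A$;
2) $\mathsf{P}(\Omega) = 1$, where $\Omega$ is the event consisting of all elementary events;
3) if the events $A_1, \dots, A_n$ are mutually exclusive, then $\mathsf{P}(A_1 \cup \dots \cup A_n) = \mathsf{P}(A_1) + \dots + \mathsf{P}(A_n)$
(additivity of probabilities).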
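For a finite set of elementary events these conditions can be checked mechanically. The sketch below assumes that conditions 1) to 3) are the usual requirements of non-negativity, normalization and additivity over mutually exclusive events; the function `is_probability` and both set functions are hypothetical illustrations.

```python
from fractions import Fraction
from itertools import combinations

def is_probability(omega, P):
    """Check non-negativity, normalization and additivity of P
    on all subsets of the finite set omega."""
    subsets = [frozenset(c) for r in range(len(omega) + 1)
               for c in combinations(omega, r)]
    nonnegative = all(P(A) >= 0 for A in subsets)
    normalized = P(frozenset(omega)) == 1
    # Additivity for two exclusive events implies it for any finite number.
    additive = all(P(A | B) == P(A) + P(B)
                   for A in subsets for B in subsets if not (A & B))
    return nonnegative and normalized and additive

omega = range(1, 7)  # faces of a die

def uniform(A):
    """Equal weight 1/6 on each elementary event."""
    return Fraction(len(A), 6)

def squared(A):
    """Non-negative and normalized, but not additive."""
    return Fraction(len(A) ** 2, 36)

print(is_probability(omega, uniform), is_probability(omega, squared))
```

The second set function shows that additivity is an independent requirement: it satisfies the first two conditions and still fails the check.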
In order to construct a mathematically rigorous theory, the domain of definition of
The following comments may be made on the scheme described above. In accordance with the scheme, each probability model is based on a probability space, which is a triplet
Subsequent development of probability theory showed that the above definition of a probability space can be expediently narrowed. These developments have led to concepts such as perfect distributions and probability spaces, Blackwell spaces, Radon probability measures on topological (linear) spaces, etc. (see Probability distribution).
There are also other approaches to the fundamental concepts of probability theory, such as axiomatization, the principal object of which is a normalized Boolean algebra of events. Here, the principal advantage (provided that the algebra being considered is complete in the metric sense) consists of the fact that for any directed system of events the following relations are true:
It is possible to axiomatize the concept of a random variable as an element of some commutative algebra with a positive linear functional defined on it (the analogue of the mathematical expectation). This is the starting point for non-commutative and quantum probability.
In a formal exposition of probability theory limit theorems appear as a kind of superstructure over its elementary sections in which all problems are of a finite, purely arithmetical nature. However, the cognitive value of probability theory is revealed only by these limit theorems. Thus, it is shown by the Bernoulli theorem that the frequency of occurrence of a given event in independent trials is usually close to its probability, while the Laplace theorem yields the probabilities of deviations of this frequency from its limiting value. In a similar manner, the meaning of characteristics of a random variable such as its mathematical expectation and variance is explained by the law of large numbers and the central limit theorem (see also Limit theorems in probability theory).
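The content of the Bernoulli and Laplace theorems can be seen in a small computation. The sketch below evaluates the exact probability that the frequency in $n$ independent trials differs from the probability $p$ by less than $\varepsilon$, and compares it with the normal approximation $2\Phi(\varepsilon\sqrt{n/(pq)}) - 1$, $q = 1 - p$. The values $p = 1/2$ and $\varepsilon = 1/20$, the function names, and the omission of a continuity correction are choices made for this illustration.

```python
from fractions import Fraction
from math import comb, erf, sqrt

def exact_prob_close(n, p, eps):
    """Exact P(|k/n - p| < eps) for the number k of successes in n trials.
    p and eps are Fractions so that the comparison is exact."""
    pf = float(p)
    return sum(comb(n, k) * pf**k * (1 - pf)**(n - k)
               for k in range(n + 1)
               if abs(Fraction(k, n) - p) < eps)

def normal_approx(n, p, eps):
    """Normal approximation 2*Phi(z) - 1 with z = eps*sqrt(n/(p*q)),
    written via the error function; no continuity correction."""
    pf = float(p)
    z = float(eps) * sqrt(n / (pf * (1 - pf)))
    return erf(z / sqrt(2))

p, eps = Fraction(1, 2), Fraction(1, 20)
for n in (10, 100, 1000):
    print(n, exact_prob_close(n, p, eps), normal_approx(n, p, eps))
```

As $n$ grows the exact probability approaches 1 (Bernoulli theorem) and the normal approximation becomes accurate (Laplace theorem).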
Let
be independent random variables with the same probability distribution, with
In accordance with the law of large numbers, for any
The above statements, with suitable modifications, may be extended to random vectors (in finite-dimensional and in some infinite-dimensional spaces). The independence conditions may be replaced by conditions of a "weak" (in some sense) dependence of the
In applications — in particular, in mathematical statistics and statistical physics — it may be necessary to approximate small probabilities (i.e. probabilities of events of the type
It was noted in the 1920s that quite natural non-normal limit distributions may appear even in schemes of sequences of independent, identically distributed random variables. For instance, let
The principal method of proof of limit theorems is the method of characteristic functions (cf. Characteristic function) (and the related methods of Laplace transforms and of generating functions). In a number of cases it becomes necessary to invoke the theory of functions of a complex variable.
The mechanism of the existence of most limit relationships can be completely understood only in the context of the theory of stochastic processes.
Over the past few decades, certain physical and chemical investigations have made it necessary to consider, along with one-dimensional and higher-dimensional random variables, stochastic processes (cf. Stochastic process), i.e. processes for which the probability of their proceeding in a certain manner is given. The coordinate of a particle executing a Brownian motion may serve as an example of a stochastic process. In probability theory a stochastic process is usually regarded as a one-parameter family of random variables
Chronologically, Markov processes were the first to be studied. A stochastic process
Just as the study of continuous deterministic processes is reduced to differential equations involving functions which describe the state of the system, the study of continuous Markov processes can, to a large extent, be reduced to differential or differential-integral equations with respect to the distribution of the probabilities of the process.
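A very simple instance of such a reduction is a Markov process with two states in continuous time, whose state probabilities satisfy a pair of linear differential equations. The sketch below is an assumption-laden illustration rather than the general theory: the transition rates `a` (from state 0 to state 1) and `b` (from state 1 to state 0) are hypothetical, the equations $p_0' = -a p_0 + b p_1$, $p_1' = a p_0 - b p_1$ are integrated by the explicit Euler method, and the result is compared with the closed-form solution for a start in state 0.

```python
from math import exp

def forward_euler(a, b, t_end, dt=1e-4):
    """Integrate p0' = -a*p0 + b*p1, p1' = a*p0 - b*p1 from p = (1, 0)."""
    p0, p1 = 1.0, 0.0
    for _ in range(round(t_end / dt)):
        flow = a * p0 - b * p1  # net probability flow from state 0 to state 1
        p0, p1 = p0 - flow * dt, p1 + flow * dt
    return p0, p1

def exact_p1(a, b, t):
    """Closed-form probability of state 1 at time t when starting in state 0."""
    return a / (a + b) * (1.0 - exp(-(a + b) * t))

A_RATE, B_RATE = 2.0, 1.0  # hypothetical transition rates
p0, p1 = forward_euler(A_RATE, B_RATE, t_end=1.0)
print(p1, exact_p1(A_RATE, B_RATE, 1.0))
```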
Another major subject in the field of stochastic processes is the theory of stationary stochastic processes. The stationary nature of a process, i.e. the fact that its probability relations remain unchanged with time, imposes major restrictions on the process and makes it possible to arrive at several important deductions based on this premise.
A major part of the theory is based only on the assumption of stationarity in a wide sense, viz. that the mathematical expectations
where
Recently a rather large class of processes, the so-called semi-martingales, which serves to solve problems of optimal non-linear filtering, interpolation and extrapolation, has been isolated (cf. Stochastic processes, prediction of; Stochastic processes, filtering of; Stochastic processes, interpolation of). A substantial part of the relevant analytical apparatus is provided by stochastic differential equations, stochastic integrals and martingales. A distinguishing feature of a martingale
The theory of stochastic processes is closely connected with the classical problems on limit theorems for sums of random variables. Distributions which appear as limit distributions in the study of sums of random variables become exact distributions of appropriate characteristics in the theory of stochastic processes. This fact makes it possible to demonstrate many limit theorems with the aid of these associated stochastic processes.
One may finally note that the logically unobjectionable definition of the concepts connected with stochastic processes within the framework of the axiomatics discussed above has always presented and still presents a large number of difficulties of measure-theoretic nature. These are connected, for example, with the definition of probabilistic continuity, differentiability, etc., of stochastic processes (cf. Separable process). This is why monographs on the theory of stochastic processes devote about half their space to the analysis of the development of measure-theoretic constructions.
See also the references to entries on individual subjects of probability theory.
[Brnl] J. Bernoulli, "Ars conjectandi", Basle (1713)
[Mo] A. de Moivre, "Doctrine of chances", Paris (1756)
[La] P.S. Laplace, "Théorie analytique des probabilités", Paris (1812)
[Ch] P.L. Chebyshev, "Oeuvres de P.L. Chebyshev", Chelsea, reprint (1961)
[Lia] A.M. Liapounoff, "Nouvelle forme du théorème sur la limite de probabilité", St. Petersburg (1901)
[Ma] A.A. Markov, "Studies on a remarkable case of dependent trials", Izv. Akad. Nauk SSSR Ser. 6, 1 (1907) (In Russian)
[Ma2] A.A. Markov, "Wahrscheinlichkeitsrechnung", Teubner (1912) (Translated from Russian)
[Brnsh] S.N. Bernshtein, "Probability theory", Moscow-Leningrad (1946) (In Russian)
[G] B.V. Gnedenko, "The theory of probability", Chelsea, reprint (1962) (Translated from Russian)
[Bo] A.A. Borovkov, "Wahrscheinlichkeitstheorie", Birkhäuser (1976) (Translated from Russian)
[Fe] W. Feller, "An introduction to probability theory and its applications", 1–2, Wiley (1957–1971)
[Po] H. Poincaré, "Calcul des probabilités", Gauthier-Villars (1912)
[Mi] R. von Mises, "Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und theoretischen Physik", Wien (1931)
[GK] B.V. Gnedenko, A.N. Kolmogorov, "Probability theory", Mathematics in the USSR during thirty years: 1917–1947, Moscow-Leningrad (1948) pp. 701–727 (In Russian)
[K] A.N. Kolmogorov, "Probability theory", Mathematics in the USSR during 40 years: 1917–1957, 1, Moscow (1959) (In Russian)
[K2] A.N. Kolmogorov, "Foundations of the theory of probability", Chelsea, reprint (1950) (Translated from Russian)
[Pr] Yu.V. Prohorov, "Probability theory", Springer (1969) (Translated from Russian)
[Bi] P. Billingsley, "Probability and measure", Wiley (1979)
[Br] L.P. Breiman, "Probability", Addison-Wesley (1968)
[CT] Y.S. Chow, H. Teicher, "Probability theory. Independence, interchangeability, martingales", Springer (1978)
[Lo] M. Loève, "Probability theory", 1–2, Springer (1977)
[Ca] R. Carnap, "The logical foundations of probability", Univ. Chicago Press (1962)
[Fi] B. de Finetti, "Theory of probability", 1–2, Wiley (1974)
[Ba] H. Bauer, "Probability theory and elements of measure theory", Holt, Rinehart & Winston (1972) Chapt. 11 (Translated from German)
[LS] R.Sh. Liptser, A.N. Shiryaev, "Theory of martingales", Kluwer (1989) (Translated from Russian)
[LS2] R.Sh. Liptser, A.N. Shiryaev, "Statistics of random processes", 1–2, Springer (1977–1978) (Translated from Russian)
[PR] V. Paulauskas, A. Račkauskas, "Approximation theory in the central limit theorem. Exact results in Banach spaces", Kluwer (1989) (Translated from Russian)
[GS] I.I. Gihman, A.V. Skorohod, "The theory of stochastic processes", I–III, Springer (1974–1979) (Translated from Russian)
[D] E.B. Dynkin, "Markov processes", 1–2, Springer (1965) (Translated from Russian)
[W] A.D. Wentzell, "Limit theorems on large deviations for Markov stochastic processes", Kluwer (1990) (Translated from Russian)
[S] A.V. Skorohod, "Random processes with independent increments", Kluwer (1991) (Translated from Russian)