We say that a series is noted when a random event (usually an extremely rare one) happens several (at least two) times in a relatively short period of time (much shorter than the intuitive average waiting time), without an obvious reason for such untimely repetitions. In its colloquial meaning, the law of series is the belief that such series happen more often than they should by "pure chance". This belief is usually associated with another: that there exists some unexplained physical force or statistical rule behind this "law".
Serial occurrences of certain types of events are perfectly understandable as a result of physical dependence. Various events occur more frequently in so-called periods of propitious conditions, which in turn follow a slowly changing process. For example, volcanic eruptions appear in series during periods of increased tectonic activity, and floods prevail in periods of global warming. In some other cases, the first occurrence physically provokes subsequent repetitions. A good example is a series of people falling ill due to a contagious disease. The dispute around the law of series clearly concerns only those events for which there are no obvious clustering mechanisms, which are expected to appear completely independently of each other, and which nevertheless do appear in series. With this restriction the law of series belongs to the category of unexplained mysteries, such as synchronicity, telepathy or Murphy's Law, and is often considered a manifestation of paranormal forces that exist in our world and escape scientific explanation. It is a subject of a long-lasting controversy centered around two questions:
This debate has avoided strict scientific language; even its subject is not precisely defined, and it is difficult to imagine appropriate repetitive experiments in a controlled environment. Thus, in this approach, the dispute is probably fated to remain an exchange of speculations.
There is also a scientific approach, embedded in the ergodic theory of stochastic processes. Surprisingly, the study of stochastic processes supports the law of series against the skeptical point of view.
The Austrian biologist Dr. Paul Kammerer (1880-1926) was the first scientist to study the law of series (law of seriality, in some translations). His book Das Gesetz der Serie (Kammerer, 1919) contains many examples from his own life and the lives of those close to him. Here is a sample:
(22) On July 28, 1915, I experienced the following progressive series: (a) my wife was reading about "Mrs Rohan", a character in the novel Michael by Hermann Bang; in the tramway she saw a man who looked like her friend, Prince Josef Rohan; in the evening Prince Rohan dropped in on us. (b) In the tram she overheard somebody asking the pseudo-Rohan whether he knew the village of Weissenbach at Lake Attersee, and whether it would be a pleasant place for a holiday. When she got out of the tram, she went to the delicatessen shop on the Naschmarkt, where the attendant asked her whether she happened to know Weissenbach on Lake Attersee - he had to make a delivery by mail and did not know the correct postal address.
Richard von Mises in his book (von Mises, 1981) describes how Kammerer conducted many (rather naive) experiments, spending hours in parks noting occurrences of pedestrians with certain features (glasses, umbrellas, etc.) or in shops, noting precise times of arrivals of clients, and the like. Kammerer "discovered" that the number of time intervals (of a fixed length) in which the number of objects under observation agrees with the average is much smaller than the number of intervals in which that number is either zero or larger than the average. This, he argued, provided evidence for clustering. From today's perspective, Kammerer merely noted the perfectly normal spontaneous clustering of signals in the Poisson process. Nevertheless, Kammerer's book drew the attention of the public, and even of some serious scientists, to the phenomenon of clustering. Kammerer himself lost authority due to accusations of manipulating his biological experiments (unrelated to our topic), which eventually drove him to suicide.
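Kammerer's observation can be checked directly against the Poisson distribution. The sketch below assumes that the count per interval is Poisson distributed with an integer mean; the value 2 is an arbitrary illustration, not a figure taken from Kammerer's records, and the function names are hypothetical. It compares the probability that the count equals the mean with the probability that the count is zero or above the mean.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k signals in an interval with the given mean count."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def kammerer_comparison(mean):
    """Return (P[count == mean], P[count == 0 or count > mean]) for an integer mean."""
    p_equal = poisson_pmf(mean, mean)
    p_up_to_mean = sum(poisson_pmf(k, mean) for k in range(mean + 1))
    p_zero_or_above = poisson_pmf(0, mean) + (1.0 - p_up_to_mean)
    return p_equal, p_zero_or_above

# Assumed illustrative mean of 2 objects per interval.
p_equal, p_zero_or_above = kammerer_comparison(2)
print(f"count equals the mean:       {p_equal:.3f}")
print(f"count zero or above the mean: {p_zero_or_above:.3f}")
```

Under these assumptions the second probability is the larger one even though the signals are fully independent, which is the effect Kammerer took for evidence of clustering.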
Examples of series are, in the literature, mixed with examples of other kinds of "unbelievable" coincidences. Their list is long and fascinating, but quoting them would lead away from the subject. Pioneering theories about coincidences (including series) were put forward not only by Kammerer, but also by the noted Swiss psychologist Carl Gustav Jung (1875-1961) and the Austrian Nobel Prize winner in physics Wolfgang Pauli (1900-1958). They believed that there exist undiscovered physical "attracting" forces driving objects that are alike, or have common features, closer together in time and space (the so-called theory of synchronicity). See Jung's book Synchronicity: An Acausal Connecting Principle.
The law of series and synchronicity interest investigators of spirituality, magic and parapsychology. They fascinate with their potential to generate "meaningful coincidences". A Frenchman, Jean Moisset (born 1924), a self-educated specialist in parapsychology, wrote a number of books on synchronicity, the law of series, and similar phenomena. He connects the law of series with psychokinesis and claims that it is even possible to make use of it (Moisset, 2000). It is believed that Adolf Hitler, in addition to hiring an astrologer and a fortune-teller to help him plan, also trusted that "forces of synchronicity" could be employed for a purpose.
In opposition to the theory of synchronicity is the belief, represented by many statisticians, among them the American mathematician Warren Weaver (1894-1978), that any series, coincidences and the like appear exclusively by pure chance and that there is no mysterious or unexplained force behind them. Around the world, at every instant of time, reality combines so many different names, numbers, events, etc., that there is nothing unusual if some combinations considered series or "unbelievable coincidences" occur here and there from time to time. Every such coincidence has nonzero probability, which implies not only that it can, but even that it must occur if sufficiently many trials are performed. Human perception tends to ignore all those sequences of events which do not possess the attribute of being unusual, so that we largely underestimate the enormous number of "failures" accompanying every single "successful" coincidence. Human memory registers coincidences as more frequent simply because they are more distinctive. This is the "mysterious force" behind synchronicity. A similar point of view is explained by Robert Matthews in his essay The laws of freak chance.
With regard to series of repetitions of identical or similar events, the skeptics' argumentation refers to the effect of spontaneous clustering. For an event to repeat in time by "pure chance" means to follow a trajectory of a Poisson process. In a typical realization of a Poisson process the distribution of signals along the time axis is far from uniform; the gaps between signals are sometimes bigger, sometimes smaller. Places where several smaller gaps accumulate (which obviously happens here and there along the time axis) can be interpreted as "spontaneous clusters" of signals. It is nothing but these natural clusters that are being observed and over-interpreted as the mysterious "series". It is this kind of "seriality" that Kammerer saw in most of his experiments.
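The spontaneous clustering described above can be illustrated numerically. The following is a minimal sketch, assuming a Poisson process of intensity 1 (an arbitrary choice) whose gaps between consecutive signals are independent exponential variables; the helper names are hypothetical. It estimates how often a gap is shorter than half of the mean gap.

```python
import math
import random

def poisson_gaps(intensity, count, seed=0):
    """Draw `count` independent exponential gaps between consecutive signals."""
    rng = random.Random(seed)
    return [rng.expovariate(intensity) for _ in range(count)]

def short_gap_fraction(gaps, threshold):
    """Fraction of gaps that are shorter than `threshold`."""
    return sum(g < threshold for g in gaps) / len(gaps)

intensity = 1.0                              # assumed; the mean gap is 1 / intensity
gaps = poisson_gaps(intensity, 100_000)
fraction = short_gap_fraction(gaps, 0.5 / intensity)
expected = 1.0 - math.exp(-0.5)              # exponential law evaluated at half the mean
print(f"simulated {fraction:.3f}, exponential law {expected:.3f}")
```

A substantial share of the gaps is much shorter than the average, so runs of closely spaced signals arise without any dependence between them.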
Yet another "cool-minded" explanation of synchronicity (including the law of series) asserts that very often events that seem unrelated (and hence should appear independently of each other) are in fact strongly related. Many "accidental" coincidences or series of similar events, after taking a closer look at the mechanisms behind them, can be logically explained as "not quite accidental". "Ordinary" people simply do not bother to seek the logical connection. After all, it is much more exciting to "encounter the paranormal".
A systematic, purely mathematical approach to the phenomenon can be found in papers of Downarowicz, Lacroix et al. (Downarowicz and Lacroix, 2006, Downarowicz, Lacroix and Leandri, preprint, Downarowicz, Grzegorek and Lacroix, preprint). The law of series is formally defined in terms of the ergodic theory of stochastic processes, and linked to the notion of attracting (or clustering) of occurrences of an event. Using entropy theory, it has been proved that in non-deterministic processes, for events of a certain type (rare cylinder sets), the effect opposite to attracting, i.e., repelling, can be at most marginal, while in the majority of processes such events will in fact reveal strong attracting properties. This can be regarded as a positive answer to question 1 of the general debate. Some answer to question 2 can also be deduced. The details are described in the following section. Clearly, the theory investigates the behavior of mathematical models, not of reality itself, hence one can continue to speculate whether or not it applies to the law of series in the colloquial meaning.
The starting point is the assertion that the clustering observed for signals arriving completely independently of each other (i.e., forming a Poisson process) will be considered neutral, i.e., neither attracting nor repelling. Attracting (and repelling) is defined as the deviation of a signal process from the Poisson process toward stronger (weaker) clustering. It turns out that both deviations can be defined in terms of only one variable associated with the signal process, namely the waiting time. The precise meaning is stated below.
A signal process is a continuous time stochastic process, i.e., a one-parameter family of random variables \((X_t)_{t\ge 0}\) defined on a probability space \((\Omega,P)\) and assuming integer values, with the following two properties: 1. \(X_0 = 0\) almost surely, 2. the trajectories \(t\mapsto X_t(\omega)\) are almost surely nondecreasing in \(t\ .\) Clearly, the trajectories must have discontinuities (jumps from one integer to a higher one). These jumps are interpreted as signals. A signal process is homogeneous if for any fixed \(s\ge 0\) the finite-dimensional distributions of \((X_t)\) are the same as those of \((X_{t+s}-X_s)\ .\) An example of a homogeneous signal process is the Poisson process.
Given a homogeneous signal process, the waiting time is the random variable defined on \(\Omega\) as the time of the first signal after time 0: \[V(\omega) = \inf\{t: X_t(\omega)\ge 1\}.\]
Assume that \(X_1\) has finite and nonzero expected value, denoted by \(\lambda\) and called the intensity of the process. Let \(F\) denote the distribution function of the waiting time \(V\ .\) It is well known that the waiting time of a Poisson process has exponential distribution, i.e., it satisfies \[F(t) = 1 - e^{-\lambda t}\] (see the derivation). The key notions of attracting and repelling are defined below:
Definition 1. Consider a homogeneous signal process with intensity \(\lambda\ .\) The signals attract each other from a distance \(t>0\ ,\) if \(F(t)< 1-e^{-\lambda t}\ .\) Analogously, the signals repel each other from a distance \(t>0\ ,\) if \(F(t)> 1-e^{-\lambda t}\ .\) The difference \(|1-e^{-\lambda t}-F(t)|\) is called the intensity of attracting (or repelling) at \(t\ .\)
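Definition 1 can be applied mechanically once \(F(t)\) is known. The sketch below is an illustration only: the function names and the two example processes are assumptions introduced here, not taken from the cited papers. Process A emits signals in simultaneous pairs, the pairs forming a Poisson process of rate \(\lambda/2\), so that the intensity is \(\lambda\) and \(F(t)=1-e^{-\lambda t/2}\). Process B emits evenly spaced signals, \(1/\lambda\) apart, with a uniformly random phase, so that \(F(t)=\min(\lambda t,1)\).

```python
import math

def poisson_cdf(t, intensity):
    """Waiting-time distribution function of a Poisson process: 1 - exp(-lambda t)."""
    return 1.0 - math.exp(-intensity * t)

def classify(F_t, t, intensity, tol=1e-12):
    """Apply Definition 1 at distance t, given the value F(t) of the process.

    Returns the verdict and the intensity of attracting or repelling."""
    diff = poisson_cdf(t, intensity) - F_t
    if diff > tol:
        return "attracting", diff
    if diff < -tol:
        return "repelling", -diff
    return "neutral", 0.0

def paired_cdf(t, intensity):
    """Assumed process A: Poisson pairs of rate lambda/2."""
    return 1.0 - math.exp(-intensity * t / 2.0)

def periodic_cdf(t, intensity):
    """Assumed process B: evenly spaced signals with a uniformly random phase."""
    return min(intensity * t, 1.0)

lam, t = 1.0, 1.0
print(classify(paired_cdf(t, lam), t, lam))
print(classify(periodic_cdf(t, lam), t, lam))
print(classify(poisson_cdf(t, lam), t, lam))
```

Under these assumptions the paired process attracts, the evenly spaced process repels, and the Poisson process is neutral, in line with the intuition that bunched signals cluster more, and regular signals less, than independent ones.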
Why is attracting (repelling) defined as above? By elementary properties of homogeneous processes, the expected number of signals \(EX_t\) in the interval of time \([0,t]\) equals \(\lambda t\ .\) The value \(F(t)\) is the probability that there will be at least one signal in this interval. Hence the ratio \(\frac {\lambda t}{F(t)}\) represents the conditional expectation of the number of signals in \([0,t]\) for all those \(\omega\in\Omega\ ,\) for which at least one signal is observed there. We now compare this expected value with the analogous value computed for the Poisson process with the same intensity \(\lambda\ .\) The numerators \(\lambda t\) are the same for both processes. So, this conditional expectation in the process \((X_t)\) is larger than in the Poisson process if and only if \(F(t)< 1-e^{-\lambda t}\ .\) In such a case, if we observe the process for time \(t\ ,\) there are two possibilities: either we detect no signals, or, once the first signal occurs, we can expect a larger total number of observed signals than if we were dealing with the Poisson process. The first signal attracts further signals. By stationarity, the same happens in any interval \([s,s+t]\) of length \(t\ ,\) contributing to an increased clustering effect. Repelling is the converse: the first signal lowers the expected number of signals in the observation period, contributing to a decreased clustering and a more uniform distribution of signals in time.
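The identity between the conditional expectation and the ratio \(\frac{\lambda t}{F(t)}\) can be checked by simulation in the neutral case. The sketch below assumes a Poisson process with \(\lambda = 1\) observed for time \(t = 1\) (both values arbitrary); the helper names are hypothetical.

```python
import math
import random

def poisson_count(intensity, t, rng):
    """Number of signals of a Poisson process in [0, t], built from exponential gaps."""
    count, clock = 0, rng.expovariate(intensity)
    while clock <= t:
        count += 1
        clock += rng.expovariate(intensity)
    return count

def conditional_mean_count(intensity, t, trials=200_000, seed=1):
    """Empirical mean number of signals in [0, t] among runs with at least one signal."""
    rng = random.Random(seed)
    counts = [poisson_count(intensity, t, rng) for _ in range(trials)]
    nonempty = [c for c in counts if c >= 1]
    return sum(nonempty) / len(nonempty)

lam, t = 1.0, 1.0                                  # assumed illustrative values
empirical = conditional_mean_count(lam, t)
predicted = lam * t / (1.0 - math.exp(-lam * t))   # lambda t / F(t) for the Poisson process
print(f"empirical {empirical:.3f}, predicted {predicted:.3f}")
```

For a process with the same \(\lambda\) but a smaller \(F(t)\) the predicted ratio is larger, which is exactly the sense in which the first signal "attracts" further ones.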
If a given process reveals attracting from some distance and repelling from another, the tendency to clustering is not clear and depends on the applied time perspective. However, if there is only attracting (without repelling), then at any time scale we shall see the increased clustering. Thus it is natural to say that
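Definition 2. The signal process obeys the law of series if the following two conditions hold: 1. the signals do not repel each other from any distance \(t>0\ ,\) 2. the signals attract each other from at least one distance \(t>0\ .\)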
With this definition, all speculations about the law of series in a perfectly independent process are ruled out. However, in practice perfect independence of arriving signals never happens. It is only approximate; in the universe there exist residual connections between any pair of events. Any reasonable interpretation of the mathematical law of series should postulate that these residual dependencies cannot generate repelling, while they can, and in many cases do, generate attracting of occurrences for certain events. This is exactly what the theorems below say, which can be regarded as an (at least partial) answer to question 2 in the general debate. They support the hypothesis that in processes occurring in reality (at least in their mathematical models) attracting is a much more common phenomenon than repelling. Moreover, strong attracting prevails for some types of events.
There are so far three major results in this direction. They concern signal processes associated with ergodic measure preserving transformations (here referred to as master processes). In the master process one fixes a measurable set \(B\) of small probability (a so-called rare event) and obtains a signal process by letting the signals be the occurrences of the event \(B\) in the realization of the master process. This signal process is homogeneous and has intensity \(\lambda\) equal to the measure of \(B\) (by the ergodic theorem). Although such a signal process has discrete time, for events of very small probability the signals are so rare that the increment of time becomes relatively very small and the time can be considered continuous. Two theorems describe the behavior of such signal processes when the event \(B\) is a cylinder set, i.e., the occurrence of a fixed finite sequence of symbols (a word) in the symbolic representation of the master process generated by a finite partition. The last result concerns slightly larger events, better suited to modeling some types of real experiments.
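The construction of a signal process from a cylinder set can be sketched in a few lines. The example below is only an illustration under stated assumptions: the master process is taken to be a sequence of independent fair coin flips written as the symbols 0 and 1, the word is an arbitrary choice of length 7, and the function names are hypothetical. The signals are the positions at which the word occurs, and the empirical intensity is compared with the measure of the cylinder, here \(2^{-7}\).

```python
import random

def occurrences(sequence, word):
    """Start positions of all (possibly overlapping) occurrences of `word`."""
    n = len(word)
    return [i for i in range(len(sequence) - n + 1) if sequence[i:i + n] == word]

def return_times(positions):
    """Gaps between consecutive occurrences, in units of the discrete time step."""
    return [b - a for a, b in zip(positions, positions[1:])]

# Assumed master process: independent fair coin flips.
rng = random.Random(0)
sequence = "".join(rng.choice("01") for _ in range(200_000))
word = "1101001"                         # arbitrary cylinder set B of length 7

hits = occurrences(sequence, word)
intensity = len(hits) / len(sequence)    # empirical estimate of the measure of B
print(f"empirical intensity {intensity:.5f}, measure of B {2 ** -len(word):.5f}")
print(f"first few return times: {return_times(hits)[:5]}")
```

For longer words the occurrences become rarer relative to the unit time step, which is the regime in which the discrete-time signal process is treated as continuous.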
Theorem 1. (Downarowicz and Lacroix, 2006) Consider an ergodic measure preserving transformation \((X,\mu, T)\) and a finite measurable partition \(\mathcal P\) of \(X\ .\) If the corresponding symbolic system is not deterministic (i.e., has positive entropy) then for every \(\epsilon >0\) the joint measure of all cylinders (words) of length \(n\) which reveal repelling (for any \(t\)) with intensity exceeding \(\epsilon\) converges to zero as \(n\) tends to infinity.
Interpretation: The majority of sufficiently long words do not reveal repelling (other than marginal). Note that Theorem 1 says nothing about attracting. It only claims that repelling decays as \(n\) tends to infinity. This corresponds to postulate 1 in Definition 2 of the law of series.
Theorem 2. (Downarowicz, Lacroix and Leandri, preprint) In every measure preserving system \((X,\mu, T)\) the symbolic system associated with a finite partition \(\mathcal P\) of \(X\) has, for a typical (in the sense of category) such partition, the following property: There exists a subset of natural numbers of upper density 1, such that all cylinders associated to words of lengths \(n\) from this subset reveal attracting with intensity close to 1.
Interpretation: Although neutral processes cannot be theoretically eliminated (the Poisson process exists), no process in reality fits this perfectly independent model precisely. Perfect independence is only theoretical, and in practice only approximate. This leaves room for non-neutral behavior in nearly any process. This non-neutral behavior happens to be attracting. This corresponds to postulate 2 in Definition 2. For example, consider a perfectly independent process with finitely many states (e.g., the process of flipping a coin). Clearly, occurrences of any long word form a Poisson process, hence are neutral. Now perturb the generating partition slightly. Typically the new process will reveal strong attracting for all words of "special" lengths belonging to a rather large set of integers.
The above theorems have a certain weakness: they deal with signals which are repetitions of one and exactly the same long word. In reality, one is more interested in occurrences of similar events, rather than repetitions of precisely the same event. For example, when noting the repetitions of some meteorological phenomenon, say, a tornado in Denton County, Texas, every tornado differs from the others in many parameters. Yet, they are classified as "the same" event: a tornado in Denton County, Texas. This leads to studying the occurrences of events consisting not of one, but of several united cylinders (we will call them composite events). The problem becomes seriously difficult when the number of added cylinders grows exponentially with their length. A special but very natural case of a composite event of this kind occurs when one agrees to identify all cylinders (words) which differ from a specific word \(B\) on a small percentage of coordinates. That is to say, the event is a ball with respect to the Hamming distance in the space of words of a certain length. Such a model is well suited to the type of experiment in which the observed event is positively recognized whenever it is sufficiently similar to some master pattern. The following theorem deals with this case.
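A composite event of this kind is easy to make concrete. The sketch below is an illustration under assumptions introduced here: a binary alphabet, a short arbitrary word of length 8, and the convention that a word belongs to the ball when it differs from the center on at most a fraction \(\delta\) of coordinates (the source says only "a small percentage"). The function names are hypothetical, and the brute-force enumeration is feasible only for short words.

```python
from itertools import product

def hamming(u, v):
    """Number of coordinates at which two equal-length words differ."""
    if len(u) != len(v):
        raise ValueError("words must have equal length")
    return sum(a != b for a, b in zip(u, v))

def in_ball(word, center, delta):
    """True if `word` differs from `center` on at most a fraction delta of coordinates."""
    return hamming(word, center) <= delta * len(center)

def ball(center, delta, alphabet="01"):
    """All words over `alphabet` in the delta-ball around `center` (short words only)."""
    candidates = ("".join(w) for w in product(alphabet, repeat=len(center)))
    return [w for w in candidates if in_ball(w, center, delta)]

center = "10110100"                   # arbitrary master pattern of length 8
members = ball(center, 0.25)          # up to 2 mismatches allowed
print(len(members), "words are identified with", center)
```

The number of cylinders united in the ball grows quickly with the word length for a fixed \(\delta\), which is the source of the difficulty mentioned above.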
Theorem 3. (Downarowicz, Grzegorek and Lacroix, preprint) For every measure preserving system \((X, \mu, T)\) of positive entropy, and every sufficiently small \(\delta>0\) (depending only on the cardinality of the partition) the symbolic system associated with a finite partition \(\mathcal P\) of \(X\) has, for a typical (in the sense of category) such partition, the following property: There exists a subset of natural numbers of upper density 1, such that for every word \(B\) of length \(n\) from this subset the composite event \(B^\delta\) (the \(\delta\)-ball around \(B\) in the Hamming distance) reveals attracting with intensity close to 1.
It must be understood that the above theorems apply to a rather limited variety of events in reality. First, not all events can be described as words with respect to some partition or as balls in the Hamming distance around some words. Most events have the structure of complicated unions of many dissimilar cylinders of different lengths. Second, even if an event is a single word, the theorems require that the word be very long. Certainly they do not govern the single-number outcomes of roulette spins or of multi-number lotto drawings, or even incidental repetitions of names of people (several letters in length) encountered by someone during his life.
Nonetheless, these theorems can be applied to some types of phenomena, for example in genetics, computer science, or data transmission, where one deals with really long strings of symbols. Apart from such possible applications, these results have a philosophical meaning: even if the observed process is believed to be completely independent, due to tiny imperfections of the independence, the occurrences of events of some kind do obey the law of series. Thus, the law of series is not merely an illusion or an unexplained paranormal phenomenon, but a rigorous statistical law. It is now a matter of further investigation (not speculation) to extend the range of its applicability.
Internal references
Jean Moisset: The Power of Mind
Robert Matthews: The laws of freak chance