Random binary tree

In computer science and probability theory, a random binary tree is a binary tree selected at random from some probability distribution on binary trees. Different distributions have been used, leading to different properties for these trees.

Random binary trees have been used for analyzing the average-case complexity of data structures based on binary search trees. For this application it is common to use random trees formed by inserting nodes one at a time according to a random permutation. Adding and removing nodes directly in a random binary tree will in general disrupt its random structure, but the treap and related randomized binary search tree data structures use the principle of binary trees formed from a random permutation in order to maintain a balanced binary search tree dynamically as nodes are inserted and deleted.

Other distributions on random binary trees include the uniform discrete distribution in which all distinct trees are equally likely, distributions on a given number of nodes obtained by repeated splitting, and trees generated by Galton–Watson processes, for which (unlike the other models) the number of nodes in the tree is not fixed.

For random trees that are not necessarily binary, see random tree.

Background

A binary tree is a rooted tree in which each node may have up to two children (the nodes directly below it in the tree), and those children are designated as being either left or right. It is sometimes convenient instead to consider extended binary trees in which each node is either an external node with zero children, or an internal node with exactly two children. A binary tree that is not in extended form may be converted into an extended binary tree, with all of the original nodes converted into the internal nodes of the extended binary tree, by adding additional external nodes as children of the nodes of the given tree, so that after this addition the internal nodes all have exactly two children. In the other direction, an extended binary tree with at least one internal node may be converted back into a non-extended binary tree by removing all its external nodes. In this way, these two forms are almost entirely equivalent for the purposes of mathematical analysis, except that the extended form allows a tree consisting of a single external node, which does not correspond to anything in the non-extended form. For the purposes of computer data structures, the two forms differ, as the external nodes of the first form may be represented explicitly as objects in a data structure.[1]
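
As a small illustration of this correspondence, the sketch below (in Python, using a tuple representation (key, left, right) of my own choosing, not from the article) converts a binary tree to its extended form by adding explicit external nodes, and back again:

```python
EXTERNAL = "external"


def extend(node):
    """Convert a binary tree (nested tuples, None for a missing child) into extended form."""
    if node is None:
        return EXTERNAL
    key, left, right = node
    return (key, extend(left), extend(right))


def contract(node):
    """Remove the external nodes again, recovering the non-extended tree."""
    if node == EXTERNAL:
        return None
    key, left, right = node
    return (key, contract(left), contract(right))


tree = (2, (1, None, None), None)          # node 2 with a single (left) child
assert contract(extend(tree)) == tree
```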

When the internal nodes of a binary tree are labeled by ordered keys of some type (such as distinct numbers), and the inorder traversal of the tree produces the sorted sequence of keys, the result is a binary search tree. Generally, the external nodes of such a tree remain unlabeled.[2] Binary trees may also be studied with all nodes unlabeled, or with labels that are not given in sorted order. For instance, the Cartesian tree data structure uses labeled binary trees that are not necessarily binary search trees.[3]

A random binary tree is a random tree drawn from a certain probability distribution on binary trees. In many cases, these probability distributions are defined using a given set of keys, and describe the probabilities of binary search trees having those keys. However, other distributions are possible, not necessarily generating binary search trees, and not necessarily giving a fixed number of nodes.[4]

From random permutations

For any set of $n$ numbers (or, more generally, values from some total order), one may form a binary search tree in which each number is inserted in sequence as a leaf of the tree, without changing the structure of the previously inserted numbers. The position into which each number should be inserted is uniquely determined by a binary search in the tree formed by the previous numbers. In the random permutation model of random binary trees, the insertion order is chosen uniformly at random, so that each of the $n!$ permutations of the given numbers is equally likely.

For instance, if the three numbers (1,3,2) are inserted into a tree in that sequence, the number 1 will sit at the root of the tree, the number 3 will be placed as its right child, and the number 2 as the left child of the number 3. There are six different permutations of the numbers (1,2,3), but only five trees may be constructed from them. That is because the permutations (2,1,3) and (2,3,1) form the same tree. Thus, this tree has probability $2/6 = 1/3$ of being generated, whereas the other four trees each have probability $1/6$.
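
The following short Python sketch (not part of the original article; the tuple representation is an assumption) makes this example concrete by building the binary search tree for every permutation of (1, 2, 3) and counting how often each distinct shape appears:

```python
from itertools import permutations


def insert(node, key):
    """Insert key into a BST represented as nested tuples (key, left, right)."""
    if node is None:
        return (key, None, None)
    k, left, right = node
    if key < k:
        return (k, insert(left, key), right)
    return (k, left, insert(right, key))


def build_bst(keys):
    node = None
    for key in keys:
        node = insert(node, key)
    return node


counts = {}
for perm in permutations((1, 2, 3)):
    tree = build_bst(perm)
    counts[tree] = counts.get(tree, 0) + 1

# Five distinct shapes appear; the tree rooted at 2 arises from two permutations
# (probability 2/6 = 1/3), each of the other four from one permutation (probability 1/6).
for tree, count in counts.items():
    print(tree, count / 6)
```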

Expected depth of a node

For any fixed choice of a value $x$ in a given set of $n$ numbers, if one randomly permutes the numbers and forms a binary tree from them as described above, the expected value of the length of the path from the root of the tree to $x$ is at most $2\ln n + O(1)$, where "$\ln$" denotes the natural logarithm function and the $O$ introduces big O notation. This follows because the expected number of ancestors of $x$ is by linearity of expectation equal to the sum, over all other values $y$ in the set, of the probability that $y$ is an ancestor of $x$. And a value $y$ is an ancestor of $x$ exactly when $y$ is the first element to be inserted from the elements in the interval $[x,y]$. Thus, the values that are adjacent to $x$ in the sorted sequence of values have probability $1/2$ of being an ancestor of $x$, the values one step away have probability $1/3$, etc. Adding these probabilities for all positions in the sorted sequence gives twice a harmonic number, leading to the bound above. A bound of this form holds also for the expected search length of a path to a fixed value $x$ that is not part of the given set.[5]
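
A rough simulation of this argument can be written as follows (illustrative only; the size $n = 1000$ and the chosen target key are arbitrary). It counts ancestors of a fixed key using the characterization above, namely that a previously inserted key becomes an ancestor exactly when it is the first key inserted from the interval between itself and the target:

```python
import math
import random


def depth_of(keys, target):
    """Number of ancestors of `target` in the BST built by inserting `keys` in order."""
    depth = 0
    low, high = -math.inf, math.inf    # open interval of keys that could still become ancestors
    for key in keys:
        if key == target:
            return depth
        if low < key < high:           # key is the first one inserted between itself and target
            depth += 1
            if key < target:
                low = key
            else:
                high = key
    return depth


n, trials = 1000, 2000
target = n // 2
total = 0
for _ in range(trials):
    keys = list(range(1, n + 1))
    random.shuffle(keys)
    total += depth_of(keys, target)

harmonic = sum(1 / k for k in range(1, n + 1))
print("average depth of target:", total / trials)
print("2 * H_n upper bound    :", 2 * harmonic)
```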

The longest path

Although not as easy to analyze as the average path length, there has also been much research on determining the expectation (or high probability bounds) of the length of the longest path in a binary search tree generated from a random insertion order. This length, for a tree with $n$ nodes, is almost surely

$$\frac{1}{\beta}\ln n \approx 4.311\ln n,$$

where $\beta$ is the unique number in the range $0<\beta<1$ satisfying the equation

$$2\beta e^{1-\beta} = 1.$$[6]

Expected number of leaves

In the random permutation model, each of the numbers from the set of numbers used to form the tree, except for the smallest and largest of the numbers, has probability $1/3$ of being a leaf in the tree, because it is a leaf exactly when it is inserted after its two neighbors, and any of the six orderings of it and these two neighbors are equally likely. By similar reasoning, the smallest and largest of the numbers have probability $1/2$ of being a leaf. Therefore, the expected number of leaves is the sum of these probabilities, which for $n \ge 2$ is exactly $(n+1)/3$.[7]
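
This claim can be checked numerically with a small, illustrative simulation (the parameters and the tuple representation are my own choices):

```python
import random


def insert(node, key):
    if node is None:
        return (key, None, None)
    k, left, right = node
    if key < k:
        return (k, insert(left, key), right)
    return (k, left, insert(right, key))


def count_leaves(node):
    if node is None:
        return 0
    _, left, right = node
    if left is None and right is None:
        return 1
    return count_leaves(left) + count_leaves(right)


n, trials = 200, 500
total = 0
for _ in range(trials):
    keys = list(range(n))
    random.shuffle(keys)
    tree = None
    for key in keys:
        tree = insert(tree, key)
    total += count_leaves(tree)

print("average number of leaves:", total / trials)
print("(n + 1) / 3             :", (n + 1) / 3)
```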

Strahler number

The Strahler number of a tree is a more sensitive measure of the distance from a leaf, in which a node has Strahler number $i$ whenever it has either a child with that number or two children with number $i-1$ (a leaf has Strahler number one). For $n$-node random binary search trees, simulations suggest that the expected Strahler number is $\log_2 n + O(1)$. However, only the upper bound has actually been proven.[8]
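
A direct way to compute this quantity, assuming the same tuple representation of binary trees as in the earlier sketches, is:

```python
def strahler(node):
    """Strahler number of a binary tree given as nested tuples (key, left, right)."""
    if node is None:                 # a missing child contributes nothing
        return 0
    _, left, right = node
    a, b = strahler(left), strahler(right)
    return a + 1 if a == b else max(a, b)


# A three-node path has Strahler number 1; a balanced three-node tree has 2.
print(strahler((1, None, (2, None, (3, None, None)))))   # 1
print(strahler((2, (1, None, None), (3, None, None))))   # 2
```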

Treaps and randomized binary search trees

In applications of binary search tree data structures, it is rare for the values in the tree to be inserted without deletion in a random order, limiting the direct applications of random binary trees. However, algorithm designers have devised data structures that allow insertions and deletions to be performed in a binary search tree, at each step maintaining as an invariant the property that the shape of the tree is a random variable with the same distribution as a random binary search tree.

If a given set of ordered numbers is assigned numeric priorities (distinct numbers unrelated to their values), these priorities may be used to construct a Cartesian tree for the numbers, a binary tree that has as its inorder traversal sequence the sorted sequence of the numbers and that is heap-ordered by priorities. Although more efficient construction algorithms are known, it is helpful to think of a Cartesian tree as being constructed by inserting the given numbers into a binary search tree in priority order. Thus, by choosing the priorities either to be a set of independent random real numbers in the unit interval, or by choosing them to be a random permutation of the numbers from $1$ to $n$ (where $n$ is the number of nodes in the tree), and by maintaining the heap ordering property using tree rotations after any insertion or deletion of a node, it is possible to maintain a data structure that behaves like a random binary search tree. Such a data structure is known as a treap or a randomized binary search tree.[9]
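
The following condensed treap sketch (names and structure are my own, not taken from the cited papers) shows the idea: each node stores a random priority, keys keep binary search tree order, and rotations restore the heap order on priorities after an insertion:

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()   # random priority from the unit interval
        self.left = None
        self.right = None


def rotate_right(node):
    left = node.left
    node.left, left.right = left.right, node
    return left


def rotate_left(node):
    right = node.right
    node.right, right.left = right.left, node
    return right


def insert(node, key):
    """BST insertion by key, then rotations to restore the (max-)heap order on priorities."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
        if node.left.priority > node.priority:
            node = rotate_right(node)
    else:
        node.right = insert(node.right, key)
        if node.right.priority > node.priority:
            node = rotate_left(node)
    return node


# Whatever order the keys arrive in, the shape is determined by the random priorities,
# so it is distributed like a random binary search tree.
root = None
for key in [5, 2, 8, 1, 9, 3]:
    root = insert(root, key)
```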

Uniformly random binary trees

The number of binary trees with $n$ nodes is a Catalan number: for $n = 1, 2, 3, \dots$ these numbers of trees are

1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, ... (sequence A000108 in the OEIS).

Thus, if one of these trees is selected uniformly at random, its probability is the reciprocal of a Catalan number. Trees generated from this distribution are sometimes called random binary Catalan trees.[4] They have expected depth proportional to the square root of $n$, rather than to the logarithm.[10] The expected Strahler number of a uniformly random $n$-node binary tree is $\log_4 n + O(1)$, lower than the expected Strahler number of random binary search trees.[11] In some cases the analysis of random binary trees under the random permutation model can be automatically transferred to the uniform model.[12]
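
A short check of these counts, using Python's math.comb (the helper name is my own), computes the Catalan numbers and the uniform probability of a single tree shape:

```python
from math import comb


def catalan(n):
    """C_n = binom(2n, n) / (n + 1), the number of binary trees with n nodes."""
    return comb(2 * n, n) // (n + 1)


print([catalan(n) for n in range(1, 11)])   # 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796
print(1 / catalan(3))                       # each of the five 3-node trees has probability 0.2
```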

Due to their large heights, this model of equiprobable random trees is not generally used for binary search trees. However, it has other applications, including the analysis of parse trees of algebraic expressions in compiler design[13] and of the number of registers needed to evaluate such expressions,[14] the modeling of river networks,[15] and the study of random cladograms in phylogenetics.[16]

Galton–Watson process

The Galton–Watson process describes a family of distributions on trees in which the number of children at each node is chosen randomly, independently of other nodes. For binary trees, two versions of the Galton–Watson process are in use, differing only in whether a binary tree with an external root node is allowed:

  • In the version where the root node may be external, it is chosen to be internal with some specified probability $p$ or external with probability $1-p$. If it is internal, its two children are trees generated recursively by the same process.
  • In the version where the root node must be internal, its left and right children are determined to be internal with probability $p$ or external with probability $1-p$, independently of each other. In the case where they are internal, they are the roots of trees that are generated recursively by the same process.

Trees generated in this way have been called binary Galton–Watson trees. In the special case where $p=\tfrac12$ they are called critical binary Galton–Watson trees.[17] This probability marks a phase transition for the binary Galton–Watson process: for $p\le\tfrac12$ the resulting tree is almost certainly finite, whereas for $p>\tfrac12$ it is infinite with positive probability.
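
A recursive sketch of the generation process (my own illustration, for the version in which the root must be internal; None marks an external node) looks like this:

```python
import random


def galton_watson(p):
    """Generate a tree whose root is internal; each child is internal with probability p."""
    left = galton_watson(p) if random.random() < p else None    # None marks an external node
    right = galton_watson(p) if random.random() < p else None
    return (left, right)


def size(node):
    """Number of internal nodes."""
    if node is None:
        return 0
    return 1 + size(node[0]) + size(node[1])


random.seed(0)
trees = [galton_watson(0.4) for _ in range(1000)]    # subcritical, so the trees stay small
print(sum(size(t) for t in trees) / len(trees))      # close to 1 / (1 - 2 * 0.4) = 5
```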

Another way to generate the same trees is to make a sequence of coin flips, with probability $p$ of heads and probability $1-p$ of tails, until the first flip at which the number of tails exceeds the number of heads (for the model in which an external root is allowed) or exceeds one plus the number of heads (when the root must be internal), and then use this sequence of coin flips to determine the choices made by the recursive generation process, in depth-first order.[18] Because the number of internal nodes equals the number of heads in this coin flip sequence, all trees with a given number of nodes are generated from (unique) coin flip sequences of the same length, and are equally likely, regardless of $p$. That is, the choice of $p$ affects the variation in the size of trees generated by this process, but for a given size the trees are generated uniformly at random.[19] Smaller values of $p$ below the critical probability will produce trees with a smaller expected size, while larger values of $p$ will produce trees with a larger expected size. For the critical probability $p=\tfrac12$ there is no finite bound on the expected size of trees generated by this process.
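
The coin-flip construction can also be sketched directly (again illustrative; the stopping rule shown is the one for the model in which an external root is allowed):

```python
import random


def flip_sequence(p):
    """Flip coins (True = heads) until the number of tails first exceeds the number of heads."""
    flips, heads, tails = [], 0, 0
    while tails <= heads:
        is_heads = random.random() < p
        flips.append(is_heads)
        heads += is_heads
        tails += not is_heads
    return flips


def decode(flips):
    """Read the flips in depth-first order: heads opens an internal node, tails is external."""
    if not flips.pop(0):
        return None
    left = decode(flips)
    right = decode(flips)
    return (left, right)


random.seed(1)
flips = flip_sequence(0.4)
tree = decode(list(flips))
print(len(flips), sum(flips))   # sequence length, and number of heads = number of internal nodes
```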

Galton–Watson processes were originally developed to study the spread and extinction of human surnames, and have been widely applied more generally to the dynamics of human or animal populations. They have been generalized to models where the probability of a node being internal or external at a given level of the tree (a generation, in the population dynamics application) is not fixed, but depends on the number of nodes at the previous level.[20]

Binary tries

Another form of binary tree, the binary trie or digital search tree, has a collection of binary numbers labeling some of its external nodes. The internal nodes of the tree represent prefixes of their binary representations that are shared by two or more of the numbers. The left and right children of an internal node are obtained by extending the corresponding prefix by one more bit, a zero or a one bit respectively. If this extension does not match any of the given numbers, or it matches only one of them, the result is an external node; otherwise it is another internal node. Random binary tries have been studied, for instance for sets of $n$ random real numbers generated independently in the unit interval. Despite the fact that these trees may have some empty external nodes, they tend to be more balanced than random binary search trees. More precisely, for uniformly random real numbers in the unit interval, or more generally for any square-integrable probability distribution on the unit interval, the average depth of a node is asymptotically $\log_2 n$, and the average height of the whole tree is asymptotically $2\log_2 n$. The analysis of these trees can be applied to the computational complexity of trie-based sorting algorithms.[21]
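
A compact sketch of this construction (the representation and helper names are my own) splits a set of random numbers on successive binary digits:

```python
import random


def build_trie(values, depth=0):
    """External node: None (empty) or a single number; internal node: a pair (left, right)."""
    if len(values) <= 1:
        return values[0] if values else None
    # Split on the next binary digit of each number (digit depth + 1 after the binary point).
    zeros = [v for v in values if int(v * 2 ** (depth + 1)) % 2 == 0]
    ones = [v for v in values if int(v * 2 ** (depth + 1)) % 2 == 1]
    return (build_trie(zeros, depth + 1), build_trie(ones, depth + 1))


def height(node):
    if not isinstance(node, tuple):
        return 0
    return 1 + max(height(node[0]), height(node[1]))


random.seed(2)
values = [random.random() for _ in range(1024)]
print(height(build_trie(values)))   # typically near 2 * log2(1024) = 20
```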

Random split trees

Luc Devroye and Paul Kruszewski describe a recursive process for constructing random binary trees with $n$ nodes. It generates a real-valued random variable $x$ in the unit interval $(0,1)$, assigns the first $xn$ nodes (rounded down to an integer number of nodes) to the left subtree, the next node to the root, and the remaining nodes to the right subtree. Then, it continues recursively using the same process in the left and right subtrees. If $x$ is chosen uniformly at random in the interval, the result is the same as the random binary search tree generated by a random permutation of the nodes, as any node is equally likely to be chosen as root. However, this formulation allows other distributions to be used instead. For instance, in the uniformly random binary tree model, once a root is fixed each of its two subtrees must also be uniformly random, so the uniformly random model may also be generated by a different choice of distribution (depending on $n$) for $x$. As they show, by choosing a beta distribution on $x$ and by using an appropriate choice of shape to draw each of the branches, the mathematical trees generated by this process can be used to create realistic-looking botanical trees.[22]
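
The splitting process can be sketched as follows (a paraphrase of the construction described above, not the authors' code; with a uniform split variable it reproduces the random binary search tree model):

```python
import random


def split_tree(n):
    """Return a tree shape as nested (left, right) pairs; None stands for an empty subtree."""
    if n == 0:
        return None
    x = random.random()          # a uniform split variable reproduces the random BST model
    left_size = int(x * n)       # floor(x * n) nodes go left; since x < 1, this is at most n - 1
    return (split_tree(left_size), split_tree(n - 1 - left_size))


def height(node):
    if node is None:
        return 0
    return 1 + max(height(node[0]), height(node[1]))


random.seed(3)
samples = [height(split_tree(1000)) for _ in range(20)]
print(sum(samples) / len(samples))   # compare with the leading term 4.311 * ln(1000) ≈ 30 above
```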

Notes

  1. Knuth (1997).
  2. Knuth (1973).
  3. Vuillemin (1980).
  4. Sedgewick & Flajolet (2013).
  5. Hibbard (1962); Knuth (1973); Mahmoud (1992), p. 75.
  6. Robson (1979); Pittel (1985); Devroye (1986); Mahmoud (1992), pp. 91–99; Reed (2003).
  7. Brown & Shubert (1984).
  8. Kruszewski (1999).
  9. Martínez & Roura (1998); Seidel & Aragon (1996).
  10. Knuth (2005), p. 15.
  11. Devroye & Kruszewski (1995).
  12. Mahmoud (1992), p. 70.
  13. Mahmoud (1992), p. 63.
  14. Flajolet, Raoult & Vuillemin (1979).
  15. Shreve (1966).
  16. Aldous (1996).
  17. Burd, Waymire & Winn (2000).
  18. For the connection between trees and random walks (as generated by random coin flips) see e.g. Section 6, "Walks and trees", pp. 483–486, of Harris (1952).
  19. Broutin, Devroye & Fraiman (2020). More generally, every Galton–Watson process, conditioned on producing trees of a certain size, produces the same probability distribution as a critical Galton–Watson process: see section 2 of Kennedy (1975).
  20. Jagers (2011).
  21. Devroye (1984).
  22. Devroye & Kruszewski (1996).

