In computer science, a measure of presortedness of a sequence represents how much work is required to sort the sequence. If the sequence is pre-sorted, sorting the sequence entirely require little work, hence it is expected to have a small measure of presortedness. In particular, the measure of a sorted sequence is 0.
Some sorting algorithms are more efficient on pre-sorted list, as they can use this pre-work into account to avoid duplicate work. The measure of presortedness allows to formalize the notion that an algorithm is optimal for a certain measure.
Let [math]\displaystyle{ \mathbb N^{*} }[/math] represents finite sequence of natural numbers. Let [math]\displaystyle{ m:\mathbb N^*\to\mathbb N }[/math] a function sending sequence of numbers to non-negative numbers. This function is said to be a measure of presortedness if it satisfies the following axioms:
Those axioms are similar to the axioms of measure theory. However, a measure of presortedness is not a measure as in measure theory, since it is not defined over a set but over sequence.
It is well known that an optimal comparison sort requires to compare [math]\displaystyle{ O(n\log n) }[/math] pairs of elements in a sequence. That is, the decision tree complexity is [math]\displaystyle{ O(n\log n) }[/math]. However, when more informations is known about how much a sequence is presorted, more optimal bounds can be devised. This notion is now formalized.
Given a measure of pre-sortedness [math]\displaystyle{ m }[/math], a sequence [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math], [math]\displaystyle{ z\ge m(X) }[/math], let [math]\displaystyle{ \mathtt{below}(z,X, m) }[/math] be the set of permutations of [math]\displaystyle{ X }[/math] whose measure is at most [math]\displaystyle{ Z }[/math]. Then let [math]\displaystyle{ \mathtt{below}(X, m)=\mathtt{below}(m(X), X, m) }[/math] the set of permutations whose measure is at most the measure of [math]\displaystyle{ X }[/math].
A sorting algorithm [math]\displaystyle{ S }[/math] which takes as input [math]\displaystyle{ X }[/math] and [math]\displaystyle{ z }[/math] is said to be weakly [math]\displaystyle{ m }[/math]-optimal if its time complexity is [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(z, X, m)|)) }[/math]. That is, the number of operations is logarithmic in the number of possible permutations whose measure is at most [math]\displaystyle{ z }[/math]. This is optimal in the sens that simply selecting an element in a set of size [math]\displaystyle{ N }[/math] requires [math]\displaystyle{ \log(N) }[/math] bits of information.
Similarly, a sorting algorithm which takes as input [math]\displaystyle{ X }[/math] is [math]\displaystyle{ m }[/math]-optimal if its time complexit is [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(z, X, m)|)) }[/math]. When [math]\displaystyle{ m(X) }[/math] is computable in linear time, a weakly [math]\displaystyle{ m }[/math]-optimal algorithm is automatically [math]\displaystyle{ m }[/math]-optimal. However, if [math]\displaystyle{ m }[/math] is more complex, it is possible that computing [math]\displaystyle{ m(X) }[/math] is actully more costly than sorting the sequence in the first place, and thus only an upper bound [math]\displaystyle{ z }[/math] of [math]\displaystyle{ m(X) }[/math] can be used.
Similarly, a decision tree is weakly [math]\displaystyle{ m }[/math]-optimal if its complexity is [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(z, X, m)|)) }[/math] and it is [math]\displaystyle{ m }[/math]-optimal if its complexity [math]\displaystyle{ O(\max(n, \log|\mathtt{below}(X, m)|)) }[/math].
For each measure [math]\displaystyle{ m }[/math], there exists a weakly [math]\displaystyle{ m }[/math]-optimal decision tree.
If the decision tree complexity of [math]\displaystyle{ m }[/math] is [math]\displaystyle{ O(\max(|X|, \log|\mathtt{below}(X, m)|)) }[/math], there exists an [math]\displaystyle{ m }[/math]-optimal decision tree. This decision tree's root compute [math]\displaystyle{ m(X) }[/math]. Once [math]\displaystyle{ m(X) }[/math] is computed in a node, the weakly [math]\displaystyle{ m }[/math]-optimal decision tree for [math]\displaystyle{ m(X) }[/math] is appended at this node.
The existence of the tree does not implie that there exists (weakly) [math]\displaystyle{ m }[/math]-optimal sorting algorithm, as computing the decision tree may be costly.
Here is a list of measures of presortedness [math]\displaystyle{ m }[/math] and related (weakly) [math]\displaystyle{ m }[/math]-optimal algorithms. Let [math]\displaystyle{ X=\langle x_1,\dots,x_n\rangle }[/math] be fixed for the remaining of the section.
Let us introduce a sorting algorithm that will be optimal for multiple measure. Let's call local insertion sort the algorithm that consists in iterating on the sequence and adding each element in a finger tree.
The constant function 0 is the most trivial measure of presortedness. Any sorting function running in [math]\displaystyle{ O(n\log n) }[/math]-time, such as heapsort and mergesort is [math]\displaystyle{ 0 }[/math]-optimal.
The indicator function [math]\displaystyle{ m_{01} }[/math] of sorted sequence is a measure of sortedness. This function sends sorted sequences to 0 and all other sequences to 1. Any algorithm that check whether a list is sorted, and if it is not sorted apply a [math]\displaystyle{ 0 }[/math]-optimal sorting algorithm is [math]\displaystyle{ m_{01} }[/math]-optimal.
The number of runs, [math]\displaystyle{ \mathtt{Runs}(X) }[/math], is the number of increasing sequences in [math]\displaystyle{ X }[/math] minus one. It is equivalently defined as the number of [math]\displaystyle{ i }[/math] such that [math]\displaystyle{ x_{i+1}\lt x_i }[/math]. is a measure of presortedness.
Natural merge sort, which consists in using existing runs and merging them, is a [math]\displaystyle{ \mathtt{Runs} }[/math]-optimal algorithm. The local insertion sort is also Runs-optimal.
The number of inversion in [math]\displaystyle{ X }[/math], [math]\displaystyle{ \mathtt{Inv}(X) }[/math], is the number of pairs [math]\displaystyle{ i\lt j }[/math] such that [math]\displaystyle{ x_i\gt x_j }[/math]. It is a measure of presortedness and the local insertion sort is Inv-optimal.
The mininimum number [math]\displaystyle{ \mathtt{Rem}(X) }[/math] of elements that need to be removed from [math]\displaystyle{ X }[/math] to obtain a sorted sequences. It is a measure of presortedness and the local insertion sort is Rem-optimal.
The radius [math]\displaystyle{ \mathtt{Rad}(X) }[/math] of [math]\displaystyle{ X }[/math][1] is a measure of presortdness.
The following algorithm is Rad-optimal. It consists in iteratively halving the radius until it reaches 0. Halving the radius can be done by three merges of sub-sequences of length [math]\displaystyle{ \mathtt{Rad}(X) }[/math].
Given two measures of presortdness [math]\displaystyle{ m_1 }[/math] and [math]\displaystyle{ m_2 }[/math], the measure [math]\displaystyle{ X\mapsto\min(m_1(X), m_2(X)) }[/math] is also a measure of presortdness. Given two [math]\displaystyle{ m_i }[/math]-optimal sorting algorithms [math]\displaystyle{ S_i }[/math], the algorithm consisting in executing [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_2 }[/math] in parallel and returning the first output is a [math]\displaystyle{ \min(m_1,m_2) }[/math]-optimal algorithm.
Template:Data structures and algorithms