On the irreducible cost of sorting by comparison

Enuntiatio

Propositio I. Theorema I. Let A be any deterministic algorithm that sorts n distinct elements using only pairwise comparisons of the form “is x less than y?” to determine the output order. Then there exists an input upon which A performs at least ceil(log2(n!)) such comparisons. That is, the worst-case comparison count W(n) of any algorithm in the comparison model satisfies W(n) >= ceil(log2(n!)).

Corollarium. Since ceil(log2(n!)) = Theta(n log n), no comparison sort can run in worst-case time asymptotically below n log n. The cost is not an artefact of any particular method; it is a property of the comparison model itself, set by the quantity of information n! orderings demand.

Expressio

import math


def comparison_sort_lower_bound(n: int) -> int:
    """Worst-case minimum number of comparisons to sort n distinct elements
    in the comparison model: ceil(log2(n!)).

    Computed by exact summation log2(n!) = sum_{k=2}^{n} log2(k), which is
    numerically stable and avoids overflow of n! for large n.

    >>> comparison_sort_lower_bound(0)
    0
    >>> comparison_sort_lower_bound(1)
    0
    >>> comparison_sort_lower_bound(2)
    1
    >>> comparison_sort_lower_bound(3)   # log2(6) = 2.585...
    3
    >>> comparison_sort_lower_bound(4)   # log2(24) = 4.585...
    5
    >>> comparison_sort_lower_bound(5)   # log2(120) = 6.907...
    7
    >>> comparison_sort_lower_bound(100)
    525
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return 0  # 0 or 1 element: already sorted, no comparison needed
    total = 0.0
    for k in range(2, n + 1):
        total += math.log2(k)
    # log2(k) for integer k is exact only at powers of two; the running sum
    # carries rounding. ceil is robust here because log2(n!) is irrational
    # (hence bounded away from any integer) for every n >= 3, and exact for
    # n <= 2 where the sum is computed without error.
    return math.ceil(total)


def _log2_factorial_fast(n: int) -> float:
    """Alternative: log2(n!) via the log-gamma function in O(1) float ops.

    Uses lgamma(n+1) = ln(n!), converted to base 2. A tiny negative epsilon
    guards the only exact-integer cases (n <= 2), where floating error could
    otherwise round log2(n!) up past its true integer value. For n >= 3 the
    value is irrational and the guard is immaterial.
    """
    if n <= 1:
        return 0.0
    return math.lgamma(n + 1) / math.log(2.0) - 1e-9


# --- Inline demonstrations of correctness -----------------------------------
assert comparison_sort_lower_bound(3) == 3
assert comparison_sort_lower_bound(4) == 5
assert comparison_sort_lower_bound(5) == 7
assert comparison_sort_lower_bound(12) == 29
# The two methods agree across a wide range:
for _n in (2, 3, 5, 8, 13, 21, 64, 100, 500):
    assert comparison_sort_lower_bound(_n) == math.ceil(_log2_factorial_fast(_n)), _n

# The bound is an INFORMATION bound, not an algorithm: it is a floor, and a
# real sort (e.g. merge sort) lies above it. Merge sort's worst case never
# undercuts the bound:
def _merge_sort_comparisons(a):
    """Return (sorted_list, comparison_count) for a stable merge sort."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, cl = _merge_sort_comparisons(a[:mid])
    right, cr = _merge_sort_comparisons(a[mid:])
    merged, c = [], cl + cr
    i = j = 0
    while i < len(left) and j < len(right):
        c += 1                       # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:]); merged.extend(right[j:])
    return merged, c

import itertools
for _n in range(2, 8):
    worst = max(_merge_sort_comparisons(list(p))[1]
                for p in itertools.permutations(range(_n)))
    assert worst >= comparison_sort_lower_bound(_n), (_n, worst)

if __name__ == "__main__":
    import doctest
    doctest.testmod(verbose=False)
    print("ceil(log2(n!)) for n=1..10:",
          [comparison_sort_lower_bound(n) for n in range(1, 11)])

Demonstratio

We argue by the decision tree of the algorithm — the mechanism through which any comparison sort is forced to spend its comparisons.

Lemma I (the tree). Fix the deterministic algorithm A and the input size n. Run A on every possible ordering of the n distinct elements. Each comparison A makes has two outcomes (“less” or “not less”); A’s next action depends only on outcomes seen so far. We therefore model the execution as a rooted binary tree: each internal node is one comparison, its two edges are the two outcomes, and each path from the root descends by the answers A receives. A leaf is reached when A halts and emits a permutation as its answer. The number of comparisons A performs on a given input equals the length of the root-to-leaf path taken, and the worst-case count W(n) equals the height h of the tree (the longest such path).

Lemma II (every ordering needs its own leaf). There are n! distinct orderings of n distinct elements. Consider two inputs whose correct sorted outputs require different permutations to be applied. Suppose, for contradiction, both inputs reached the same leaf. A leaf emits one fixed permutation, so A would apply the identical reordering to both — but the correct outputs differ, so at least one would be left unsorted. Since A is a correct sort, this cannot happen: distinct required permutations must end at distinct leaves. Hence the tree has at least n! reachable leaves. (They are exactly n! when A wastes no comparisons, but >= n! suffices.)

Lemma III (a binary tree of height h has at most 2^h leaves). By induction on h. Height 0: a single node, 2^0 = 1 leaf. Inductive step: a binary tree of height h consists of a root with at most two subtrees, each of height at most h-1, so it has at most 2 * 2^(h-1) = 2^h leaves. This bounds leaves above by 2^h.

Composing the bound. Let L be the number of leaves and h the height. By Lemma II, L >= n!. By Lemma III, L <= 2^h. Therefore 2^h >= L >= n!, and taking base-2 logarithms (monotone), h >= log2(n!). Because h is an integer, W(n) = h >= ceil(log2(n!)). This holds for every deterministic comparison sort A, establishing the Proposition.

That ceil(log2(n!)) = Theta(n log n). We bound log2(n!) = sum_{k=1}^{n} log2(k) from both sides without recourse to Stirling’s full series.

Upper bound. Each term log2(k) <= log2(n), so sum_{k=1}^{n} log2(k) <= n log2(n). Thus log2(n!) = O(n log n).

Lower bound. Discard the smaller half of the factors and keep the top half: for the indices k in the range [n/2, n] (there are at least n/2 of them), each satisfies log2(k) >= log2(n/2) = log2(n) - 1. Hence log2(n!) >= (n/2)(log2(n) - 1) = (1/2) n log2(n) - n/2. For n >= 4 this is at least (1/4) n log2(n), so log2(n!) = Omega(n log n).

Both bounds together give ceil(log2(n!)) = Theta(n log n), confirmed numerically by the sandwich (n/2)log2(n/2) <= log2(n!) <= n log2(n) for n = 4, 16, 64, 256, 1024. Combined with the Proposition, every comparison sort costs Omega(n log n) comparisons in the worst case, and this floor is met to within a constant factor by merge sort and heapsort — so the bound is tight in order of growth.

Q.E.D.

Scholium. The bound governs the worst case of deterministic comparison sorts. It does not bind algorithms that read the keys (counting sort, radix sort), which leave the comparison model and so may run in linear time on bounded keys — they violate no hypothesis here, for they pay no comparisons. Nor does randomization escape it: an averaging argument over the input distribution yields the same Omega(n log n) lower bound on expected comparisons. The result is thus a law of the model, not of any machine — “hypotheses non fingo”: we assume only that order is learned by comparison, and the n log n floor follows by counting alone.