Theorem — Convergence (classical, Euler/Dirichlet)
Σm=1N φ(m) / Σm=1N m → 6/π² = 1/ζ(2) = ∏p(1 − 1/p²) as N → ∞
Why. By Mertens’ summatory totient theorem: Σφ(m) ~ 3N²/π². And Σm = N(N+1)/2 ~ N²/2. So the ratio ~ (3N²/π²) / (N²/2) = 6/π². The error term Er(N) ~ 2δ(N)/N² where δ(N) = Σφ(m) − 3N²/π².
Theorem — Exact step formula (provable, verified to N = 100,000)
Er increases at step M ⇔ φ(M)/M > current ratio ⇔ φ(M)/M > Σφ/Σm
Proof. After adding M: new ratio = (Σφ + φ(M)) / (Σm + M). This exceeds the old ratio iff (Σφ + φ(M))·Σm > Σφ·(Σm + M), i.e. φ(M)·Σm > Σφ·M, i.e. φ(M)/M > Σφ/Σm. □
At prime p: φ(p)/p = (p−1)/p. Once the running ratio is below (p−1)/p (which it is for all but the smallest primes), every prime pushes Er upward. This is why 9,590 out of 9,592 primes to N = 100,000 increase the ratio — the two exceptions are p = 2 and p = 3 where the ratio starts above (p−1)/p.
Theorem — Primorial minima via Mertens’ third theorem
φ(p#)/p# = ∏q≤p(1 − 1/q) ≈ e−γ / log(p) → 0
Why primorials are local minima. At M = p# (primorial), φ(M)/M takes its minimum value among all integers with the same number of prime factors. Each new prime q multiplies M by q and multiplies φ(M)/M by (1−1/q) < 1. Mertens’ third theorem says the product ∏(1−1/p) over primes up to x behaves like e−γ/log(x) as x → ∞, where γ ≈ 0.5772 is the Euler–Mascheroni constant. So φ(p#)/p# → 0, and every primorial creates a downward spike in the running ratio — a structural local minimum.
Theorem — GCD distribution (classical)
P(gcd(r, M) = g) → 6/(π² g²) = 1/(ζ(2) g²) as M → ∞
The full non-coprime structure is encoded in ζ(2): the fraction of pairs (r, M) with gcd = g converges to 6/(π²g²). Summing over all g gives Σ 6/(π²g²) = (6/π²)·ζ(2) = 1, as required. The g = 1 term is the coprime density 6/π² itself. Chart 06 shows this decomposition live.
Decay rate — known bounds, open question
Er(N) = O((log N)2/3 / N) unconditional (Walfisz 1963)
Er(N) = O(N−3/2+ε) conditional on RH
Structure. Er(N) ~ 2δ(N)/N² where δ(N) = Σφ(m) − 3N²/π². Walfisz (1963) proved δ(N) = O(N·(log N)2/3·(log log N)1/3), giving Er(N) = O((log N)2/3/N) unconditionally. Under RH, δ(N) = O(N1/2+ε), giving Er(N) = O(N−3/2+ε) — much faster than O(1/N).
Empirically. |Er|·N oscillates between 0.007 and 0.719 across windows of 10,000 steps — consistent with O((log N)2/3/N) but not settling to a clean limit. The OLS slope on log|Er| vs log N sits near −1 with R² ≈ 0.02 in the second half — the oscillation dominates the fit. Bounded |Er|·N is the stronger empirical evidence than any slope estimate.
Open question. The exact constant in the Walfisz bound is not known. Whether the empirical oscillation envelope can be sharpened analytically connects directly to the distribution of zeros of ζ(s).