Resilient Data Futures
Claim C-0023 · draft

The scholarly record itself is decaying through reference rot

§4.2 · 2026-05-03 · 2 out · 3 in

Beyond individual reproducibility failures, the loss is visible in the structural decay of the scholarly record itself. Every dead reference is a broken link between a published claim and the evidence that supported it; summed across the scholarly record, these breaks amount to a measurable decline in the degree to which research can be built upon.

  • 25% of all webpages that existed between 2013 and 2023 are already gone, rising to 38% for pages a decade old (E-0038, S-0052).
  • One in five scientific articles suffers reference rot; among articles citing web content, seven in ten have compromised scholarly context (E-0039, S-0053).
  • More than 70% of URLs cited across a sample drawn from the Harvard Law Review and two other Harvard journals between 1996 and 2012 no longer resolve to the originally cited content (E-0040, S-0054).

These figures describe decay rates, not fixed magnitudes: each reports a fraction lost over a stated interval, and the losses keep accumulating. They imply that the probability a published reference still resolves, and with it the reference's value, declines monotonically with time since publication, and that the scholarly record's net informational content is decreasing in some dimensions even as new publications add to it.
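
As a rough illustration of the rate reading, a cumulative loss figure can be converted into an implied annual rate and half-life. The sketch below assumes constant exponential decay, which the cited evidence does not itself claim, and uses only the 38%-over-a-decade figure from the list above. The function names are hypothetical.

```python
import math

def annual_decay_rate(fraction_lost: float, years: float) -> float:
    """Continuous per-year decay rate implied by a cumulative loss.

    ASSUMPTION: loss follows constant exponential decay, so the surviving
    fraction after t years is exp(-rate * t).
    """
    return -math.log(1.0 - fraction_lost) / years

def surviving_fraction(rate: float, years: float) -> float:
    """Expected fraction of references still resolving after `years`."""
    return math.exp(-rate * years)

def half_life(rate: float) -> float:
    """Years until half of the references are expected to be gone."""
    return math.log(2.0) / rate

# 38% of decade-old pages gone (figure from the evidence list above).
rate = annual_decay_rate(0.38, 10.0)

print(f"implied rate: {rate:.4f} per year")
print(f"implied half-life: {half_life(rate):.1f} years")
print(f"expected survival at 20 years: {surviving_fraction(rate, 20.0):.2f}")
```

Under that assumption the implied rate is a little under 0.05 per year, which puts the half-life of a web reference at roughly fourteen to fifteen years. Real link rot need not be exponential, so this is an order-of-magnitude reading rather than a forecast.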

The architectural cause is the same as in §3: cited content depends on continued access through whatever single copy or single platform originally hosted it. When that copy or platform fails (which it does at the C-0011 / E-0006 base rate), the citation breaks and the scholarly chain through it terminates. Tier 3 architectural alternatives (content-addressed citation, distributed archives, cryptographic snapshots) make reference durability a structural property rather than a hope.
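
To make the idea of content-addressed citation concrete, here is a minimal sketch in which a citation records a cryptographic digest of the cited bytes alongside the URL, so that any surviving copy can be verified regardless of where it is hosted. This section does not specify the Tier 3 design, so this illustrates the general technique only. The `sha256:` prefix, the function names, the example URL, and the citation record layout are all assumptions.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Identifier derived only from the bytes, not from where they are hosted."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

def verify(data: bytes, address: str) -> bool:
    """Check that bytes obtained from any copy match the cited address."""
    return content_address(data) == address

# Hypothetical snapshot of the cited content, taken at citation time.
snapshot = b"Cited page text as it read on the day of citation."

# Hypothetical citation record: the URL is a hint, the address is the reference.
citation = {
    "url": "https://example.org/report",
    "address": content_address(snapshot),
}

# A copy from any archive or mirror is acceptable if it verifies.
mirror_copy = snapshot
print(verify(mirror_copy, citation["address"]))            # unchanged copy
print(verify(snapshot + b" (edited)", citation["address"]))  # altered copy
```

The design point is that the original host becomes one possible source among many: if it disappears, the citation survives as long as any verifiable copy exists somewhere.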