Resilient Data Futures
Question · Q-0025 · draft

How is the scholarly record itself decaying through reference rot?

§4.2 · 2026-05-04 · 0 out · 1 in

A subsidiary question under Q-0003. §4.2 extends the cost-to-science argument from individual reproducibility failures to the structural decay of the scholarly record at planetary scale.

The answer is C-0023. Three measurements anchor the decay rate:

  • 25% of all webpages that existed at some point between 2013 and 2023 are no longer accessible, rising to 38% for pages that existed in 2013 (Pew).
  • One in five scientific articles suffers reference rot; among articles citing web content, seven in ten have compromised scholarly context (3.5M-article analysis).
  • More than 70% of URLs cited across a Harvard Law Review sample 1996-2012 no longer resolve to the originally cited content.

These are decay rates, not magnitudes: each figure describes the share of cited content lost over a span of time, not an absolute count of missing items. They imply that the expected availability of a web-hosted reference declines monotonically with time since publication, and that the scholarly record's net informational content is decreasing in some dimensions even as new publications add to it.
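To make the "rate, not magnitude" reading concrete, the Pew figure can be converted into an annualized loss rate and a half-life. This is only a sketch under two assumptions that are not in the source: that loss follows a constant exponential rate, and that "a decade old" means exactly 10 years. The function names are illustrative.

```python
import math


def annual_loss_rate(fraction_lost: float, years: float) -> float:
    """Constant per-year loss rate that yields `fraction_lost` after `years`.

    Assumes exponential decay, which is a modelling assumption, not a
    finding of the cited studies.
    """
    surviving = 1.0 - fraction_lost
    return 1.0 - surviving ** (1.0 / years)


def half_life_years(fraction_lost: float, years: float) -> float:
    """Years until half the pages are gone, under the same assumption."""
    surviving = 1.0 - fraction_lost
    return years * math.log(0.5) / math.log(surviving)


# Pew: 38% of pages that existed in 2013 were gone about 10 years later.
rate = annual_loss_rate(0.38, 10.0)
half_life = half_life_years(0.38, 10.0)
print(f"annualized loss rate: {rate:.1%}")
print(f"implied half-life: {half_life:.1f} years")
```

Under these assumptions the figures come out at roughly 4.7% per year and a half-life of about 14.5 years. Real link loss is unlikely to be perfectly exponential, so these numbers are an order-of-magnitude illustration rather than a measurement.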

The architectural cause is the same as in §3: cited content depends on continued access through whatever single copy or single platform originally hosted it. Tier 3 alternatives, namely content-addressed citation, distributed archives, and cryptographic snapshots (with perma.cc, the Internet Archive, and Robust Links as existing examples), produce reference durability as a structural property rather than as a hope.
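A minimal sketch of what content-addressed citation could look like: the citation records a hash of the cited bytes alongside the URL, so any surviving copy from any host can be checked against what was originally cited. The `ContentCitation` type, the `cite` and `verify` helpers, and the example URL are hypothetical and are not the API of perma.cc, the Internet Archive, or Robust Links.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentCitation:
    """Hypothetical citation record: a location plus a content fingerprint."""
    original_url: str  # where the content was found; this may rot
    sha256: str        # hex digest of the exact bytes that were cited


def cite(original_url: str, content: bytes) -> ContentCitation:
    """Create a citation bound to the content, not only to its location."""
    return ContentCitation(original_url, hashlib.sha256(content).hexdigest())


def verify(citation: ContentCitation, candidate: bytes) -> bool:
    """Check whether a copy from any source matches the cited content."""
    return hashlib.sha256(candidate).hexdigest() == citation.sha256


# Illustrative data only.
snapshot = b"Example cited passage."
citation = cite("https://example.org/paper-appendix", snapshot)

print(verify(citation, snapshot))            # same bytes from any mirror
print(verify(citation, b"Edited passage."))  # changed content is detected
```

The design point is that verification no longer depends on the original host: if the URL stops resolving, a copy held by a distributed archive can still be confirmed as the cited content. A hash alone does not preserve the content, so this complements archiving rather than replacing it.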