Resilient Data Futures
Question · Q-0010 · draft

What is the per-dataset liability summed across an institution's annual publication output?

§5.3.2 · 2026-05-04 · 0 out · 1 in

A subsidiary question under Q-0003. Where Q-0009 asks the per-dataset version, Q-0010 asks the aggregation: across an institution's full annual publication output, what does the carrying cost of unverifiable data sum to?

The answer at the representative R1 level is C-0005. Apply M-0003's four-term formula to an R1 with $200M in annual R&D that produces approximately 3,000 peer-reviewed papers per year, of which 80% (the C-0002 baseline, about 2,400 papers) carry data that cannot be produced on request. Under a full-enforcement scenario, the maximum institutional exposure is approximately $1.1 billion per year (~$574M Term A + ~$360-420M Term B + ~$172M Term C, with Term D as institutional tail risk).
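The arithmetic behind the headline figure can be checked with a short sketch. It only adds the term values quoted above; it does not reproduce M-0003's formula, whose internals are not given here. The constant and function names are hypothetical, and leaving Term D out of the sum is an assumption made because this passage gives no figure for it.

```python
# Figures quoted in the text, in millions of USD per year.
PAPERS_PER_YEAR = 3_000
UNRETRIEVABLE_FRACTION = 0.80   # C-0002 baseline

TERM_A = 574
TERM_B_RANGE = (360, 420)       # Term B is quoted as a range
TERM_C = 172
# Term D is described as institutional tail risk; no figure is given for it
# in this passage, so it is left out of the sum (an assumption of this sketch).


def exposure_range_musd():
    """Return (low, high) for A + B + C in millions of USD per year."""
    low = TERM_A + TERM_B_RANGE[0] + TERM_C
    high = TERM_A + TERM_B_RANGE[1] + TERM_C
    return low, high


affected_papers = round(PAPERS_PER_YEAR * UNRETRIEVABLE_FRACTION)
low, high = exposure_range_musd()

print(f"Papers with unretrievable data: {affected_papers}")
print(f"A + B + C: ${low}M to ${high}M per year")
```

The low end of the range is what rounds to the approximately $1.1 billion headline.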

The R1-level number scales with publication volume: smaller R1s carry proportionally smaller exposure, and larger institutions carry proportionally larger exposure. The methodology (A+B+C+D applied to the unretrievable fraction) is the contestable substance; the headline number is its visible consequence.
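The proportional scaling can be sketched as follows. It assumes exposure is strictly linear in annual paper count and that the unretrievable fraction and per-paper terms stay at the representative R1's values. The function name and the 1,500- and 6,000-paper institutions are hypothetical illustrations, not figures from the source.

```python
# Representative R1 from the text: ~3,000 papers/year, ~$1.1B/year exposure.
REFERENCE_PAPERS = 3_000
REFERENCE_EXPOSURE_MUSD = 1_100  # approximate headline figure, millions of USD


def scaled_exposure_musd(papers_per_year,
                         reference_papers=REFERENCE_PAPERS,
                         reference_exposure_musd=REFERENCE_EXPOSURE_MUSD):
    """Scale the reference exposure linearly by publication volume.

    Assumes the same unretrievable fraction and the same per-paper terms
    as the representative R1; that is a simplification for illustration.
    """
    if papers_per_year < 0:
        raise ValueError("papers_per_year must be non-negative")
    return reference_exposure_musd * papers_per_year / reference_papers


# Hypothetical institutions at half and double the reference volume.
for papers in (1_500, 3_000, 6_000):
    print(f"{papers} papers/year: ~${scaled_exposure_musd(papers):,.0f}M per year")
```

Because the scaling is a single ratio, any dispute about the result is a dispute about the reference figure, and therefore about the methodology that produced it.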