Open access multiplies the return on preserved data infrastructure
When preserved data can be shared openly, the return documented in C-0031 compounds further, because openness turns preserved data into reuse by anyone, anywhere.
- Human Genome Project and genomics: Federal investment of $14.5B from 1988 to 2012 generated $965B in economic impact (E-0069, S-0111).
- Landsat Earth observation imagery: Under the pre-2008 closed-access policy, at most 53 scenes/day were downloaded; under the post-2008 open policy, 5,700 scenes/day, with $25.6B/yr economic value (E-0061, S-0112). Same satellites; the access change unlocked the value.
- Protein Data Bank: 88% of 210 new FDA-approved drugs 2010-2016 facilitated by open PDB structures (S-0013a); 100% of 34 cancer drugs 2019-2023 (E-0050, S-0081).
- COVID-19: SARS-CoV-2 genome shared publicly January 10-11, 2020 via virological.org and GISAID; BioNTech Project Lightspeed launched January 27 (17 days later); Pfizer-BioNTech vaccines generated ~$1.9T in global economic value, part of $5.2T across all COVID-19 vaccines (E-0062, S-0113).
- EU FAIR cost: European Commission estimated cost of not having FAIR research data at minimum €10.2 billion per year across the EU (E-0072, S-0114).
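To make the size of these multipliers concrete, the ratios implied by the figures above can be worked out directly. This is a minimal sketch using plain division on the numbers quoted in the bullets; the variable names are illustrative, and the ratios are undiscounted and are not taken from the cited sources themselves.

```python
# Figures copied from the bullets above; the ratios are simple division,
# not values reported by the cited sources.

# Human Genome Project and genomics ($B)
hgp_investment = 14.5
hgp_impact = 965.0
hgp_ratio = hgp_impact / hgp_investment

# Landsat scenes downloaded per day, before and after the 2008 policy change
landsat_closed = 53
landsat_open = 5_700
landsat_ratio = landsat_open / landsat_closed

# COVID-19 vaccines ($T): Pfizer-BioNTech share of the all-vaccine total
pfizer_value = 1.9
all_vaccines_value = 5.2
pfizer_share = pfizer_value / all_vaccines_value

print(f"HGP impact per dollar invested: {hgp_ratio:.1f}x")       # 66.6x
print(f"Landsat download rate, open vs closed: {landsat_ratio:.1f}x")  # 107.5x
print(f"Pfizer-BioNTech share of vaccine value: {pfizer_share:.1%}")   # 36.5%
```

The Landsat ratio compares the quoted post-2008 rate with the quoted pre-2008 maximum, so it describes a change in download volume, not a change in dollar value.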
Distribution is an architecture for redundancy, not a policy on access. Three orthogonal techniques (client-side encryption, permissioned networks, content-addressing for integrity verification independent of access) compose: sensitive data participates in the same architecture as open data, with encryption and permission layers added on top. Health records (HIPAA), student data (FERPA), export-controlled research, and embargoed datasets each participate in this architecture today.
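The way these layers compose can be sketched in a few lines. The example below is an illustration under stated assumptions, not a description of any particular system: `StorageNode`, `content_id`, and the node and reader names are hypothetical, SHA-256 is assumed as the content-addressing hash, and the ciphertext is a stand-in for bytes that a depositor has already encrypted client-side (the encryption step itself is out of scope here).

```python
import hashlib


def content_id(blob: bytes) -> str:
    """Content address: SHA-256 hex digest of the stored bytes (assumed hash)."""
    return hashlib.sha256(blob).hexdigest()


class StorageNode:
    """Hypothetical node holding opaque blobs keyed by their content address."""

    def __init__(self, name: str, permitted_readers: set[str]):
        self.name = name
        self.permitted_readers = permitted_readers  # permission layer
        self.blobs: dict[str, bytes] = {}

    def store(self, blob: bytes) -> str:
        cid = content_id(blob)
        self.blobs[cid] = blob
        return cid

    def verify(self, cid: str) -> bool:
        # Integrity check: re-hash the stored bytes and compare to the address.
        # Needs neither the decryption key nor read permission.
        return cid in self.blobs and content_id(self.blobs[cid]) == cid

    def fetch(self, cid: str, requester: str) -> bytes:
        # Access check: only permitted readers receive the (still encrypted) bytes.
        if requester not in self.permitted_readers:
            raise PermissionError(f"{requester} may not read from {self.name}")
        return self.blobs[cid]


# Stand-in for ciphertext produced by client-side encryption before deposit.
ciphertext = bytes.fromhex("9f3c1ab47e0255d8") * 4

# Redundancy: the same encrypted blob is replicated to two nodes.
node_a = StorageNode("archive-a", {"clinic-1"})
node_b = StorageNode("archive-b", {"clinic-1"})
cid = node_a.store(ciphertext)
cid_b = node_b.store(ciphertext)

# Simulate silent corruption of one replica.
node_b.blobs[cid] = ciphertext[:-1] + b"\x00"

a_ok = node_a.verify(cid)
b_ok = node_b.verify(cid)

try:
    node_a.fetch(cid, "stranger")
    denied = False
except PermissionError:
    denied = True

print(cid == cid_b, a_ok, b_ok, denied)
```

Because the address is computed over the ciphertext, any node can detect a damaged replica and repair it from a healthy one without ever being able to read the data, which is why sensitive and open data can share the same replication layer.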
Resilient infrastructure delivers preservation: data survives long enough for compounding use to materialize. Openness delivers the reuse that realizes the value. The investment in infrastructure pays off in both cases; the open-data multiplier compounds the return where the data can be opened.