Tier 3 produces the AI-ready data substrate as a structural byproduct of preservation
The data properties required for defensible artificial-intelligence development — provenance, reproducibility, federation, verification — are the same architectural properties that produce preservation under Tier 3.
- Provenance (what data trained a model and where it came from) is the structural product of content addressing and signed deposit (M-0002, §2.1, §8.2).
- Reproducibility (re-run a training pipeline against the original corpus) requires the corpus to persist across the lifetime of any model trained on it — exactly the preservation property Tier 3 produces (§6.2, §9).
- Federation (train across institutional boundaries without consolidating sensitive data into a single trust domain) is the operational pattern §7.6 documents for permissioned BitTorrent, federated Matrix, and permissioned IPFS clusters. It is the same architecture that HIPAA-regulated, FERPA-regulated, and export-controlled research already requires.
- Verification (demonstrate to a regulator, court, or peer reviewer that training data was what the model card claims) is the architectural property C-0007 develops: a single cryptographic query produces evidence any third party can independently re-verify.
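The provenance and verification properties above can be illustrated with a minimal sketch. It assumes SHA-256 as the content-address hash and a canonical JSON manifest; neither choice is taken from M-0002 or C-0007, and every function name here is hypothetical. Signed deposit is left out: in a fuller version, the depositor's signature would cover the manifest digest.

```python
import hashlib
import json


def content_address(data: bytes) -> str:
    """Return a content address for a blob (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(corpus: dict[str, bytes]) -> dict[str, str]:
    """Map each item name to the content address of its bytes."""
    return {name: content_address(blob) for name, blob in corpus.items()}


def manifest_digest(manifest: dict[str, str]) -> str:
    """Hash a canonical JSON form so the digest does not depend on key order."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_corpus(corpus: dict[str, bytes], claimed_digest: str) -> bool:
    """Recompute the digest from the corpus and compare it to the claimed one."""
    return manifest_digest(build_manifest(corpus)) == claimed_digest


# Hypothetical training corpus and the digest a model card would record.
corpus = {"a.txt": b"alpha", "b.txt": b"beta"}
claimed = manifest_digest(build_manifest(corpus))

# Any third party holding the corpus can re-run the check independently.
print(verify_corpus(corpus, claimed))    # True

# Changing a single item changes its address and therefore the digest.
tampered = {**corpus, "b.txt": b"BETA"}
print(verify_corpus(tampered, claimed))  # False
```

The check needs only the corpus and the recorded digest, which is why the same preserved corpus serves reproducibility (the bytes persist) and verification (the bytes can be shown to match the claim).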
The institutions that operate Tier 3 preservation nodes for the reasons in §§2-9 hold the AI-ready data substrate as a byproduct. The infrastructure investment that hedges the §5 liability also captures the AI dimension. The deployment therefore pays off on two sides, preservation and AI readiness, rather than serving only as a one-sided hedge.
The competitive position is direct. Federal AI grant programs increasingly emphasize data-management capacity, reproducibility, and access governance in solicitation language (S-0128, S-0120). International competitors — EOSC, EuroHPC, China Science and Technology Cloud — are closing the gap on the same infrastructure axis (S-0130, S-0131, S-0132). Institutions without credible AI-data infrastructure in 2026 cede position they cannot recover by spending more later.