Article 19 Record-Keeping Obligations
Article 19 of the EU AI Act requires providers of high-risk AI systems to keep their technical documentation available to national competent authorities for ten years after the system is placed on the market or put into service. Training data documentation is explicitly within scope.
The Retention Obligation
"Providers of high-risk AI systems shall keep the technical documentation referred to in Article 11, the EU declaration of conformity, and, where applicable, the certificate issued by the notified body at the disposal of the national competent authorities for a period of ten years after the high-risk AI system has been placed on the market or put into service."
— EU AI Act, Article 19(1)
What Article 19 Requires You to Keep
- Training dataset documentation: Origin, composition, collection methodology, and pre-processing steps for every training dataset. For synthetic datasets: the generation algorithm, parameters, and a cryptographic fingerprint of the output. Satisfied by the CertifiedData certification artifact.
- Test and evaluation records: The procedures used to evaluate the AI system, including the datasets used and the metrics applied. Results of evaluation runs must be preserved.
- Conformity assessment records: For systems subject to third-party conformity assessment, the full assessment documentation, including the notified body certificate and any conditions or limitations.
- Technical documentation: Architecture, algorithm design, computational requirements, and system integration details as specified in Article 11 technical documentation.
- Post-market monitoring logs: Operational performance data and any incidents or anomalies identified during deployment, as required by Article 72.
- EU declaration of conformity: The formal declaration that the high-risk AI system meets EU AI Act requirements, signed by the authorized representative.
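The training dataset requirements listed above can be captured in a structured record. The following is a minimal sketch only: the class name `TrainingDatasetRecord`, its field names, and the helper `to_canonical_json` are hypothetical and do not reflect CertifiedData's actual schema or any format prescribed by the Act.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrainingDatasetRecord:
    """Hypothetical documentation record for one training dataset."""

    name: str
    origin: str
    composition: str
    collection_methodology: str
    preprocessing_steps: Tuple[str, ...]
    # The next two fields apply to synthetic datasets only.
    generation_algorithm: Optional[str] = None
    generation_parameters: dict = field(default_factory=dict)
    # Hex-encoded SHA-256 digest of the dataset bytes.
    sha256_fingerprint: Optional[str] = None


def to_canonical_json(record: TrainingDatasetRecord) -> str:
    """Serialize with sorted keys so the same record always yields the same text."""
    return json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
```

Serializing with sorted keys and fixed separators is a design choice made here so that a record can be hashed or signed reproducibly; any other stable encoding would serve the same purpose.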
Why Ten-Year Retention Requires Cryptographic Binding
Ten-year retention creates a specific technical challenge: the documentation must remain verifiable long after the original dataset may have been deleted, the generation infrastructure replaced, and the organizational structure changed. A textual description of what data was used is not sufficient. An auditor needs cryptographic evidence that the described data was actually the data used.
CertifiedData's certification artifacts solve this with a SHA-256 fingerprint embedded in an Ed25519-signed record. Ten years after issuance:
- The certificate is still retrievable from the public transparency log
- The Ed25519 signature is still verifiable using the published public key
- The SHA-256 hash can be recomputed against a retained copy of the dataset to prove it is the same data
- The generation algorithm and timestamp remain immutably recorded
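The hash recomputation step can be sketched with the Python standard library alone. This is an illustration under assumptions: the function names `sha256_of_file` and `matches_certificate` are hypothetical, and the recorded fingerprint is assumed to be a hex-encoded SHA-256 digest of the raw dataset file. Ed25519 signature verification is left out because it needs a third-party cryptography library.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_certificate(path, recorded_fingerprint: str) -> bool:
    """Check a retained dataset file against the fingerprint stored in a certificate.

    Assumes the certificate stores a hex digest; case and surrounding
    whitespace are normalized before comparison.
    """
    actual = sha256_of_file(path)
    expected = recorded_fingerprint.strip().lower()
    return hmac.compare_digest(actual, expected)
```

A call such as `matches_certificate("train.csv", fingerprint_from_certificate)` returns `True` only if the retained file is byte-for-byte identical to the file that was fingerprinted; both arguments in that call are placeholders.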
Retention Requirements by Record Type
| Record type | Minimum retention | CertifiedData approach |
|---|---|---|
| Training dataset documentation | 10 years | Indefinite — certificates stored permanently |
| Test/evaluation records | 10 years | Out of scope — managed by provider |
| Technical documentation (Art. 11) | 10 years | Out of scope — managed by provider |
| Conformity assessment records | 10 years post-market | Out of scope — managed by notified body |
| Post-market monitoring logs | Duration of deployment + 10 years | Decision lineage records (separate product) |
| Transparency log entries | Indefinite (Art. 12) | Append-only log — records never deleted |
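The retention periods in the table reduce to simple date arithmetic. The sketch below assumes the ten-year period runs from a single start date (the market placement date, or the end of deployment for monitoring logs) and that a start date of 29 February maps to 28 February in a non-leap end year. Both are simplifying assumptions, not legal guidance, and the function names are hypothetical.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Add whole calendar years to a date."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Assumption: 29 February falls back to 28 February in a non-leap year.
        return start.replace(year=start.year + years, day=28)


def retention_end(start: date, years: int = 10) -> date:
    """Last day on which the record must still be available."""
    return add_years(start, years)


def must_retain(start: date, today: date, years: int = 10) -> bool:
    """True while the retention period has not yet elapsed."""
    return today <= retention_end(start, years)
```

For the post-market monitoring row, `start` would be the date deployment ended rather than the date the system was placed on the market.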
Preparing for a Regulatory Audit
When a national competent authority requests documentation under Article 19, the expected deliverable is the full Article 11 technical documentation package. For training data, this includes:
All of these are available via the CertifiedData API and public endpoints, and regulators need no authentication to verify them independently.