# EU AI Act Compliance Guide
The EU AI Act creates binding compliance obligations for providers and deployers of high-risk AI systems. This guide covers the primary obligations affecting training data, audit logging, and record-keeping — and how synthetic dataset certification maps to each requirement.
## Applicability and Timeline
| Provision | Applies from | Who is affected |
|---|---|---|
| GPAI model obligations | 2 August 2025 | Providers of general-purpose AI models |
| High-risk AI system obligations | 2 August 2026 | Providers and deployers of Annex III systems |
| Article 10 — Training data | 2 August 2026 | Providers of high-risk AI systems |
| Article 12 — Logging | 2 August 2026 | Providers and deployers |
| Article 18 — Documentation keeping | 2 August 2026 | Providers (10-year retention) |
| Article 13 — Transparency | 2 August 2026 | Providers toward deployers |
## Article 10 — Training Data Obligations
Article 10 requires that training, validation, and testing datasets for high-risk AI systems be relevant, sufficiently representative, and, to the best extent possible, free of errors and complete. Providers must document the origin of each training dataset, the data collection methodology, and any pre-processing applied.
Synthetic datasets satisfy Article 10's provenance requirement when they are accompanied by a certification artifact documenting the generation algorithm, the synthetic sample size, and a cryptographic fingerprint of the output. This creates a verifiable record that the data was programmatically generated — not collected from real-world individuals.
- The certification artifact records the generation algorithm, issuance timestamp, and issuer identity.
- Row count, column schema, and statistical metadata are embedded in the certificate.
- The CTGAN configuration is captured in the `algorithm_spec` field.
- The CertifiedData public transparency log serves as an independent evidence artifact.
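To make the shape of such an artifact concrete, here is a minimal sketch in Python. The field names, the `build_certificate` helper, and the issuer value are illustrative assumptions, not the CertifiedData schema; the key idea is that a SHA-256 fingerprint of the exact dataset bytes binds the metadata to the data it describes.

```python
import hashlib
import json
from datetime import datetime, timezone

def build_certificate(dataset_bytes: bytes, algorithm_spec: dict,
                      row_count: int, schema: list) -> dict:
    """Sketch of a certification artifact: generation metadata bound to
    the dataset via a SHA-256 fingerprint of the exact bytes."""
    return {
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "algorithm_spec": algorithm_spec,   # e.g. the CTGAN configuration
        "row_count": row_count,
        "column_schema": schema,
        "issued_at": datetime.now(timezone.utc).isoformat(),
        "issuer": "example-certifier",      # hypothetical issuer identifier
    }

# Toy synthetic dataset serialized as CSV bytes.
data = b"age,income\n34,52000\n29,48000\n"
cert = build_certificate(data, {"model": "CTGAN", "epochs": 300},
                         row_count=2, schema=["age", "income"])
print(json.dumps(cert, indent=2))
```

Because the fingerprint covers the serialized bytes, any change to the dataset — even reordering rows — produces a different hash, so the certificate identifies one specific artifact.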
## Article 12 — Automatic Logging
Article 12 requires high-risk AI systems to automatically log events throughout their operational lifecycle. At the training data layer, this means retaining tamper-evident records of the dataset generation event — including the exact dataset bytes (via cryptographic fingerprint), the generation timestamp, and the authority that certified the artifact.
CertifiedData's Ed25519-signed certification artifacts are designed as Article 12 compliance records. Each artifact is appended to a public, append-only transparency log. The log is hash-chained so any retrospective modification of records would break the chain — making it tamper-evident by construction.
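The hash-chaining property can be illustrated with a minimal append-only log sketch (the class and field names are illustrative, not the CertifiedData implementation): each entry's hash commits to the previous entry's hash, so editing any historical record invalidates every later entry.

```python
import hashlib
import json

class TransparencyLog:
    """Minimal hash-chained append-only log: each entry commits to the
    hash of the previous entry, so retrospective edits break the chain."""
    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        payload = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev_hash": prev,
                             "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain from genesis; any mismatch means tampering."""
        prev = self.GENESIS
        for e in self.entries:
            payload = json.dumps(e["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev_hash"] != prev or e["entry_hash"] != expected:
                return False
            prev = e["entry_hash"]
        return True

log = TransparencyLog()
log.append({"cert_id": "c-001", "dataset_sha256": "ab" * 32})
log.append({"cert_id": "c-002", "dataset_sha256": "cd" * 32})
print(log.verify())  # True for an untampered log

# Any retrospective modification is detectable:
log.entries[0]["record"]["cert_id"] = "forged"
print(log.verify())  # False
```

A production log would additionally sign each entry (CertifiedData uses Ed25519 per the text above) so that the log operator, not just readers, can be held to the recorded history.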
## Article 18 — Documentation Keeping

Article 18 requires providers of high-risk AI systems to keep the technical documentation — including documentation of training datasets, technical specifications, and conformity assessment results — at the disposal of national competent authorities for ten years after the system is placed on the market or put into service.

CertifiedData stores all certification artifacts indefinitely. Because each certificate is cryptographically bound to the exact dataset bytes via a SHA-256 hash, retrieving the certificate ten years after issuance still provides verifiable evidence of which dataset was used and how it was generated — precisely the documentation Article 18 requires providers to keep available for audit.
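The retrieval-time check is a one-line hash comparison. This sketch (the function name and certificate layout are illustrative assumptions) shows how an auditor could confirm, years later, that a retained dataset is byte-for-byte the one that was certified:

```python
import hashlib

def verify_dataset_against_certificate(dataset_bytes: bytes,
                                       certificate: dict) -> bool:
    """Recompute the SHA-256 fingerprint of the retrieved dataset and
    compare it with the hash bound into the certificate at issuance."""
    return hashlib.sha256(dataset_bytes).hexdigest() == certificate["dataset_sha256"]

original = b"age,income\n34,52000\n"
cert = {"dataset_sha256": hashlib.sha256(original).hexdigest()}

print(verify_dataset_against_certificate(original, cert))         # True
print(verify_dataset_against_certificate(original + b"x", cert))  # False
```

The check requires no trust in the storage system: even if the archive were compromised, a substituted dataset could not reproduce the fingerprint recorded in the certificate.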
## Compliance Checklist
- Identify high-risk AI system classifications under Annex III
- ✅ Document training dataset origins for each AI system
- ✅ Implement tamper-evident logging per Article 12
- ✅ Establish a 10-year record retention policy per Article 18
- Register with the national competent authority if required
- Complete conformity assessment for Annex III systems
- Provide Article 13 transparency information to deployers

Items marked ✅ are obligations that CertifiedData certification directly satisfies.