CertifiedData.io
EU AI Act Compliance Guide

The EU AI Act creates binding compliance obligations for providers and deployers of high-risk AI systems. This guide covers the primary obligations affecting training data, audit logging, and record-keeping — and how synthetic dataset certification maps to each requirement.

Applicability and Timeline

Provision                          Applies from   Who is affected
GPAI model obligations             August 2025    Providers of general-purpose AI models
High-risk AI system obligations    August 2026    Providers and deployers of Annex III systems
Article 10 (Training data)         August 2026    Providers of high-risk AI systems
Article 12 (Logging)               August 2026    Providers and deployers
Article 19 (Record-keeping)        August 2026    Providers (10-year retention)
Article 13 (Transparency)          August 2026    Providers toward deployers
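
The timeline above can be expressed as a small lookup, which is handy when a compliance dashboard needs to show which obligations already apply on a given date. This is a minimal sketch: the table gives only month and year, so the day of the month (the 2nd) is an assumption, and the names APPLIES_FROM and provisions_in_force are illustrative.

```python
from datetime import date

# Application dates from the timeline table. The table states only the
# month and year; the day (2nd) is an assumption added for illustration.
APPLIES_FROM = {
    "GPAI model obligations": date(2025, 8, 2),
    "High-risk AI system obligations": date(2026, 8, 2),
    "Article 10 (Training data)": date(2026, 8, 2),
    "Article 12 (Logging)": date(2026, 8, 2),
    "Article 19 (Record-keeping)": date(2026, 8, 2),
    "Article 13 (Transparency)": date(2026, 8, 2),
}


def provisions_in_force(on: date) -> list[str]:
    """Return the provisions whose application date is on or before `on`."""
    return [name for name, start in APPLIES_FROM.items() if start <= on]


if __name__ == "__main__":
    for name in provisions_in_force(date.today()):
        print(name)
```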

Article 10 — Training Data Obligations

Article 10 requires that training, validation, and testing datasets for high-risk AI systems be relevant, sufficiently representative and, to the best extent possible, free of errors and complete. Its data governance provisions also cover the origin of each dataset, the data collection process, and any preparation or pre-processing applied, so providers should document all three.

Synthetic datasets satisfy Article 10's provenance requirement when they are accompanied by a certification artifact documenting the generation algorithm, the synthetic sample size, and a cryptographic fingerprint of the output. This creates a verifiable record that the data was programmatically generated — not collected from real-world individuals.

Dataset origin documentation: Certification artifact includes algorithm, timestamp, and issuer

Representativeness evidence: Row count, column schema, and statistical metadata in the certificate

Pre-processing records: CTGAN configuration captured in the algorithm_spec field

Data governance policy: CertifiedData public transparency log as evidence artifact
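
To make the mapping concrete, here is a minimal sketch of what assembling such an artifact might look like. It is not CertifiedData's actual schema or API: apart from algorithm_spec, which the text names, every field name and the function build_certification_artifact are assumptions chosen for illustration.

```python
import hashlib
from datetime import datetime, timezone


def build_certification_artifact(dataset_bytes, column_schema, row_count,
                                 algorithm_spec, issuer, timestamp=None):
    """Assemble a provenance record for a synthetic dataset.

    Field names other than `algorithm_spec` are hypothetical; they mirror
    the evidence items listed in the text, not a published schema.
    """
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).isoformat()
    return {
        # SHA-256 fingerprint binds the record to the exact dataset bytes.
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "algorithm": algorithm_spec.get("name"),
        "algorithm_spec": algorithm_spec,   # e.g. the CTGAN configuration
        "row_count": row_count,
        "column_schema": column_schema,
        "issuer": issuer,
        "timestamp": timestamp,
    }


if __name__ == "__main__":
    csv_bytes = b"age,income\n34,52000\n41,61000\n"
    artifact = build_certification_artifact(
        dataset_bytes=csv_bytes,
        column_schema={"age": "int", "income": "int"},
        row_count=2,
        algorithm_spec={"name": "CTGAN", "epochs": 300},  # illustrative values
        issuer="example-issuer",
    )
    print(artifact["dataset_sha256"])
```

Changing a single byte of the dataset changes the fingerprint, which is what lets the record serve as evidence of exactly which data was generated.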

Article 12 — Automatic Logging

Article 12 requires high-risk AI systems to automatically log events throughout their operational lifecycle. At the training data layer, this means retaining tamper-evident records of the dataset generation event — including the exact dataset bytes (via cryptographic fingerprint), the generation timestamp, and the authority that certified the artifact.

CertifiedData's Ed25519-signed certification artifacts are designed as Article 12 compliance records. Each artifact is appended to a public, append-only transparency log. The log is hash-chained so any retrospective modification of records would break the chain — making it tamper-evident by construction.
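
The hash-chaining idea can be illustrated with a short sketch using only the standard library. This is an assumption-laden toy, not CertifiedData's log format: the entry layout, the all-zero genesis value, and the function names are all hypothetical, and Ed25519 signing is left out because it needs a third-party library.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical record encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, linking it to the hash of the entry before it."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    log.append({
        "record": record,
        "prev_hash": prev,
        "entry_hash": entry_hash(prev, record),
    })


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited record or broken link returns False."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        if entry["entry_hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["entry_hash"]
    return True


if __name__ == "__main__":
    log = []
    append(log, {"dataset_sha256": "aa" * 32, "row_count": 1000})
    append(log, {"dataset_sha256": "bb" * 32, "row_count": 2500})
    print(verify_chain(log))          # chain as written
    log[0]["record"]["row_count"] = 9
    print(verify_chain(log))          # after a retrospective edit
```

A chain alone only detects edits by someone who does not recompute every later hash. Signing or publicly publishing the latest entry hash is what closes that gap, which is the role the signed artifacts and public log play in the description above.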

Article 19 — Record-Keeping

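Providers of high-risk AI systems must retain their technical documentation (which includes documentation of training datasets and technical specifications) and conformity assessment records for ten years after the system is placed on the market or put into service. In the final text of the Act this ten-year duty is set out in Article 18 (documentation keeping); Article 19 itself concerns automatically generated logs, which providers must keep for at least six months.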

CertifiedData stores all certification artifacts indefinitely. Because each certificate is cryptographically bound to the exact dataset bytes via SHA-256 hash, retrieving the certificate ten years after issuance still provides verifiable evidence of what dataset was used and how it was generated — precisely the evidence Article 19 requires regulators to be able to audit.
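
The verification step described above amounts to recomputing the SHA-256 hash of the stored dataset and comparing it with the value in the certificate. The sketch below shows that check alongside a simple retention-date calculation. The certificate field name dataset_sha256 and both function names are assumptions for illustration, not CertifiedData's API.

```python
import hashlib
from datetime import date


def matches_certificate(dataset_bytes: bytes, certificate: dict) -> bool:
    """Check that the dataset bytes hash to the fingerprint in the certificate.

    `dataset_sha256` is a hypothetical field name for the stored hex digest.
    """
    actual = hashlib.sha256(dataset_bytes).hexdigest()
    return actual == certificate["dataset_sha256"].lower()


def retention_end(placed_on_market: date, years: int = 10) -> date:
    """Earliest date on which the retention period has run its course."""
    try:
        return placed_on_market.replace(year=placed_on_market.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return placed_on_market.replace(year=placed_on_market.year + years, day=28)


if __name__ == "__main__":
    data = b"age,income\n34,52000\n"
    cert = {"dataset_sha256": hashlib.sha256(data).hexdigest()}
    print(matches_certificate(data, cert))
    print(retention_end(date(2026, 8, 2)))
```

Retention is measured here from the placing-on-the-market date, as the text states; a real policy would also need to account for systems that are put into service rather than sold.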

Compliance Checklist

Identify high-risk AI system classifications under Annex III

Document training dataset origins for each AI system

Implement tamper-evident logging per Article 12

Establish 10-year record retention policy per Article 19

Register with national competent authority if required

Complete conformity assessment for Annex III systems

Provide Article 13 transparency information to deployers

Items marked in green indicate where CertifiedData certification directly satisfies the obligation.