CertifiedData.io

AI System Logging Requirements

Several regulatory and standards frameworks, including the EU AI Act, the NIST AI RMF, and ISO/IEC 42001, call for AI systems to produce verifiable audit logs. This page covers the technical and compliance requirements for AI system logging, with a focus on training data provenance.

Why AI Logging Is Not Optional

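The EU AI Act creates binding logging obligations. Its obligations for general-purpose AI models apply from August 2025, and most obligations for high-risk AI systems apply from August 2026. Article 12 requires high-risk AI systems to support automatic recording of operational events over their lifetime, and Article 19 requires providers to keep those automatically generated logs for at least six months. Article 18 requires providers to keep the technical documentation, which covers training datasets, test procedures, and performance evaluations, for ten years after the system is placed on the market or put into service.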

Beyond the EU, the NIST AI RMF (a voluntary framework) and ISO/IEC 42001 (a certifiable management-system standard) both call on organizations to document the provenance of AI training data and to retain evidence that supports conformity assessments. Logging is therefore a cross-jurisdictional expectation, not a European-only concern.

Required Log Properties

Tamper-evident

Log entries must be cryptographically bound so that any modification is detectable. Hash-chaining or Merkle tree structures satisfy this requirement.

Complete

Logs must capture every relevant event — not just errors. For training data, this includes the generation event, dataset fingerprint, and certification issuance.

Timestamped

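Each log entry must carry a verifiable timestamp. One way to achieve this is an Ed25519-signed certificate carrying an ISO 8601 timestamp: the signature binds the timestamp to the record, so it cannot be changed without detection. The regulations themselves do not prescribe a particular signature scheme.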

Attributable

The issuing authority must be identified, and its public verification key must be accessible so that third parties can check signatures independently.

Retention-compliant

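EU AI Act Article 18 requires providers to keep high-risk AI system documentation for ten years after the system is placed on the market or put into service. Article 19 separately requires automatically generated logs to be kept for at least six months.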

Independently verifiable

Auditors and regulators must be able to verify log integrity without relying on the AI provider's cooperation.
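
The tamper-evident, timestamped, and independently verifiable properties above can be illustrated with a small hash-chained log. This is a minimal sketch, not CertifiedData's implementation: the entry fields (index, timestamp, event, details, prev_hash, hash) and the function names are assumptions chosen for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(body: dict) -> str:
    """Hash a canonical JSON encoding of the entry body (everything except 'hash')."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log: list, event: str, details: dict,
                 timestamp: Optional[str] = None) -> dict:
    """Append an entry whose hash covers its own content and the previous entry's hash."""
    body = {
        "index": len(log),
        "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
        "event": event,
        "details": details,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = dict(body, hash=entry_hash(body))
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = GENESIS_HASH
    for i, entry in enumerate(log):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["index"] != i or body["prev_hash"] != prev:
            return False
        if entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, "dataset_generated", {"rows": 1000})
append_entry(log, "certificate_issued", {"certificate_id": "example-001"})
chain_ok = verify_chain(log)

# Editing an earlier entry invalidates its stored hash, so verification fails.
tampered = [dict(e) for e in log]
tampered[0]["details"] = {"rows": 999}
tamper_detected = not verify_chain(tampered)
```

A chain like this only detects edits made by someone who does not recompute every later hash. To hold up against the log operator as well, the latest hash has to be signed or published somewhere the operator cannot alter.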

Logging Synthetic Training Data

Synthetic training datasets require specific logging provisions. Because synthetic data is generated programmatically — not collected from real-world sources — the log must capture the generation algorithm, the configuration parameters, the synthetic sample size, and a cryptographic fingerprint of the resulting dataset.
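
As a rough illustration of the four items listed above, the sketch below builds a log record for a synthetic dataset. The record layout, the function names, and the parameter values are hypothetical; only the use of a SHA-256 fingerprint and the CTGAN example come from this page.

```python
import hashlib
import json
import os
import tempfile


def fingerprint_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_generation_record(path: str, algorithm: str, parameters: dict,
                            sample_count: int, generated_at: str) -> dict:
    """Collect the generation details and dataset fingerprint in one record.

    The field names are illustrative, not a published schema.
    """
    return {
        "event": "synthetic_dataset_generated",
        "algorithm": algorithm,
        "parameters": parameters,
        "sample_count": sample_count,
        "generated_at": generated_at,
        "dataset_sha256": fingerprint_file(path),
    }


# A small stand-in file; a real dataset would be the generator's output.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"id,age\n1,34\n2,51\n")
    dataset_path = tmp.name

record = build_generation_record(
    dataset_path,
    algorithm="CTGAN",
    parameters={"epochs": 300, "batch_size": 500, "seed": 42},  # made-up values
    sample_count=2,
    generated_at="2025-01-01T00:00:00+00:00",
)
os.remove(dataset_path)
print(json.dumps(record, indent=2, sort_keys=True))
```

Hashing the file's bytes means any later change to the dataset, even a reordered row, produces a different fingerprint.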

CertifiedData issues certification artifacts that serve as logging records for synthetic datasets. Each artifact contains the SHA-256 hash of the dataset, the generation algorithm identifier (e.g. CTGAN), the timestamp, and an Ed25519 signature from the CertifiedData certificate authority. The artifact is intended as supporting evidence for the record-keeping and documentation obligations described above.
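
An auditor holding both a dataset and its artifact could check the hash along the following lines. This is a sketch under assumptions: the artifact field names are hypothetical, and the Ed25519 signature check is left out because Python's standard library has no Ed25519 API and a third-party library would be needed.

```python
import hashlib
import hmac

# Hypothetical field names for the four items the artifact is said to contain.
REQUIRED_FIELDS = ("dataset_sha256", "algorithm", "timestamp", "signature")


def check_artifact(artifact: dict, dataset_bytes: bytes) -> list:
    """Return a list of problems; an empty list means the hash check passed.

    The signature is NOT verified here.
    """
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in artifact]
    if "dataset_sha256" in artifact:
        actual = hashlib.sha256(dataset_bytes).hexdigest()
        claimed = str(artifact["dataset_sha256"]).lower()
        if not hmac.compare_digest(actual, claimed):
            problems.append("dataset hash does not match artifact")
    return problems


data = b"id,age\n1,34\n2,51\n"
artifact = {
    "dataset_sha256": hashlib.sha256(data).hexdigest(),
    "algorithm": "CTGAN",
    "timestamp": "2025-01-01T00:00:00+00:00",
    "signature": "<signature bytes, not checked in this sketch>",
}

matching = check_artifact(artifact, data)               # unmodified dataset
modified = check_artifact(artifact, data + b"3,40\n")   # one row appended
```

A matching hash shows only that the dataset is the one the artifact describes. Confirming that the artifact came from the issuing authority still requires verifying the signature against the published key.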

Logging Architecture Comparison

| Requirement | Status | CertifiedData implementation |
|---|---|---|
| Append-only log | Required | CertifiedData's public transparency log is append-only and hash-chained |
| Ed25519 signatures | Recommended | CertifiedData uses Ed25519 for all certificate issuances |
| Public verification | Required (EU AI Act) | CertifiedData verification requires no authentication |
| SHA-256 fingerprinting | Required | Every certified dataset is SHA-256 fingerprinted at generation time |
| Key rotation policy | Recommended | Signing key history preserved at /.well-known/signing-keys.json |
| 10-year retention | Required (Art. 18) | Certificates stored indefinitely on the CertifiedData platform |
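
When signing keys are rotated, a verifier needs to pick the key that was active when a certificate was issued. The sketch below shows one way to do that lookup. The structure of the key history (key_id, valid_from, valid_until) is an assumption made for illustration; the actual format of /.well-known/signing-keys.json is not described on this page.

```python
from datetime import datetime
from typing import Optional


def key_for_timestamp(key_history: list, issued_at: str) -> Optional[dict]:
    """Return the key entry whose validity window contains issued_at, or None.

    Timestamps are ISO 8601 strings with an explicit UTC offset.
    A missing or null valid_until means the key is still active.
    """
    when = datetime.fromisoformat(issued_at)
    for key in key_history:
        start = datetime.fromisoformat(key["valid_from"])
        end = (datetime.fromisoformat(key["valid_until"])
               if key.get("valid_until") else None)
        if start <= when and (end is None or when < end):
            return key
    return None


# Hypothetical key history with one retired key and one active key.
key_history = [
    {"key_id": "key-2023", "valid_from": "2023-01-01T00:00:00+00:00",
     "valid_until": "2024-06-01T00:00:00+00:00"},
    {"key_id": "key-2024", "valid_from": "2024-06-01T00:00:00+00:00",
     "valid_until": None},
]

old_cert_key = key_for_timestamp(key_history, "2024-03-01T12:00:00+00:00")
new_cert_key = key_for_timestamp(key_history, "2025-02-01T09:30:00+00:00")
too_early = key_for_timestamp(key_history, "2022-12-31T23:59:59+00:00")
```

Keeping retired keys in the history matters for long retention periods: a certificate issued years ago can only be checked if the key that signed it is still published.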