Synthetic Data Security and Certification FAQ

Definition

Security and certification FAQ:

The security and certification FAQ explains how fingerprinting, signatures, public key distribution, and verification work together on CertifiedData. It serves as a quick reference for common trust and compliance questions about certified AI artifacts.

Definition source: https://certifieddata.io/api/definitions/security-and-certification-faq

Common questions about CertifiedData's cryptographic certification infrastructure, security model, and how synthetic data certification works in practice.

Certification and cryptography

What cryptographic algorithm does CertifiedData use?

CertifiedData signs certificates using Ed25519, a modern elliptic-curve digital signature algorithm. Ed25519 signatures are small (64 bytes), fast to verify, and widely supported by standard cryptography libraries.
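
As a rough illustration of these properties, the sketch below uses the Python cryptography package to sign and verify a sample message. The key pair is generated locally as a stand-in (CertifiedData's own keys are not involved), and the message and the helper name is_valid are made up for the example.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Stand-in key pair generated locally for illustration only.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

message = b"example certificate payload"  # hypothetical payload
signature = private_key.sign(message)

# Raw encodings: an Ed25519 signature is 64 bytes, a public key is 32 bytes.
raw_public_key = public_key.public_bytes(Encoding.Raw, PublicFormat.Raw)
print(len(signature), len(raw_public_key))  # 64 32


def is_valid(sig: bytes, msg: bytes) -> bool:
    """Return True if sig is a valid signature over msg for public_key."""
    try:
        public_key.verify(sig, msg)  # raises InvalidSignature on mismatch
        return True
    except InvalidSignature:
        return False
```

Verification needs only the public key, so is_valid could run on any machine that has the 32-byte key and the signed bytes.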

What is SHA-256 used for?

SHA-256 is used to compute the dataset fingerprint. The fingerprint identifies the exact bytes of the dataset: changing even a single byte produces a different hash, so the modified file no longer matches the certificate.
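
A minimal sketch of computing such a fingerprint locally with Python's standard library is shown below. The function name fingerprint_file and the chunk size are illustrative choices, and the assumption that the fingerprint is the lowercase hex digest of the raw file bytes is not confirmed by this FAQ.

```python
import hashlib


def fingerprint_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat for large datasets.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is always 64 hexadecimal characters (256 bits), regardless of the size of the dataset.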

Where is the private signing key stored?

CertifiedData's Ed25519 private keys are stored in a hardware-backed key management system. Keys are never exposed to application code or stored in plaintext.

Can a certificate be forged?

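Not in any practical sense. A certificate verifies only if its signature was produced with CertifiedData's private key, and forging an Ed25519 signature without that key is computationally infeasible. Verification uses the public key, so anyone can check that the signature is genuine without having access to the private key.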

What is RFC 8785 canonicalization?

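RFC 8785 (JSON Canonicalization Scheme, JCS) defines a single deterministic serialization for a JSON value: object properties are sorted, insignificant whitespace is removed, and numbers and strings are written in a fixed form. The certificate payload is canonicalized before signing, and a verifier canonicalizes it the same way before checking the signature. As a result, the signature remains valid regardless of how the certificate JSON was formatted or re-serialized along the way.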
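
The idea can be sketched with Python's standard json module. This is only an approximation of RFC 8785: it matches the scheme for payloads limited to ASCII keys, strings, booleans, null, and small integers, but it does not implement the RFC's rules for floating-point numbers or UTF-16 key ordering. The function name and the payload fields are hypothetical.

```python
import json


def canonicalize(payload: dict) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8 output.

    Not a full RFC 8785 implementation (floats and non-ASCII keys may differ).
    """
    text = json.dumps(
        payload,
        sort_keys=True,          # deterministic property order
        separators=(",", ":"),   # no spaces after commas or colons
        ensure_ascii=False,      # emit non-ASCII characters as UTF-8, not \u escapes
    )
    return text.encode("utf-8")


# Two differently formatted serializations of the same (hypothetical) payload.
pretty = '{\n  "dataset_sha256": "ab12",\n  "certificate_id": "example-001"\n}'
compact = '{"certificate_id":"example-001","dataset_sha256":"ab12"}'

# Both canonicalize to identical bytes, so a signature over them stays valid.
print(canonicalize(json.loads(pretty)) == canonicalize(json.loads(compact)))  # True
```

For production verification, a dedicated RFC 8785 implementation is the safer choice.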

Data security and privacy

Does my dataset get uploaded to CertifiedData?

No. Only the SHA-256 fingerprint of the dataset is submitted for certification — the dataset file itself never leaves your infrastructure. CertifiedData signs the fingerprint, not the raw data.

Can CertifiedData reconstruct my dataset from the fingerprint?

No. A SHA-256 fingerprint is a fixed-length, 256-bit one-way hash: it cannot be reversed to recover the original data.

Is certification compatible with GDPR requirements?

Yes. Because the dataset is never uploaded, the certification process involves no transfer of personal data to CertifiedData, even if the dataset being fingerprinted contains sensitive information.

What happens if I delete my dataset after certification?

The certification record remains valid and permanently verifiable. The certificate proves the dataset's provenance and fingerprint — it does not depend on the dataset being accessible after issuance.

Verification and trust

How do I verify a certificate?

Download the dataset, compute its SHA-256 hash, and compare it with the hash recorded in the certificate. Then verify the certificate's Ed25519 signature using CertifiedData's public key (available at certifieddata.io/.well-known/signing-keys.json). The verification endpoint at certifieddata.io/verify automates this process.
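
The two checks can be sketched in Python with the cryptography package. Everything about the certificate layout here is an assumption made for illustration: the payload and signature field names, the dataset_sha256 field, the hex signature encoding, and the simplified canonicalization are not taken from CertifiedData's documentation, and the demo signs with a locally generated stand-in key rather than a key fetched from signing-keys.json.

```python
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat


def canonical_bytes(payload: dict) -> bytes:
    # Simplified stand-in for RFC 8785 canonicalization (ASCII keys, no floats).
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verify_certificate(dataset: bytes, certificate: dict, public_key_raw: bytes) -> bool:
    """Check the dataset hash, then the Ed25519 signature over the payload."""
    payload = certificate["payload"]  # hypothetical field name

    # Step 1: the recomputed fingerprint must match the certified one.
    if hashlib.sha256(dataset).hexdigest() != payload["dataset_sha256"].lower():
        return False

    # Step 2: the signature must verify against the signer's public key.
    public_key = Ed25519PublicKey.from_public_bytes(public_key_raw)
    try:
        public_key.verify(bytes.fromhex(certificate["signature"]), canonical_bytes(payload))
    except InvalidSignature:
        return False
    return True


# Demo: issue a certificate locally with a stand-in key, then verify it.
signing_key = Ed25519PrivateKey.generate()
public_key_raw = signing_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

dataset = b"id,value\n1,0.5\n2,0.7\n"
payload = {"certificate_id": "example-001", "dataset_sha256": hashlib.sha256(dataset).hexdigest()}
certificate = {"payload": payload, "signature": signing_key.sign(canonical_bytes(payload)).hex()}

print(verify_certificate(dataset, certificate, public_key_raw))           # True
print(verify_certificate(dataset + b"\n", certificate, public_key_raw))   # False
```

Checking the hash first is a design choice: it is cheap and fails fast on a modified dataset before any signature work is done.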

Can I verify without using CertifiedData's platform?

Yes. The public key is published at the well-known endpoint. Any standard cryptography library (Python cryptography, Node.js crypto, OpenSSL) can verify an Ed25519 signature independently.

What does 'tamper-evident' mean in practice?

If the dataset is modified in any way after certification — even changing whitespace or column order — the SHA-256 fingerprint changes. The certificate's hash will no longer match, and verification will fail. The certificate does not prevent modification; it makes modification detectable.
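
A small illustration using Python's standard library is shown below. The CSV contents are made up, and the fingerprint is assumed to be the SHA-256 hex digest of the raw file bytes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Hypothetical dataset contents at certification time.
original = b"id,name\n1,Ada\n2,Grace\n"
certified_fingerprint = sha256_hex(original)

# Same records, but with columns swapped or a stray space added.
reordered_columns = b"name,id\nAda,1\nGrace,2\n"
extra_whitespace = b"id,name \n1,Ada\n2,Grace\n"

for label, data in [
    ("original", original),
    ("reordered columns", reordered_columns),
    ("extra whitespace", extra_whitespace),
]:
    matches = sha256_hex(data) == certified_fingerprint
    print(f"{label}: {'matches' if matches else 'does NOT match'} certificate")
```

Only the byte-for-byte original matches; both edited variants are flagged even though they hold the same records.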

Can a certificate be transferred to a different dataset?

No. A certificate is tied to the specific dataset fingerprint recorded at issuance. It cannot be reused for a different dataset or a modified version of the original.

Compliance and governance

Does certification satisfy EU AI Act requirements?

CertifiedData certification supports EU AI Act Article 10 data governance documentation requirements. Certification records provide machine-verifiable proof of synthetic origin, generation method, and dataset integrity — all relevant to technical documentation obligations for high-risk AI systems.

How long are certificates retained?

CertifiedData provides a 7-year certificate retention guarantee. Certificate IDs are permanent — once issued, a certificate record is never deleted.

Can I use certification records in regulatory submissions?

Yes. Certification records can be referenced by certificate ID in regulatory submissions, audit packages, model cards, and data governance documentation. Regulators can verify the certificate independently using the public endpoint.

Is there an API for programmatic certification?

Yes. CertifiedData provides a REST API for certification workflows. Certificates can be issued, retrieved, and verified programmatically — enabling integration into CI/CD pipelines and data governance automation.
