CertifiedData.io
Platform

How CertifiedData works

A complete pipeline from synthetic generation to cryptographic certification. Every step is logged, signed, and verifiable.

CTGAN Generation Engine

Conditional Tabular GAN trained on your schema. Produces statistically realistic synthetic records with no real individuals in the output.

Also supports statistical noise synthesis and light sampling engines depending on your plan.

SHA-256 Dataset Fingerprinting

Every exported dataset is hashed with SHA-256 before signing. The certificate binds the hash, not a file, so verification works regardless of format (CSV, JSON, Parquet).

The RFC 8785 JSON Canonicalization Scheme (JCS) is used for deterministic hashing of structured outputs.
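To illustrate why canonicalization matters for fingerprinting, here is a minimal Python sketch. It is not the CertifiedData implementation: the names canonical_json and fingerprint are hypothetical, and json.dumps with sorted keys only approximates RFC 8785 (full JCS also mandates ECMAScript number formatting and sorts keys by UTF-16 code units), so it agrees with JCS only for simple values such as integers and BMP strings.

```python
import hashlib
import json


def canonical_json(value) -> bytes:
    """Serialize to a deterministic byte string (an approximation of RFC 8785)."""
    return json.dumps(
        value,
        sort_keys=True,          # key order no longer depends on the producer
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit UTF-8 rather than \u escapes
        allow_nan=False,         # NaN and Infinity are not valid JSON
    ).encode("utf-8")


def fingerprint(records) -> str:
    """Hex SHA-256 digest of the canonical form (hypothetical helper)."""
    return hashlib.sha256(canonical_json(records)).hexdigest()


# Two exports of the same records that differ only in key order.
export_a = [{"age": 41, "city": "Oslo"}, {"age": 29, "city": "Lima"}]
export_b = [{"city": "Oslo", "age": 41}, {"city": "Lima", "age": 29}]
print(fingerprint(export_a) == fingerprint(export_b))
```

Because both exports canonicalize to the same bytes, they share one fingerprint. Hashing the raw, uncanonicalized serialization would give two different digests.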

Ed25519 Certificate Signing

Certificates are signed with Ed25519 digital signatures. The public key is published in the CertifiedData registry, so anyone can verify a certificate without contacting us.

Private keys are stored in the certificate_signing_keys table and are never exported.
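For readers who want to see what Ed25519 signing and verification involve, the following is a pure-Python teaching sketch that follows the reference algorithm in RFC 8032. It is an illustration under stated assumptions, not CertifiedData's signer: it is not constant-time, the function names are hypothetical, and the choice to sign the hex SHA-256 digest as the message is an assumption about the certificate format. Production code should use a vetted cryptography library.

```python
import hashlib

P = 2**255 - 19                                      # field prime
Q = 2**252 + 27742317777372353535851937790883648493  # group order
D = -121665 * pow(121666, P - 2, P) % P              # curve constant d
SQRT_M1 = pow(2, (P - 1) // 4, P)                    # square root of -1 mod P


def _h_mod_q(data):
    return int.from_bytes(hashlib.sha512(data).digest(), "little") % Q


def _add(a, b):
    # Unified addition in extended coordinates (X, Y, Z, T).
    A = (a[1] - a[0]) * (b[1] - b[0]) % P
    B = (a[1] + a[0]) * (b[1] + b[0]) % P
    C = 2 * a[3] * b[3] * D % P
    Dd = 2 * a[2] * b[2] % P
    E, F, G, H = B - A, Dd - C, Dd + C, B + A
    return (E * F % P, G * H % P, F * G % P, E * H % P)


def _mul(s, pt):
    acc = (0, 1, 1, 0)  # neutral element
    while s > 0:
        if s & 1:
            acc = _add(acc, pt)
        pt = _add(pt, pt)
        s >>= 1
    return acc


def _recover_x(y, sign):
    if y >= P:
        return None
    x2 = (y * y - 1) * pow(D * y * y + 1, P - 2, P) % P
    if x2 == 0:
        return None if sign else 0
    x = pow(x2, (P + 3) // 8, P)
    if (x * x - x2) % P != 0:
        x = x * SQRT_M1 % P
    if (x * x - x2) % P != 0:
        return None  # y is not on the curve
    return P - x if (x & 1) != sign else x


_GY = 4 * pow(5, P - 2, P) % P
_GX = _recover_x(_GY, 0)
BASE = (_GX, _GY, 1, _GX * _GY % P)  # standard base point


def _compress(pt):
    zinv = pow(pt[2], P - 2, P)
    x, y = pt[0] * zinv % P, pt[1] * zinv % P
    return (y | ((x & 1) << 255)).to_bytes(32, "little")


def _decompress(b):
    y = int.from_bytes(b, "little")
    sign, y = y >> 255, y & ((1 << 255) - 1)
    x = _recover_x(y, sign)
    return None if x is None else (x, y, 1, x * y % P)


def _expand(seed):
    h = hashlib.sha512(seed).digest()
    a = int.from_bytes(h[:32], "little")
    return (a & ((1 << 254) - 8)) | (1 << 254), h[32:]  # clamped scalar, prefix


def public_key(seed):
    return _compress(_mul(_expand(seed)[0], BASE))


def sign(seed, msg):
    a, prefix = _expand(seed)
    A = _compress(_mul(a, BASE))
    r = _h_mod_q(prefix + msg)
    R = _compress(_mul(r, BASE))
    s = (r + _h_mod_q(R + A + msg) * a) % Q
    return R + s.to_bytes(32, "little")


def verify(public, msg, sig):
    if len(public) != 32 or len(sig) != 64:
        return False
    A, R = _decompress(public), _decompress(sig[:32])
    s = int.from_bytes(sig[32:], "little")
    if A is None or R is None or s >= Q:
        return False
    lhs = _mul(s, BASE)
    rhs = _add(R, _mul(_h_mod_q(sig[:32] + public + msg), A))
    # Compare projective points: x1 * z2 == x2 * z1, same for y.
    return ((lhs[0] * rhs[2] - rhs[0] * lhs[2]) % P == 0
            and (lhs[1] * rhs[2] - rhs[1] * lhs[2]) % P == 0)


seed = bytes(range(32))  # demo seed only; real keys come from a secure RNG
digest = hashlib.sha256(b"id,age\n1,41\n").hexdigest().encode()
signature = sign(seed, digest)
print(verify(public_key(seed), digest, signature))
```

Verification needs only the 32-byte public key, the message, and the 64-byte signature, which is why a published public key is enough for third parties to check a certificate.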

Tamper-Evident Public Log

Every generation and certification event is appended to a hash-chained public decision log, so any gap or modification is detectable.

Log checkpoints are computed and stored for integrity auditing.
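A hash chain can be sketched in a few lines of Python. This is a generic illustration, not the CertifiedData log format: the entry fields (prev, payload, hash), the all-zero genesis value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(log, payload):
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify_chain(log):
    """Return True only if every entry links to its predecessor and hashes correctly."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"event": "generation", "dataset": "demo-1"})
append(log, {"event": "certification", "dataset": "demo-1"})
append(log, {"event": "generation", "dataset": "demo-2"})
print(verify_chain(log))
```

Editing or removing an entry in the middle breaks the link to every later entry. Dropping entries from the tail leaves a chain that is still internally consistent, which is why a separately stored checkpoint (the hash of the latest entry at a known point) is needed to detect truncation.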

Verification API

POST /api/verify with your dataset and certificate ID. The endpoint returns a cryptographic verdict; no account is required.

The public verifier is rate-limited but unauthenticated. Python, Go, and Rust verifiers are available.
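As a rough sketch of what a client call might look like, the Python below builds (but does not send) a POST request to /api/verify. Only the path and method come from the text above. The base URL, the JSON field names certificate_id and dataset_sha256, and the idea of submitting a digest rather than the dataset itself are all assumptions; consult the API documentation for the real request shape.

```python
import json
import urllib.request

BASE_URL = "https://certifieddata.io"  # assumption: host taken from the site name


def build_verify_request(certificate_id, dataset_sha256):
    """Build an unauthenticated POST to /api/verify (hypothetical payload shape)."""
    body = json.dumps({
        "certificate_id": certificate_id,   # hypothetical field name
        "dataset_sha256": dataset_sha256,   # hypothetical field name
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/verify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_verify_request("cert_example", "ab" * 32)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=10)
```

No API key or session header is attached, matching the unauthenticated public verifier. A client should still be prepared for rate-limit responses.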

Artifact Registry

All certified artifacts are stored in the registry with full provenance: generation parameters, engine version, schema, and lineage.

GET /api/registry returns the live registry. A stable alias is available at .well-known/certifieddata-registry.json.