How CertifiedData works
A complete pipeline from synthetic generation to cryptographic certification. Every step is logged, signed, and verifiable.
CTGAN Generation Engine
Conditional Tabular GAN trained on your schema. Produces statistically realistic synthetic records with no real individuals in the output.
Also supports statistical noise synthesis and light sampling engines depending on your plan.
SHA-256 Dataset Fingerprinting
Every exported dataset is hashed before signing. The certificate binds the hash, not a file — so verification works regardless of format (CSV, JSON, Parquet).
The RFC 8785 JSON Canonicalization Scheme (JCS) is used for deterministic hashing of structured outputs.
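As an illustration of the fingerprinting step described above, here is a minimal Python sketch that hashes a canonical JSON encoding of a dataset. It is not the CertifiedData implementation. The function names are hypothetical, and the canonicalization is only an approximation of RFC 8785: sorted keys and compact separators match JCS for plain ASCII keys, strings, and integers, but full JCS also prescribes ECMAScript number formatting and UTF-16 key ordering, which this sketch does not reproduce.

```python
import hashlib
import json


def canonical_json(data) -> bytes:
    """Approximate a canonical JSON encoding (NOT a full RFC 8785 implementation).

    Sorted keys and compact separators give a deterministic byte string for
    simple values; floats and non-ASCII keys may differ from real JCS output.
    """
    return json.dumps(
        data,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,  # NaN and Infinity are not valid JSON
    ).encode("utf-8")


def dataset_fingerprint(records) -> str:
    """Return the SHA-256 hex digest of the canonical encoding (hypothetical helper)."""
    return hashlib.sha256(canonical_json(records)).hexdigest()


# Two logically identical datasets whose keys were written in different order.
export_a = [{"id": 1, "name": "alpha"}]
export_b = [{"name": "alpha", "id": 1}]

# Both produce the same fingerprint because hashing happens after canonicalization.
same = dataset_fingerprint(export_a) == dataset_fingerprint(export_b)
```

Hashing the canonical form rather than the raw file bytes is what lets the same records yield the same fingerprint whatever their key order or whitespace.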
Ed25519 Certificate Signing
Certificates are signed with Ed25519 digital signatures. The public key is published in the CertifiedData registry — anyone can verify without contacting us.
Private keys are stored in the certificate_signing_keys table and are never exported.
Tamper-Evident Public Log
Every generation and certification event is appended to a hash-chained public decision log, so any gap or modification is detectable.
Log checkpoints are computed and stored for integrity auditing.
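The hash-chaining idea can be sketched in a few lines of Python. This is a generic illustration, not the CertifiedData log format. The entry layout (prev, payload, hash), the all-zero genesis value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's "prev" link


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append_entry(log: list, payload: dict) -> None:
    """Append a new entry linked to the current head of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify_chain(log: list) -> bool:
    """Recompute every link; a changed payload or a missing entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, {"event": "generation", "dataset": "ds-1"})
append_entry(log, {"event": "certification", "dataset": "ds-1"})
append_entry(log, {"event": "generation", "dataset": "ds-2"})
intact = verify_chain(log)

# Dropping the middle entry leaves a gap that the "prev" links expose.
gapped = [log[0], log[2]]
gap_detected = not verify_chain(gapped)
```

A chain on its own cannot reveal that entries were removed from the end of the log. Comparing the latest hash against an independently stored checkpoint is one common way to catch that kind of truncation.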
Verification API
POST /api/verify with your dataset and certificate ID. Returns a cryptographic verdict — no account required.
The public verifier is rate-limited but unauthenticated. Python, Go, and Rust verifiers are available.
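A call to the verification endpoint might be assembled as in the Python sketch below. Only the method and path (POST /api/verify) come from the description above. The base URL, the JSON body, and the field names certificate_id and dataset are assumptions for illustration, so check the API reference for the actual request format. The sketch builds the request without sending it.

```python
import json
import urllib.request


def build_verify_request(base_url: str, certificate_id: str, records: list) -> urllib.request.Request:
    """Build (but do not send) a POST /api/verify request.

    ASSUMPTION: the body is JSON with "certificate_id" and "dataset" fields.
    These names are hypothetical and not taken from the CertifiedData docs.
    """
    body = json.dumps({"certificate_id": certificate_id, "dataset": records}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/verify",
        data=body,
        headers={"Content-Type": "application/json"},  # no auth header: no account required
        method="POST",
    )


# Placeholder host and certificate ID; replace with real values before sending.
req = build_verify_request("https://example.invalid", "cert-123", [{"id": 1}])

# To send the request and read the verdict (response shape is not specified here):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     verdict = json.load(resp)
```

Because the public verifier is rate-limited, a real client would probably want to handle HTTP error responses and retry with a delay.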
Artifact Registry
All certified artifacts are stored in the registry with full provenance: generation parameters, engine version, schema, and lineage.
GET /api/registry returns the live registry. .well-known/certifieddata-registry.json is a stable alias.