CertifiedData.io
Anonymous sandbox · sandbox.cert.v1 · 7-day TTL

Generate and certify synthetic datasets

Five input modes. Five generation engines. Five certificate levels. Every output is SHA-256 fingerprinted and Ed25519 signed — independently verifiable without contacting CertifiedData.
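
A minimal sketch of the fingerprinting half of that claim, assuming the certificate records the SHA-256 digest of the dataset's raw bytes as a hex string (the page does not say how the bytes are canonicalized). The function names are illustrative and not part of any CertifiedData SDK. Checking the Ed25519 signature would also need the issuer's public key and a signature library, neither of which is specified here, so that step is left out.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_fingerprint(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(stream: BinaryIO, expected_hex: str) -> bool:
    """Compare a locally computed digest with the one taken from a certificate.

    Assumption: the certificate exposes the digest as a hex string.
    """
    return hmac.compare_digest(sha256_fingerprint(stream), expected_hex.strip().lower())


# Usage with in-memory bytes; for a real dataset, pass open(path, "rb") instead.
sample = io.BytesIO(b"abc")
print(sha256_fingerprint(sample))
```

Because the digest is computed locally, this check needs no network call, which is the point of an independently verifiable fingerprint.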

Live execution — no account required

Sandbox endpoint: POST /api/synth/public/generate
10 anonymous runs / 24h per IP · sandbox.cert.v1 issued · 7-day TTL

Sandbox certificates carry a scert_ prefix. Once you have a free account, you can claim a sandbox certificate and upgrade it to a persistent cert.v1.
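
The prefix and TTL rules can be sketched as a small client-side helper. It assumes the 7-day TTL is counted from the certificate's issuance time and that any ID without the scert_ prefix is a persistent certificate; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SANDBOX_PREFIX = "scert_"        # prefix stated on this page
SANDBOX_TTL = timedelta(days=7)  # 7-day TTL stated on this page


def is_sandbox_cert(cert_id: str) -> bool:
    """True when the certificate ID carries the sandbox prefix."""
    return cert_id.startswith(SANDBOX_PREFIX)


def sandbox_expiry(issued_at: datetime) -> datetime:
    """Expiry time, assuming the TTL starts at issuance."""
    return issued_at + SANDBOX_TTL


def is_expired(cert_id: str, issued_at: datetime, now: datetime) -> bool:
    """Sandbox certs expire after the TTL; other certs are treated as permanent."""
    return is_sandbox_cert(cert_id) and now >= sandbox_expiry(issued_at)


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(sandbox_expiry(issued).isoformat())
```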

Input modes

Template
Auth: none · POST /api/synth/public/generate

Select a domain template (fraud, healthcare, finance, HR…). CertifiedData applies a pre-validated schema. Fastest path to a certified sandbox dataset. Only engine available without account: light.

{"templateId":"fraud_detection","rows":500,"engine":"light"}
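
A hedged Python equivalent of the template call, using only the standard library. It builds the request without sending it, because the response format is not documented on this page. The base URL is taken from the curl examples; `build_template_request` is an illustrative name, and the engine check is a client-side convenience rather than documented API behavior.

```python
import json
import urllib.request

BASE_URL = "https://certifieddata.io"  # host used by the curl examples on this page


def build_template_request(template_id: str, rows: int, engine: str = "light") -> urllib.request.Request:
    """Build (but do not send) an anonymous sandbox generation request."""
    if engine != "light":
        # The page states that 'light' is the only engine available without an account.
        raise ValueError("anonymous sandbox runs support only the 'light' engine")
    payload = {"templateId": template_id, "rows": rows, "engine": engine}
    return urllib.request.Request(
        f"{BASE_URL}/api/synth/public/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_template_request("fraud_detection", 500)
print(req.get_method(), req.full_url)
```

To execute the call, pass the request to `urllib.request.urlopen(req)` and inspect the returned body.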

Prompt → Schema Plan
Auth: Bearer required · POST /api/genesis/plan

Describe the dataset in natural language. The API returns a typed JSON schema plan you review before executing. Pipeline: prompt → AI schema planner → JSON schema → generate. Prompt text never drives execution directly.

{"prompt":"500 rows of synthetic patient encounters with diagnosis codes and insurance claims"}
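
The review gate in that pipeline can be sketched as follows. The three callables are hypothetical stand-ins (for example, `plan_fn` could wrap POST /api/genesis/plan), and the plan is modeled as a plain dict because its real shape is not shown here. The point of the sketch is that the generation step receives only the approved plan, never the prompt.

```python
from typing import Callable, Optional

SchemaPlan = dict  # stand-in for the typed JSON schema plan; real shape not documented here


def run_reviewed_pipeline(
    prompt: str,
    plan_fn: Callable[[str], SchemaPlan],
    approve_fn: Callable[[SchemaPlan], bool],
    generate_fn: Callable[[SchemaPlan], dict],
) -> Optional[dict]:
    """Plan from a prompt, let a reviewer approve the plan, then generate from the plan."""
    plan = plan_fn(prompt)
    if not approve_fn(plan):
        return None            # rejected plans never reach generation
    return generate_fn(plan)   # only the plan is passed on, not the prompt text


# Usage with local stubs in place of real API calls.
result = run_reviewed_pipeline(
    "500 rows of synthetic patient encounters",
    plan_fn=lambda text: {"columns": ["diagnosis_code", "claim_amount"]},
    approve_fn=lambda plan: "diagnosis_code" in plan["columns"],
    generate_fn=lambda plan: {"generated_columns": plan["columns"]},
)
print(result)
```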

Schema-run
Auth: Bearer required · POST /api/genesis/schema-run

Execute generation from a saved schema ID. First create a schema via POST /api/genesis/schema, then trigger runs against it. Returns a certified output dataset.

{"schema_id":"schema_xxxx","rows":1000,"engine":"ctgan"}

Manifest upload
Auth: Bearer required · POST /api/manifest/upload

Upload a manifest JSON describing the dataset specification, provenance metadata, and generation intent. Used for reproducible multi-job pipelines and CI-driven certification.

{"version":"1","domain":"finance","columns":[{"name":"transaction_id","type":"uuid"},{"name":"amount","type":"float","min":0.01,"max":50000}]}
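
Before uploading, a manifest can be sanity-checked locally. This is a sketch under assumptions: the required keys (`version`, `domain`, `columns`, and per-column `name` and `type`) are inferred from the example above, not from a published manifest specification, and `validate_manifest` is a hypothetical helper.

```python
def validate_manifest(manifest: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for key in ("version", "domain", "columns"):
        if key not in manifest:
            errors.append(f"missing key: {key}")
    columns = manifest.get("columns", [])
    if not isinstance(columns, list):
        errors.append("columns must be a list")
        return errors
    seen = set()
    for index, column in enumerate(columns):
        if not isinstance(column, dict):
            errors.append(f"columns[{index}] must be an object")
            continue
        name = column.get("name")
        if not name or "type" not in column:
            errors.append(f"columns[{index}] needs 'name' and 'type'")
        if name in seen:
            errors.append(f"duplicate column name: {name}")
        if name:
            seen.add(name)
        # Numeric bounds, when both are present, must be ordered.
        if "min" in column and "max" in column and column["min"] > column["max"]:
            errors.append(f"columns[{index}]: min exceeds max")
    return errors


manifest = {
    "version": "1",
    "domain": "finance",
    "columns": [
        {"name": "transaction_id", "type": "uuid"},
        {"name": "amount", "type": "float", "min": 0.01, "max": 50000},
    ],
}
print(validate_manifest(manifest))
```

Running a check like this in CI before the upload call catches malformed manifests without spending an API request.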

Notary / Certify
Auth: 3 free (anonymous) or Bearer · POST /api/notary/certify

Submit any existing artifact (file or hash) for notarization. Issues an Ed25519 attestation certificate without generating new data. First 3 certifications free per account.

{"artifact_type":"dataset","metadata":{"source":"ctgan-v0.7","rows":10000}}

Generation engines

Engine     Plan
light      Free (anonymous)
gaussian   Build+
ctgan      Trust+
hybrid     Govern+
dp-ctgan   Govern+

DP-CTGAN uses active differential privacy accounting with epsilon-bounded noise injection. Do not claim differential privacy for data generated with the light, gaussian, or hybrid engines: those engines add no noise.
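
The engine-to-plan mapping and the differential privacy rule can be expressed as a small lookup. The plan ordering (anonymous < free < build < trust < govern) and the reading of "+" as "this plan or higher" are assumptions inferred from the plan names on this page. Treating dp-ctgan as the only engine that supports a DP claim is the conservative reading of the warning above. All names are illustrative.

```python
PLAN_ORDER = ["anonymous", "free", "build", "trust", "govern"]  # assumed ordering

# Minimum plan per engine, transcribed from the engine list above.
ENGINE_MIN_PLAN = {
    "light": "anonymous",
    "gaussian": "build",
    "ctgan": "trust",
    "hybrid": "govern",
    "dp-ctgan": "govern",
}


def engine_allowed(engine: str, plan: str) -> bool:
    """True when `plan` ranks at or above the engine's minimum plan."""
    required = ENGINE_MIN_PLAN[engine]
    return PLAN_ORDER.index(plan) >= PLAN_ORDER.index(required)


def may_claim_differential_privacy(engine: str) -> bool:
    """Only dp-ctgan is described as doing differential privacy accounting."""
    return engine == "dp-ctgan"


print(engine_allowed("ctgan", "build"), may_claim_differential_privacy("gaussian"))
```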

Certificate levels

Level             Plan        TTL
sandbox.cert.v1   Anonymous   7 days
cert.v1           Free+       Permanent
cert.v2           Build+      Permanent
cert.v3           Trust+      Permanent
cert.v4           Govern+     Permanent

Capability matrix

Action                    Anonymous           Authenticated           Production   Public Proof
Template generation       ✓ 10/24h · 1k rows  ✓ 5 jobs/mo · 10k rows  ✓ unlimited  ✓ verify URL
Prompt → spec → generate  ✗ auth required     ✓ Build+                ✓ Build+     ✓ cert.v1+
Schema-controlled gen     ✗ auth required     ✓ Build+                ✓ Build+     ✓ cert.v2+
Manifest upload           ✗ auth required     ✓ Build+                ✓ Build+     ✓ cert.v1+
Upload + synthesize       ✗ auth required                                          ✓ cert.v1+
Notary / attest           ✓ 3 free            ✓ plan quota            ✓ unlimited  ✓ verify URL
CTGAN engine              ✗ plan required     ✗ Trust+ required       ✓ Trust+     ✓ cert.v1+
DP-CTGAN engine           ✗ plan required     ✗ Govern+ required      ✓ Govern+    ✓ cert.v4
cert.v1 issuance          ✗ sandbox only      ✓ free+                              ✓ registry
cert.v4 issuance                              ✗ Govern+ required      ✓ Govern+    ✓ full chain
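
The anonymous limits for template generation (10 runs per 24 hours, 1k rows) can be mirrored client-side so a script stops before the server rejects it. This is a sketch under assumptions: it reads "1k" as 1,000 rows and treats the 24-hour window as sliding, whereas the server may count a fixed window per IP. `AnonymousQuota` is a hypothetical helper.

```python
from collections import deque
from datetime import datetime, timedelta, timezone

MAX_RUNS = 10                 # anonymous runs per window, from the matrix
WINDOW = timedelta(hours=24)  # assumed to be a sliding window
MAX_ROWS = 1_000              # "1k rows", read as 1,000


class AnonymousQuota:
    """Tracks recent anonymous runs and refuses requests over the stated limits."""

    def __init__(self):
        self._runs = deque()  # timestamps of accepted runs, oldest first

    def try_acquire(self, rows: int, now: datetime) -> bool:
        # Drop runs that have aged out of the window.
        while self._runs and now - self._runs[0] >= WINDOW:
            self._runs.popleft()
        if rows > MAX_ROWS or len(self._runs) >= MAX_RUNS:
            return False
        self._runs.append(now)
        return True


quota = AnonymousQuota()
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(quota.try_acquire(500, start))
```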

API examples

Anonymous sandbox generation
curl -X POST https://certifieddata.io/api/synth/public/generate \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "fraud_detection",
    "rows": 100,
    "engine": "light"
  }'

Schema-controlled generation (authenticated)
curl -X POST https://certifieddata.io/api/generate/schema \
  -H "Authorization: Bearer $CERTIFIEDDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": {
      "type": "object",
      "properties": {
        "transaction_id": {"type": "string", "format": "uuid"},
        "amount": {"type": "number", "minimum": 1, "maximum": 50000},
        "is_fraud": {"type": "boolean"}
      }
    },
    "rows": 1000,
    "engine": "ctgan",
    "cert_level": "cert.v2"
  }'

Claim and upgrade a sandbox cert
curl -X POST https://certifieddata.io/api/sandbox/claim-and-upgrade \
  -H "Authorization: Bearer $CERTIFIEDDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"sandbox_cert_id":"scert_xxxx"}'