Generate and certify synthetic datasets
Five input modes. Five generation engines. Five certificate levels. Every output is SHA-256 fingerprinted and Ed25519 signed — independently verifiable without contacting CertifiedData.
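The fingerprint half of that check can be reproduced locally. The following is a minimal sketch, assuming the fingerprint is the SHA-256 digest of the raw dataset file bytes; this page does not say exactly which bytes are hashed, so treat that as an assumption. The helper names are illustrative and not part of any CertifiedData SDK. Verifying the Ed25519 signature would also need the issuer's public key and a signature library, which is not shown here.

```python
import hashlib
import hmac


def sha256_fingerprint(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_fingerprint(data: bytes, expected_hex: str) -> bool:
    """Compare a recomputed digest with the one stated in a certificate.

    ASSUMPTION: the certificate records the digest as a hex string.
    """
    expected = expected_hex.strip().lower()
    # compare_digest does a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_fingerprint(data), expected)
```

For a file on disk, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `matches_fingerprint` together with the digest from the certificate.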
Live execution — no account required
Sandbox certificates carry a scert_ prefix. Claim and upgrade to a persistent cert.v1 once you have a free account.
Input modes
POST /api/synth/public/generate
Select a domain template (fraud, healthcare, finance, HR…). CertifiedData applies a pre-validated schema. Fastest path to a certified sandbox dataset. Only engine available without account: light.
{"templateId":"fraud_detection","rows":500,"engine":"light"}

POST /api/genesis/plan
Describe the dataset in natural language. The API returns a typed JSON schema plan you review before executing. Pipeline: prompt → AI schema planner → JSON schema → generate. Prompt text never drives execution directly.
{"prompt":"500 rows of synthetic patient encounters with diagnosis codes and insurance claims"}

POST /api/genesis/schema-run
Execute generation from a saved schema ID. First create a schema via POST /api/genesis/schema, then trigger runs against it. Returns a certified output dataset.
{"schema_id":"schema_xxxx","rows":1000,"engine":"ctgan"}

POST /api/manifest/upload
Upload a manifest JSON describing the dataset specification, provenance metadata, and generation intent. Used for reproducible multi-job pipelines and CI-driven certification.
{"version":"1","domain":"finance","columns":[{"name":"transaction_id","type":"uuid"},{"name":"amount","type":"float","min":0.01,"max":50000}]}

POST /api/notary/certify
Submit any existing artifact (file or hash) for notarization. Issues an Ed25519 attestation certificate without generating new data. First 3 certifications free per account.
{"artifact_type":"dataset","metadata":{"source":"ctgan-v0.7","rows":10000}}

Generation engines
| Engine | Plan |
|---|---|
| light | Free (anonymous) |
| gaussian | Build+ |
| ctgan | Trust+ |
| hybrid | Govern+ |
| dp-ctgan | Govern+ |
DP-CTGAN uses active differential privacy accounting with epsilon-bounded noise injection. Do not claim differential privacy for data generated with the light, gaussian, or hybrid engines: they add no noise.
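The engine table and the differential privacy rule can be encoded as a small client-side check. This is a sketch only: the tier ordering and the reading of "X+" as "X or any higher tier" are assumptions inferred from the table, and the function names are hypothetical.

```python
# Minimum plan per engine, copied from the engine table.
ENGINE_MIN_PLAN = {
    "light": "anonymous",
    "gaussian": "build",
    "ctgan": "trust",
    "hybrid": "govern",
    "dp-ctgan": "govern",
}

# ASSUMPTION: tiers rank in this order and "X+" means "X or higher".
PLAN_ORDER = ["anonymous", "free", "build", "trust", "govern"]


def engine_allowed(engine: str, plan: str) -> bool:
    """Return True if the given plan tier meets the engine's minimum tier."""
    required = ENGINE_MIN_PLAN[engine]
    return PLAN_ORDER.index(plan) >= PLAN_ORDER.index(required)


def may_claim_differential_privacy(engine: str) -> bool:
    """Only dp-ctgan is described as adding epsilon-bounded noise."""
    return engine == "dp-ctgan"
```

A pipeline could call `may_claim_differential_privacy` before writing any privacy statement into dataset documentation, so the claim is tied to the engine that actually ran.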
Certificate levels
| Level | Plan | TTL |
|---|---|---|
| sandbox.cert.v1 | Anonymous | 7 days |
| cert.v1 | Free+ | Permanent |
| cert.v2 | Build+ | Permanent |
| cert.v3 | Trust+ | Permanent |
| cert.v4 | Govern+ | Permanent |
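The TTL column translates into a simple expiry calculation. The sketch below assumes the 7-day sandbox TTL is counted from the issuance timestamp, which this page does not state explicitly; `expires_at` is an illustrative helper, not a documented API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# TTL per certificate level, from the table above; None means permanent.
CERT_TTL = {
    "sandbox.cert.v1": timedelta(days=7),
    "cert.v1": None,
    "cert.v2": None,
    "cert.v3": None,
    "cert.v4": None,
}


def expires_at(level: str, issued_at: datetime) -> Optional[datetime]:
    """Return the expiry time for a certificate, or None if it is permanent.

    ASSUMPTION: the TTL starts at issuance.
    """
    ttl = CERT_TTL[level]
    return None if ttl is None else issued_at + ttl


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
sandbox_expiry = expires_at("sandbox.cert.v1", issued)
```

Under that assumption, a sandbox certificate would need to be claimed and upgraded before `sandbox_expiry` to remain verifiable.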
Capability matrix
| Action | Anonymous | Authenticated | Production | Public Proof |
|---|---|---|---|---|
| Template generation | ✓ 10/24h · 1k rows | ✓ 5 jobs/mo · 10k rows | ✓ unlimited | ✓ verify URL |
| Prompt → spec → generate | ✗ auth required | ✓ Build+ | ✓ Build+ | ✓ cert.v1+ |
| Schema-controlled gen | ✗ auth required | ✓ Build+ | ✓ Build+ | ✓ cert.v2+ |
| Manifest upload | ✗ auth required | ✓ Build+ | ✓ Build+ | ✓ cert.v1+ |
| Upload + synthesize | ✗ auth required | ✓ | ✓ | ✓ cert.v1+ |
| Notary / attest | ✓ 3 free | ✓ plan quota | ✓ unlimited | ✓ verify URL |
| CTGAN engine | ✗ plan required | ✗ Trust+ required | ✓ Trust+ | ✓ cert.v1+ |
| DP-CTGAN engine | ✗ plan required | ✗ Govern+ required | ✓ Govern+ | ✓ cert.v4 |
| cert.v1 issuance | ✗ sandbox only | ✓ free+ | ✓ | ✓ registry |
| cert.v4 issuance | ✗ | ✗ Govern+ required | ✓ Govern+ | ✓ full chain |
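The anonymous column of the matrix can be checked before a request is sent. This sketch reads "1k rows" as a limit of 1,000 rows per request, which is an assumption, and requires the engine to be named explicitly as light. The function name is hypothetical. The 10 requests per 24 hours limit depends on server-side state and is not modeled.

```python
from typing import Any, Dict, List

ANON_MAX_ROWS = 1000   # "1k rows" in the Anonymous column (assumed per request)
ANON_ENGINE = "light"  # only engine available without an account


def check_anonymous_request(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload looks fine."""
    problems: List[str] = []
    rows = payload.get("rows")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(rows, int) or isinstance(rows, bool) or rows < 1:
        problems.append("rows must be a positive integer")
    elif rows > ANON_MAX_ROWS:
        problems.append(f"rows exceeds the anonymous limit of {ANON_MAX_ROWS}")
    if payload.get("engine") != ANON_ENGINE:
        problems.append("only the light engine is available without an account")
    return problems
```

Failing fast on the client keeps the daily anonymous quota for requests that have a chance of succeeding.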
API examples
curl -X POST https://certifieddata.io/api/synth/public/generate \
  -H "Content-Type: application/json" \
  -d '{
    "templateId": "fraud_detection",
    "rows": 100,
    "engine": "light"
  }'

curl -X POST https://certifieddata.io/api/generate/schema \
  -H "Authorization: Bearer $CERTIFIEDDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schema": {
      "type": "object",
      "properties": {
        "transaction_id": {"type": "string", "format": "uuid"},
        "amount": {"type": "number", "minimum": 1, "maximum": 50000},
        "is_fraud": {"type": "boolean"}
      }
    },
    "rows": 1000,
    "engine": "ctgan",
    "cert_level": "cert.v2"
  }'

curl -X POST https://certifieddata.io/api/sandbox/claim-and-upgrade \
  -H "Authorization: Bearer $CERTIFIEDDATA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"sandbox_cert_id":"scert_xxxx"}'
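The anonymous template request above can also be built from Python using only the standard library. This sketch mirrors the first curl example's URL, header, and body; `build_template_request` is an illustrative helper, and the request is constructed but not sent, because the response format is not documented on this page.

```python
import json
import urllib.request

BASE_URL = "https://certifieddata.io"


def build_template_request(template_id: str, rows: int,
                           engine: str = "light") -> urllib.request.Request:
    """Build (but do not send) a POST to the public template endpoint."""
    body = json.dumps(
        {"templateId": template_id, "rows": rows, "engine": engine}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/api/synth/public/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_template_request("fraud_detection", 100)
```

Passing `req` to `urllib.request.urlopen` would send it; how to interpret the response body is left open here.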