CertifiedData.io


Certified synthetic dataset purchases

When an agent buys a certified synthetic dataset, the payment record should not be separate from the dataset trust record. CertifiedData binds them: one signed receipt simultaneously proves the dataset is synthetic, unchanged, and paid for.

Independent certification matters

The certificate is issued by CertifiedData — not the seller. Buyers can verify the artifact fingerprint and certification record independently, without relying on seller claims alone. CertifiedData is the notary, not the counterparty.

| Self-attestation | Third-party certification |
| --- | --- |
| "Trust us, it's synthetic" | Independent certificate signed by cd_root_2026 |
| Verify by calling the seller | Verify at a public URL; no seller contact required |
| Record lives in seller's systems | Certificate is portable and durable after vendor relationships change |

What the buyer receives

After a successful execute call, the buyer holds three independently verifiable records, all cryptographically bound to each other.

01

The dataset

The parquet, CSV, or JSON file. SHA-256 hash matches artifact_hash in the receipt.

sha256sum file → matches artifact_hash

02

The certificate

A machine-verifiable certificate proving the dataset is synthetically generated and has not been tampered with since certification.

GET /api/verify/:certificate_id

03

The payment receipt

An Ed25519-signed receipt proving the spend was policy-approved, and referencing the certificate_id and artifact_hash directly.

GET /api/payments/verify/:receipt_id

A compliance team can open two public URLs and prove that the data is synthetic (certificate) and paid for (receipt). No vendor calls. No PDFs. No reliance on seller claims.
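
As a rough illustration of the first record's check, the sketch below hashes a downloaded file in chunks and compares the digest with the receipt's artifact_hash. The helper names (`sha256_hex`, `matches_artifact_hash`) are hypothetical and not part of any CertifiedData SDK. The handling of the `sha256:` prefix assumes the format shown in this page's examples.

```python
import hashlib


def sha256_hex(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_artifact_hash(path, artifact_hash):
    """Compare a local file against an artifact_hash such as 'sha256:<hex>'.

    Assumption: the value is a hex digest with an optional 'sha256:' prefix.
    """
    expected = artifact_hash.removeprefix("sha256:").lower()
    return sha256_hex(path) == expected
```

Calling `matches_artifact_hash("synthetic_users_10k.parquet", receipt["artifact_hash"])` should return `True` only when the file on disk is byte-for-byte the artifact the receipt refers to. `str.removeprefix` needs Python 3.9 or later.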

How it works

Three phases: certify the dataset, run the payment with provenance attached, verify independently.

Phase 1 · Seller certifies the dataset

The seller runs the dataset through CertifiedData's certification API. This produces a certificate_id and registers the SHA-256 hash in the artifact registry with an Ed25519 signature.

# Seller certifies their synthetic dataset
import hashlib
import os
from pathlib import Path

import requests

# Assumption: the API key is supplied through an environment variable
CD_API_KEY = os.environ["CD_API_KEY"]


def sha256_file(path):
    """Return the hex SHA-256 digest of a file (reads the whole file into memory)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


dataset_sha256 = sha256_file("synthetic_users_10k.parquet")

# Step 1: register the artifact in the registry
entry = requests.post("https://certifieddata.io/v1/registry",
    headers={"Authorization": f"Bearer {CD_API_KEY}"},
    json={
        "artifact_type": "synthetic_dataset",
        "sha256": dataset_sha256,
        "metadata": { "rows": 10000, "generation_method": "CTGAN v3" }
    }).json()

registry_id = entry["id"]

# Step 2: certify the registry entry (produces Ed25519-signed certificate)
cert = requests.post(
    f"https://certifieddata.io/v1/registry/{registry_id}/certify",
    headers={"Authorization": f"Bearer {CD_API_KEY}"}
).json()

certificate_id = cert["certification_id"]
# e.g. "cert_syn_7f83b165"

Phase 2 · Buyer's agent pays with provenance attached

The buyer's agent goes through the quote → authorize → execute flow, passing the certificate_id and artifact_hash as provenance metadata. The signed receipt then permanently references both.

import requests

BASE = "https://certifieddatapayments.com"
HEADERS = {"Authorization": "Bearer cdp_live_buyer_…", "Content-Type": "application/json"}

# Hex digest of the dataset, as published in the seller's certificate (value elided here)
dataset_sha256 = "abc…"

# 1. Quote: get a price estimate and confirm the request is well-formed
quote = requests.post(f"{BASE}/api/payments/quote", headers=HEADERS, json={
    "amount_cents": 5000,    # $50.00
    "currency": "usd",
    "merchant_id": "mrc_dataset_vendor_1",
    "rail": "stripe",
    "purpose_tag": "data_access",
    "purpose": "Certified synthetic dataset purchase",
    "artifact_hash": f"sha256:{dataset_sha256}",
    "certificate_id": "cert_syn_7f83b165",
}).json()

# 2. Authorize: evaluates the quote against the active spend policy
auth = requests.post(f"{BASE}/api/payments/authorize", headers=HEADERS, json={
    "quote_id": quote["quote_id"],
}).json()
# Expect auth["decision"] == "authorized" before executing

# 3. Execute: settles the payment and returns a signed receipt
result = requests.post(f"{BASE}/api/payments/execute", headers=HEADERS, json={
    "authorization_id": auth["authorization_id"],
}).json()

receipt = result["receipt"]
# receipt contains: receipt_id, artifact_hash, certificate_id, signature
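
An agent would normally stop the flow when the policy check does not approve the spend. The sketch below is one way to wrap the three calls with that guard. It is not an official client: `purchase`, `PaymentNotAuthorized`, and the injected `post` function are hypothetical names. It also assumes that a rejected authorization is reported through the same `decision` field with a value other than "authorized", which this page does not confirm.

```python
class PaymentNotAuthorized(Exception):
    """Raised when the authorize step returns any decision other than 'authorized'."""


def purchase(post, quote_request):
    """Run quote -> authorize -> execute and return the signed receipt.

    `post(path, payload)` is a caller-supplied function that sends the JSON
    payload to the payments API and returns the decoded JSON response.
    """
    quote = post("/api/payments/quote", quote_request)
    auth = post("/api/payments/authorize", {"quote_id": quote["quote_id"]})
    if auth.get("decision") != "authorized":
        # Stop before execute: no settlement without a policy approval.
        raise PaymentNotAuthorized(auth.get("decision"))
    result = post("/api/payments/execute",
                  {"authorization_id": auth["authorization_id"]})
    receipt = result["receipt"]
    # Check that the receipt references the provenance we asked to pay for.
    for field in ("artifact_hash", "certificate_id"):
        if receipt.get(field) != quote_request[field]:
            raise ValueError(f"receipt {field} does not match the request")
    return receipt
```

With the `requests` setup shown above, `post` could be `lambda path, payload: requests.post(BASE + path, headers=HEADERS, json=payload).json()`. Injecting the transport keeps the guard logic testable without network access.
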
Phase 3 · Anyone verifies independently

No account required. The buyer, an auditor, or a downstream system verifies the full chain with two public API calls.

# Verify the certificate (dataset is synthetic + unmodified)
curl https://certifieddata.io/api/manifest/certificate/cert_syn_7f83b165
# → { "certificationId": "cert_syn_7f83b165", "signatureValid": true,
#     "datasetHash": "sha256:abc…", "issuer": "CertifiedData.io" }

# Verify the payment receipt (spend was policy-approved for this cert)
curl https://certifieddatapayments.com/api/payments/verify/rcpt_01jbm…
# → { "verified": true, "hashValid": true, "signatureValid": true,
#     "receipt": { "certificate_id": "cert_syn_7f83b165", "artifact_hash": "sha256:abc…" } }

# Cross-check: sha256sum the file you downloaded
sha256sum synthetic_users_10k.parquet
# → must match receipt.artifact_hash and certificate.datasetHash
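
The same cross-check can be automated once the two verify responses have been fetched. The following is a minimal sketch, assuming the response fields look like the sample outputs above (`certificationId`, `signatureValid`, `datasetHash`, `verified`, `hashValid`, and the nested `receipt`) and that hashes are lower-case hex with a `sha256:` prefix. The function name `cross_check` is hypothetical.

```python
def cross_check(certificate, receipt_check, local_sha256_hex):
    """Return a list of problems; an empty list means the chain is consistent.

    `certificate` and `receipt_check` are the decoded JSON bodies of the two
    verify calls; `local_sha256_hex` is the sha256sum of the downloaded file.
    """
    problems = []
    receipt = receipt_check.get("receipt", {})
    local_hash = "sha256:" + local_sha256_hex.lower()

    if not certificate.get("signatureValid"):
        problems.append("certificate signature not valid")
    for flag in ("verified", "hashValid", "signatureValid"):
        if not receipt_check.get(flag):
            problems.append(f"receipt check failed: {flag}")
    if receipt.get("certificate_id") != certificate.get("certificationId"):
        problems.append("receipt references a different certificate")
    if receipt.get("artifact_hash") != local_hash:
        problems.append("file hash does not match receipt.artifact_hash")
    if certificate.get("datasetHash") != local_hash:
        problems.append("file hash does not match certificate.datasetHash")
    return problems
```

Returning a list of problems rather than a single boolean lets an auditor see which link in the chain failed: the certificate, the receipt, or the file itself.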

Why this matters for regulated buyers

| Current approach | Problem | CertifiedData approach |
| --- | --- | --- |
| Stripe/Gumroad for datasets | Buyer gets a file. No proof it's synthetic, certified, or unchanged. | Receipt references certificate_id + artifact_hash. Compliance-ready. |
| HuggingFace / open download | Trust the platform's curation. No per-dataset payment proof. | Every download is a paid capture with a signed receipt and public cert. |
| Internal logs + invoices | Only you can verify. Auditors need your infrastructure. | Two public URLs prove dataset integrity and payment. No trust needed. |

Build a certified synthetic data supply chain

Certify your datasets, expose paid download endpoints, and give every buyer a receipt that proves what they paid for. See the full batch-certify + commerce pattern.
