Hashing and digital signatures are often conflated but serve distinct roles in AI certification systems. SHA-256 hashing proves that a dataset has not changed. Ed25519 signatures prove that a certificate was issued by the claimed authority and has not been modified.
Both mechanisms are required for complete verification. A hash alone cannot prove who certified the dataset. A signature alone cannot prove the dataset is intact.
What hashing proves
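A cryptographic hash function like SHA-256 transforms a dataset of any size into a fixed-length fingerprint (256 bits for SHA-256). Two properties matter here. The function is deterministic: the same input always produces the same output. It is also collision-resistant: any change to the input, even a single byte, produces a completely different output, and finding two different inputs with the same hash is computationally infeasible.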
In AI certification, the dataset hash is computed at generation time and embedded in the certificate. At verification time, the hash is recomputed and compared. A match proves the dataset is byte-for-byte identical to the certified version.
- Proves: the dataset has not been modified since certification
- Does not prove: who certified the dataset
- Does not prove: the certificate is legitimate
- Use case: dataset integrity verification
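The hash-and-compare step described above can be sketched with Python's standard `hashlib` module. This is a minimal illustration, not CertifiedData's implementation: the function names `sha256_hex` and `dataset_is_intact` and the sample CSV bytes are hypothetical, chosen only for the example.

```python
import hashlib
import io


def sha256_hex(stream, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def dataset_is_intact(stream, certified_hash):
    """Recompute the hash and compare it to the value stored in the certificate."""
    return sha256_hex(stream) == certified_hash.lower()


# Hypothetical dataset contents, used only for illustration.
original = b"id,label\n1,cat\n2,dog\n"
certified_hash = sha256_hex(io.BytesIO(original))  # computed at generation time

# A single changed byte is enough to break the match.
tampered = original.replace(b"dog", b"dot")

print(dataset_is_intact(io.BytesIO(original), certified_hash))  # True
print(dataset_is_intact(io.BytesIO(tampered), certified_hash))  # False
```

Reading in chunks keeps memory use flat for large datasets. Note that nothing in this check says who produced `certified_hash`, which is exactly the gap listed above.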
What signatures prove
A digital signature is produced by applying a private key to a message — in this case, the certificate payload. The signature can be verified by anyone with the corresponding public key, confirming that the message was produced by the holder of the private key and has not been altered since signing.
In AI certification, the Ed25519 signature covers the entire certificate payload. A signature that verifies against CertifiedData's published public key proves the certificate was issued by CertifiedData and that none of its fields have been modified since signing.
- Proves: the certificate was issued by CertifiedData
- Proves: the certificate has not been modified since issuance
- Does not prove: the dataset is intact
- Use case: certificate authenticity verification
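Signing and verifying a certificate payload might look like the sketch below. It assumes the third-party `cryptography` package is installed, and it generates a throwaway key pair as a stand-in for the issuer's real key. The certificate field names and the JSON serialization are hypothetical, since the actual payload format is not specified here.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def signature_is_valid(public_key, payload, signature):
    """Return True if the signature verifies for this exact payload."""
    try:
        public_key.verify(signature, payload)
        return True
    except InvalidSignature:
        return False


# Stand-in for the issuer's key pair (assumption: a real issuer keeps the
# private key secret and publishes only the public key).
issuer_key = Ed25519PrivateKey.generate()
issuer_public_key = issuer_key.public_key()

# Hypothetical certificate fields, serialized deterministically so that
# signer and verifier operate on identical bytes.
certificate = {"dataset_sha256": "ab" * 32, "issuer": "CertifiedData"}
payload = json.dumps(certificate, sort_keys=True, separators=(",", ":")).encode("utf-8")
signature = issuer_key.sign(payload)

# Changing any field invalidates the signature.
altered = payload.replace(b"CertifiedData", b"SomeoneElse")

print(signature_is_valid(issuer_public_key, payload, signature))  # True
print(signature_is_valid(issuer_public_key, altered, signature))  # False
```

The signature covers only the payload bytes. The dataset itself never enters this check, which is why a valid signature says nothing about whether the dataset is intact.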
Why both are required
A hash without a signature proves nothing about origin. Anyone could create a dataset, compute its hash, and forge a certificate claiming the dataset was certified by CertifiedData. Without a signature, there is no way to verify that the certificate is genuine.
A signature without a hash proves the certificate is authentic but says nothing about the current state of the dataset. The dataset could have been modified after certification without affecting the certificate signature.
Together, the hash and signature create a complete verification chain: the dataset is intact (hash match) and the certificate is legitimate (signature valid). Both checks must pass.
The complete verification model
In CertifiedData's system: the dataset is hashed with SHA-256, the hash is embedded in the certificate payload, the payload is signed with Ed25519, and the public key is published for independent verification.
Verification requires two independent checks: recompute the dataset hash and compare to the certificate (integrity), then verify the certificate signature against the public key (authenticity). A certificate that passes both checks provides complete, independently verifiable proof of AI artifact provenance.
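The two-check model can be sketched end to end as follows. This is an illustration under stated assumptions rather than CertifiedData's actual code: it relies on the third-party `cryptography` package, and `issue_certificate`, `verify_artifact`, the JSON payload layout, and the `dataset_sha256` field name are all hypothetical.

```python
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def issue_certificate(dataset, signing_key):
    """Hash the dataset, embed the hash in a payload, and sign the payload."""
    fields = {
        "dataset_sha256": hashlib.sha256(dataset).hexdigest(),
        "issuer": "CertifiedData",
    }
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return payload, signing_key.sign(payload)


def verify_artifact(dataset, payload, signature, public_key):
    """Run both checks and return (integrity_ok, authenticity_ok)."""
    # Authenticity: does the signature verify against the published key?
    try:
        public_key.verify(signature, payload)
        authenticity_ok = True
    except InvalidSignature:
        authenticity_ok = False

    # Integrity: does the recomputed hash match the one in the certificate?
    try:
        certified_hash = json.loads(payload)["dataset_sha256"]
    except (ValueError, KeyError, TypeError):
        certified_hash = None
    integrity_ok = hashlib.sha256(dataset).hexdigest() == certified_hash

    return integrity_ok, authenticity_ok


issuer_key = Ed25519PrivateKey.generate()  # stand-in for the issuer's key
published_key = issuer_key.public_key()
dataset = b"id,label\n1,cat\n2,dog\n"      # hypothetical dataset bytes
payload, signature = issue_certificate(dataset, issuer_key)

# Intact dataset, genuine certificate.
print(verify_artifact(dataset, payload, signature, published_key))

# Dataset modified after certification: the signature still verifies.
modified = dataset + b"3,fox\n"
print(verify_artifact(modified, payload, signature, published_key))

# Certificate signed by someone other than the issuer: the hash still matches.
forged_payload, forged_signature = issue_certificate(dataset, Ed25519PrivateKey.generate())
print(verify_artifact(dataset, forged_payload, forged_signature, published_key))
```

Returning the two results separately, instead of a single boolean, lets a caller tell a modified dataset apart from an untrustworthy certificate.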
Frequently asked questions
Can a hash replace a signature?
No. A hash proves a dataset has not changed but cannot prove who certified it. Signatures are required to prove certificate authenticity and issuer identity.
Can a signature replace a hash?
No. A signature proves the certificate is authentic but does not confirm the current state of the dataset. The dataset hash is required to detect post-certification modification.
What if only one check passes?
If the hash matches but the signature fails, the certificate may have been forged or tampered with. If the signature is valid but the hash does not match, the dataset has been modified since certification. Both checks must pass for the artifact to be considered valid.
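The outcomes above can be written as a small decision function. The name `interpret_checks` and the message strings are hypothetical, and the wording for the case where both checks fail is an assumption that follows from the rule that both must pass.

```python
def interpret_checks(integrity_ok, authenticity_ok):
    """Map the two verification results to a human-readable verdict."""
    if integrity_ok and authenticity_ok:
        return "valid: dataset intact and certificate authentic"
    if integrity_ok:
        # Hash matches, signature fails.
        return "invalid: certificate may be forged or tampered with"
    if authenticity_ok:
        # Signature verifies, hash does not match.
        return "invalid: dataset modified since certification"
    # Assumption: with neither check passing, nothing about the artifact is established.
    return "invalid: both checks failed"


for hash_ok, sig_ok in [(True, True), (True, False), (False, True), (False, False)]:
    print(hash_ok, sig_ok, "->", interpret_checks(hash_ok, sig_ok))
```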
See both mechanisms in action
CertifiedData certificates include both the SHA-256 dataset hash and the Ed25519 signature — independently verifiable by any party.