dpdap — receipts for differential privacy

The 60‑second version

You read in the news that “1 in 5 people in your county has diabetes.” That number probably came from a public release. The agency that published it likely added a tiny amount of statistical noise — the true count might have been 18,213 cases and the published number 18,247 — so no individual person could be singled out from the data. That’s differential privacy.

Today, you have no way to check whether the protection was applied correctly, or how strong it was. The institution publishes a number; you trust them.

dpdap attaches a small, signed “privacy receipt” to every release, recording who published it, what protection was used, and how strong it was. Anyone can verify the signature in seconds. For releases that publish a fresh draw every reporting interval, such as streaming telemetry or daily-refresh statistics, the receipt’s noise claim can also be tested against the data. dpdap consumes the schema of the OpenDP Deployment Registry, the public-facing index of real DP deployments maintained by Harvard’s OpenDP project, whose open-source library backs the IRS, Wikimedia Foundation, and UNHCR releases. It also aligns with the federal evaluation guidance in NIST SP 800-226.^[1]

What changes, in one picture

[Figure: side-by-side comparison with two panels, “Today” and “With dpdap”.]

Differential privacy in 90 seconds

Differential privacy was formalized in 2006 by Cynthia Dwork and collaborators.^[2] The idea is direct: when you publish a statistic, add a small amount of carefully calibrated random noise to it. The noise comes from a known distribution, typically Laplace or Gaussian, both symmetric curves centered on zero. The published count might therefore be a few units higher or lower than the true count, with the size of the wobble pinned to a known number. Done right, the noise is enough to hide any one person’s contribution while leaving the overall pattern intact.
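As a minimal sketch of that idea, the Python below adds Laplace noise to a count. The function names, the sensitivity of 1, and the use of Python's floating-point `random` module are illustrative assumptions, not dpdap's or any agency's implementation; production systems use hardened samplers, because naive floating-point noise can itself leak information.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centered on zero with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Return the count plus Laplace noise with scale = sensitivity / epsilon.

    sensitivity = 1 assumes one person changes the count by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random()  # illustration only, not a secure randomness source
    print(round(noisy_count(18_213, epsilon=1.0, rng=rng)))
```

Each run prints a slightly different number near 18,213. That run-to-run wobble is the protection.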

The strength of the protection is measured by a small Greek letter, ε (epsilon). Smaller ε means more noise and stronger privacy; larger ε means less noise and weaker privacy. ε is also called a privacy budget because every query an institution publishes from a dataset spends some of it; once the total is exhausted, the institution must stop publishing or accept that further releases erode the guarantee. There is no universal “right” value, but the literature has rough reference points: academic work that calls a release “strongly private” usually means ε ≤ 1; mainstream production deployments sit somewhere between ε ≈ 1 and ε ≈ 10; the 2020 U.S. Census release was published above that range, at ε ≈ 19.61, which several academics criticized as too weak.^[3] Apple’s on-device keyboard analytics report single-digit per-event ε values, with the budget rolled up daily.^[4]
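The relationship between ε, noise, and the budget can be sketched in a few lines. This assumes the Laplace mechanism (noise scale = sensitivity / ε) and basic sequential composition, where the ε values of successive releases simply add up; real deployments often use tighter accounting. The class and function names are hypothetical.

```python
def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale for the Laplace mechanism: smaller epsilon, larger scale."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


class PrivacyBudget:
    """Tracks epsilon spent under basic sequential composition (a simplification)."""

    def __init__(self, total_epsilon: float) -> None:
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> float:
        """Record one release and return the budget that remains."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so float rounding does not reject an exact spend.
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon
        return self.total - self.spent


if __name__ == "__main__":
    print(laplace_scale(1.0, 0.1), laplace_scale(1.0, 10.0))
    budget = PrivacyBudget(total_epsilon=1.0)
    print(budget.spend(0.4))
    print(budget.spend(0.6))
```

A third call to `budget.spend(...)` in this example would raise, which is the code-level version of "stop publishing or accept a weaker guarantee."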

Differential privacy is rigorous in a way that older approaches — like “remove the names and date of birth” — are not. It comes with a mathematical proof, not a hopeful intuition. That proof is the thing dpdap is built to let you verify.

How dpdap actually checks the math

The premise is straightforward. An institution’s release arrives accompanied by a signed receipt that says, for example, “I added Laplace noise calibrated to ε = 1.0.” dpdap’s job is to check whether the noise actually applied matches that claim.

It does this by running the underlying release process many independent times (in synthetic mode, against either the real aggregator or a controlled simulation), collecting the noise samples, and running standard statistical tests:

Mean check. Is the noise centered around zero, as the claimed mechanism predicts? Catches gross bias — a one-sided RNG, a missing sign flip.
Variance check. Is the noise spread the right amount? Catches scale errors — an off-by-two sensitivity bound, the wrong epsilon plugged into the formula.
Distribution-shape check (Kolmogorov–Smirnov). Does the shape of the noise distribution match the claim? This is the strongest of the three; it catches “wrong distribution” failures — for instance, an aggregator that draws from a Gaussian while advertising Laplace.
Empirical-ε estimate. Using the noise samples themselves, dpdap fits the noise scale by maximum likelihood and reports an estimated ε with a 95% confidence interval. If that interval sits well above the claimed ε, the aggregator is adding less noise than disclosed.
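The four checks above can be sketched with nothing but the Python standard library. This is an illustration under stated assumptions, not dpdap's code: it assumes a Laplace claim with known sensitivity, uses the asymptotic 5% critical value for the KS test, and swaps the variance χ² test for a large-sample z-score on the second moment. All function names are hypothetical.

```python
import math
import random


def laplace_cdf(x: float, b: float) -> float:
    """CDF of a Laplace distribution centered on zero with scale b."""
    return 0.5 * math.exp(x / b) if x < 0 else 1.0 - 0.5 * math.exp(-x / b)


def check_noise(residuals, claimed_epsilon, sensitivity=1.0):
    """Compare noise residuals (published minus true) with a Laplace claim."""
    n = len(residuals)
    b = sensitivity / claimed_epsilon  # claimed noise scale

    # Mean check: under the claim the mean is 0 with std error sqrt(2)*b/sqrt(n).
    mean_z = (sum(residuals) / n) / (math.sqrt(2.0) * b / math.sqrt(n))

    # Spread check: E[x^2] = 2b^2 and Var[x^2] = 20b^4 for Laplace(0, b).
    second_moment = sum(x * x for x in residuals) / n
    spread_z = (second_moment - 2.0 * b * b) / (b * b * math.sqrt(20.0 / n))

    # Shape check: one-sample Kolmogorov-Smirnov distance to the claimed CDF.
    xs = sorted(residuals)
    ks = max(
        max((i + 1) / n - laplace_cdf(x, b), laplace_cdf(x, b) - i / n)
        for i, x in enumerate(xs)
    )
    ks_critical = 1.358 / math.sqrt(n)  # asymptotic 5% critical value

    # Empirical epsilon: the MLE of the Laplace scale (location fixed at 0)
    # is the mean absolute residual; the Wald interval uses the delta method.
    b_hat = sum(abs(x) for x in residuals) / n
    if b_hat == 0.0:
        eps_hat, ci = math.inf, (math.inf, math.inf)
    else:
        eps_hat = sensitivity / b_hat
        half = 1.96 * eps_hat / math.sqrt(n)
        ci = (eps_hat - half, eps_hat + half)

    return {"mean_z": mean_z, "spread_z": spread_z, "ks": ks,
            "ks_critical": ks_critical, "epsilon_hat": eps_hat, "epsilon_ci": ci}


def laplace_sample(scale, rng):
    """Laplace(0, scale) draw as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


if __name__ == "__main__":
    rng = random.Random(7)
    honest = [laplace_sample(1.0, rng) for _ in range(400)]     # true epsilon 1
    too_quiet = [laplace_sample(0.5, rng) for _ in range(400)]  # true epsilon 2
    print(check_noise(honest, claimed_epsilon=1.0))
    print(check_noise(too_quiet, claimed_epsilon=1.0))
```

In the second call the aggregator adds half the claimed noise, so the KS distance should exceed its critical value and the ε interval should sit above the claimed 1.0.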

Every probe run also publishes a sample-adequacy note: the smallest deviation from the claimed noise scale the run could have detected with 80% power at this sample size and significance level α = 0.05. If the test could only have caught a 2× under-noising and the data came back consistent with the claim, the report says so — no false confidence about what the data resolution actually allows.
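One way such a note could be computed, sketched under a normal approximation: if the noise scale is estimated by the mean absolute residual, its standard error under Laplace noise is roughly scale/√n, which gives a closed form for the smallest under-noising a two-sided test would catch. This illustrates the idea and is not necessarily the formula dpdap uses; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def min_detectable_scale_ratio(n: int, alpha: float = 0.05,
                               power: float = 0.80) -> float:
    """Largest ratio (actual scale / claimed scale) below 1 that a two-sided
    test on the mean absolute residual detects with the requested power.

    Normal approximation: the scale estimate has standard error scale/sqrt(n).
    Solving Phi((sqrt(n)*(1 - r) - z_alpha) / r) = power for r gives
    r = (sqrt(n) - z_alpha) / (sqrt(n) + z_power).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    root_n = math.sqrt(n)
    if root_n <= z_alpha:
        raise ValueError("sample too small for this approximation")
    return (root_n - z_alpha) / (root_n + z_power)


if __name__ == "__main__":
    for n in (25, 400, 10_000):
        r = min_detectable_scale_ratio(n)
        # A scale ratio r corresponds to an actual epsilon of claimed / r.
        print(n, round(r, 3), round(1.0 / r, 3))
```

Under this approximation, a run of 400 samples can resolve noise that is about 13% too small, while a run of about two dozen samples can only resolve roughly a 2× shortfall.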

What the probe tests is a parametric claim: “this output is a draw from a known noise distribution at a stated scale.” The probe does not certify that the underlying mechanism is differentially private. It tests whether the data looks like the declaration. Anything more would be an overstatement of what a black-box check can do.

The probe never reports “Pass.” It reports Inconclusive (the data is consistent with the claim), Failed (the data is not), or Skipped (an output is declared as a public invariant, or a post-processing step has altered the noise distribution so the tests don’t apply). This is not pedantic. A black-box empirical probe can falsify weak DP claims from the output side; on its own, it cannot prove a good claim. Cryptographic verification — zero-knowledge proofs of correct mechanism application (Biswas, Dong, et al., PETS 2025) — can deliver a stronger formal guarantee at substantial computational cost. Empirical conformance and cryptographic verification are complementary: an empirical probe is cheap and runs against any pipeline that publishes repeatedly; a ZK proof is heavyweight and requires the publisher to ship the circuit and a prover.
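The three-outcome rule is easy to state in code. The sketch below is a hypothetical rendering of the logic described above, not dpdap's API: the enum deliberately has no "Pass" member, and the argument names are invented for illustration.

```python
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    INCONCLUSIVE = "Inconclusive"  # data is consistent with the claim
    FAILED = "Failed"              # data contradicts the claim
    SKIPPED = "Skipped"            # tests do not apply to this output


def decide(is_public_invariant: bool, post_processed: bool,
           test_rejections: Iterable[bool]) -> Verdict:
    """Map test results to a verdict; no branch ever returns a 'pass'."""
    if is_public_invariant or post_processed:
        return Verdict.SKIPPED
    if any(test_rejections):
        return Verdict.FAILED
    return Verdict.INCONCLUSIVE


if __name__ == "__main__":
    print(decide(False, False, [False, False, False]).value)
    print(decide(False, False, [False, True, False]).value)
    print(decide(True, False, []).value)
```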

In practice, the probe is naturally suited to streaming aggregates and telemetry: daily-refresh sketches and, in particular, DAP pipelines. DAP, the Distributed Aggregation Protocol, is an IETF-track system that Mozilla, Cloudflare, and ISRG (Let’s Encrypt’s parent organization) use to gather usage metrics. It splits each device’s contribution across two non-colluding aggregator servers, which then combine and noise the result before releasing it.^[6] Anything that publishes a fresh draw every reporting interval is in the probe’s natural territory. For one-shot statistical releases such as a decennial census file, only the signed receipt applies; the probe needs samples the release does not provide.
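The splitting step can be illustrated with plain additive secret sharing. This is a heavily simplified sketch: real DAP deployments use the Prio3 family with validity proofs and HPKE-encrypted reports, none of which appears here, and the modulus and function names are assumptions chosen for readability.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical field size, large enough for small counts


def split(value: int) -> tuple[int, int]:
    """Split one device's value into two shares that sum to it mod MODULUS.

    Each share alone is uniformly random and reveals nothing about the value.
    """
    share_a = secrets.randbelow(MODULUS)
    share_b = (value - share_a) % MODULUS
    return share_a, share_b


def aggregate(shares) -> int:
    """What one aggregator does: sum the shares it received."""
    return sum(shares) % MODULUS


def combine(total_a: int, total_b: int) -> int:
    """Adding the two aggregators' totals recovers the sum of all values."""
    return (total_a + total_b) % MODULUS


if __name__ == "__main__":
    device_values = [1, 0, 1, 1, 0]  # e.g. "did this feature get used today?"
    pairs = [split(v) for v in device_values]
    total_a = aggregate(a for a, _ in pairs)
    total_b = aggregate(b for _, b in pairs)
    print(combine(total_a, total_b))  # the exact sum, before any noise is added
```

The differential-privacy noise is applied to the combined total afterwards, and that noised output is what the probe examines.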

The probe in action

400 batches of synthetic reports run through dpdap’s probe — the same routine an auditor would run against a live release.

The aggregator does what it claims: residuals track the Laplace curve, the empirical-ε estimate brackets the claim. The probe returns Inconclusive — the data is consistent with the claim.

The aggregator claims Laplace noise and adds none at all: every residual lands on zero, KS distance pegs at 0.5. The probe returns Failed — the claim is falsified.
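The 0.5 figure follows directly from the definition of the KS distance. With no noise, every residual is exactly zero, so the empirical distribution jumps from 0 to 1 at zero, while any zero-centered Laplace distribution has half its mass on each side. A short, self-contained check (function names are illustrative):

```python
import math


def laplace_cdf(x: float, b: float) -> float:
    """CDF of a Laplace distribution centered on zero with scale b."""
    return 0.5 * math.exp(x / b) if x < 0 else 1.0 - 0.5 * math.exp(-x / b)


def ks_distance(residuals, b: float) -> float:
    """One-sample KS distance between the residuals and Laplace(0, b)."""
    n = len(residuals)
    xs = sorted(residuals)
    return max(
        max((i + 1) / n - laplace_cdf(x, b), laplace_cdf(x, b) - i / n)
        for i, x in enumerate(xs)
    )


no_noise = [0.0] * 400                # an aggregator that adds nothing
print(ks_distance(no_noise, b=1.0))   # 0.5
print(ks_distance(no_noise, b=10.0))  # 0.5, whatever scale was claimed
```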

Why now

This stopped being academic a while ago. In the last few years:

The Census Bureau built its 2020 release around differential privacy and committed to using it for the next decade.
The Wikimedia Foundation, the IRS, and the UN’s refugee agency publish or pilot differentially private releases built on the open-source OpenDP toolkit.
New EU and US rules reward privacy-preserving data sharing without naming DP outright. The EU Data Act (December 2023) requires “appropriate technical and organisational measures” for shared data; the EU AI Act (August 2024) imposes data quality and provenance obligations on high-risk systems; the US executive order on bulk sensitive personal data (EO 14117, March 2024) governs cross-border transfers. Differential privacy is the most rigorous answer the field currently offers to those obligations; a verifiable receipt makes that answer inspectable.
Generative AI has made data leakage — the property of a system letting an outsider recover details about specific people in its training data, by querying the model — concrete and easy to demonstrate. Differential privacy is one of the few defenses that comes with a quantitative bound on how much a curious adversary can learn.

The math is being deployed. The receipts are missing.

The TLS analogy

In 1995, every website was http://. There was no lock icon. You had no way to know if your bank’s login page was actually your bank’s login page. People knew the encryption math worked — but it was invisible at the boundary where a normal person made a decision.

Today nearly every site you visit shows you a small lock. You probably haven’t thought about it in years. That happened because TLS got standardized at the IETF, the Internet Security Research Group’s Let’s Encrypt made certificates free and automated, and browsers wired the result into the address bar. Ten years ago, only 39% of page loads were encrypted; today, in most of the world, the number is close to 100%.^[7] The math, the protocol, and the user-visible signal all had to ship together, and it took a decade of compounding standards work.

Differential privacy is in 1995’s position. The math is solved, the deployments are real, and the user-visible signal a non-specialist can check is missing. The same ingredients are already on the table: Harvard’s OpenDP is the library and registry; ISRG’s Divvi Up and Cloudflare are running the DAP protocol; NIST has published the evaluation framework. dpdap is building one piece of the user-visible signal: a signed receipt anyone can verify, and a probe that tests the receipt’s claim against the data when there are samples to test.

The vision

For publishers

A one-line addition to an existing publishing workflow that attaches a signed receipt to a release.

No pipeline rewrite. No new infrastructure.

For everyone else

A one-line check (or a click in your browser) that validates a published statistic against its receipt.

Same check, in a web page, for non-technical readers.

Who this is for

Civic agencies publishing population, health, transit, or education data who want to demonstrate — not just claim — that their releases protect individuals.
Journalists and researchers citing official statistics, who want to know the privacy strength behind the numbers they print.
Privacy regulators who currently have no machine-readable artifact to audit.
Software developers building usage analytics or industry benchmarks who want to count things across a population without storing per-user records they’d later have to defend — and want a number their users can verify rather than a privacy policy users have to take on faith.
Citizens who’d like to actually check, the same way you check the lock icon on a banking site.

Where this fits in the broader DP world

Differential privacy is in production. The U.S. Census Bureau built its 2020 release on it and committed to using it for the next decade. The Wikimedia Foundation publishes reader analytics under it. The IRS uses it for Statistics of Income releases. The UN refugee agency is piloting it on microdata. Apple, Google, Mozilla, and Cloudflare all use it for usage telemetry — the measurements an app or browser sends back to its maker about how it is being used. The public-sector deployments build on the open-source OpenDP toolkit (which absorbed Tumult Analytics and Tumult Core in October 2025).^[5] The platform telemetry deployments split: Mozilla Firefox runs through ISRG’s Divvi Up, Cloudflare runs its own DAP implementation (Daphne); Apple and Google maintain their own internal DP stacks. The mathematics layer is solved.

Several active research and standards efforts are converging on the missing disclosure layer:

The OpenDP Deployment Registry (registry.opendp.org), launched September 2025, is the first live machine-readable index of real DP deployments. Its three-tier schema (Nanayakkara, Ghazi, Vadhan, arXiv 2509.13509) is what dpdap consumes. The paper explicitly flags in §7.1 that publication metadata alone cannot verify a privacy claim; the conformance check has to come from somewhere else. The probe is what comes from somewhere else.
A 2026 PETS paper, “We Need a Standard”: Toward an Expert-Informed Privacy Label for Differential Privacy (Dibia, Lu, Bhattacharjee, Near, Feng), proposes a human-readable privacy label aimed at end-user comprehension, with a working demo at privacylabel4dp.github.io.
The IETF Privacy Preserving Measurement working group is standardizing DAP itself (draft-ietf-ppm-dap) and a proposed extension that binds a DAP release to its claimed privacy budget (draft-thomson-ppm-dap-dp-ext).
Recent academic work (PETS 2025) uses zero-knowledge proofs to make DP application cryptographically verifiable — heavyweight but rigorous, and complementary to empirical conformance probing.

dpdap is the signature, verification, and empirical conformance layer on top of these. The receipt format borrows from the disclosure-label work. The probe makes the receipt’s claims testable. A consumer-side verifier, eventually compiled to WebAssembly, makes the testing accessible to anyone with a browser.

Under the hood

dpdap is a Rust workspace of seven crates with 141 tests and continuous integration on Linux and macOS. Licensed Apache-2.0.

Conformance probe — KS test, mean z-test, variance χ², empirical-ε MLE with Wald CI for Laplace mechanisms, Gaussian σ estimator. A JanusAdapter speaks draft‑ietf‑ppm‑dap‑17 with HPKE and Prio3Sum, enabling probing against real DAP deployments over HTTP.
Receipt layer — Ed25519 signing and verification of OpenDP Deployment Registry disclosures (arXiv 2509.13509), with a second adapter for the Dibia privacy label (PETS 2026). CBOR and JSON encodings. Verifies in the CLI and in any browser via WebAssembly.
Python binding — PyO3/maturin wheel (abi3‑py39): probe_mock(), verify_receipt(), generate_keypair(), sign_receipt().

Four draft IETF issue write-ups — underspecified receipt format, absent conformance-test guidance, budget-binding ambiguity in draft‑thomson‑ppm‑dap‑dp‑ext, and a proposed audit-endpoint extension to the OpenDP Deployment Registry schema — are ready to file with working test cases attached.

Sibling project: modelreceipt

dpdap is about differentially-private aggregate releases: public statistics, DAP measurements, census-like tables, and other scalar or tabular outputs where the receipt names a noise mechanism and the probe can test the distribution of repeated releases.

modelreceipt carries the same public-verifiability idea to DP model releases: DP-SGD trained LLMs, synthetic datasets sampled from those models, and downstream models trained on the synthetic data. The core relationship is the same — signed receipts plus empirical probes — but the technical surface is different: privacy units, accounting assumptions, model artifacts, canary audits, extraction probes, and synthetic-data composition.

Get in touch

If you cover privacy, AI, or civic technology — or you work on a release pipeline that publishes differentially private statistics and have an opinion about what a verifiable receipt should look like — I’d like to talk. The differential-privacy era is here, the field is still running on faith, and the design choices being made now will outlast a lot of louder news cycles.

jamesdreben@gmail.com

Notes

1. Joseph Near, David Darais, Naomi Lefkovitz, and Gary Howarth, Guidelines for Evaluating Differential Privacy Guarantees, NIST SP 800-226, March 2025. §2.7 describes auditing for data release systems as “running the algorithm being tested many times to determine if the distribution of results satisfy the differential privacy definition”, which is the operating principle of dpdap’s probe.
2. Cynthia Dwork, Differential Privacy, ICALP 2006. The companion paper introducing the Laplace mechanism is Dwork, McSherry, Nissim, and Smith, Calibrating Noise to Sensitivity in Private Data Analysis, TCC 2006.
3. The privacy-loss budget for the 2020 Decennial Census is documented by the U.S. Census Bureau’s Disclosure Avoidance publications. For the academic critique that ε ≈ 19.61 is too weak, see Hotz et al., Balancing data privacy and usability in the federal statistical system, PNAS 2022.
4. Apple, Differential Privacy Overview. Per-event ε varies by feature (single-digit values for emoji and keyboard analytics) with a per-day budget rolled up daily.
5. OpenDP, Welcoming Tumult Analytics and Tumult Core to OpenDP, October 2025. The open-source projects (Tumult Analytics and Tumult Core) joined OpenDP; the Tumult Labs commercial team announced separately that it was joining LinkedIn.
6. DAP is documented in draft‑ietf‑ppm‑dap, currently at version 18. Cloudflare Research published a non-specialist deep dive; Divvi Up, the ISRG-operated DAP service used by Mozilla Firefox, recounts the standardization history. Reference implementations: ISRG’s Janus and Cloudflare’s Daphne. DAP is a secret-sharing protocol; the differential-privacy layer (Laplace, Gaussian, or discrete-Gaussian noise on the aggregate) is applied on top, and is what the probe tests.
7. Encryption-coverage figures from the 2025 ISRG Annual Report: 39% of page loads encrypted in 2015, close to 100% in most regions by 2025; over 700 million domains served by Let’s Encrypt as of October 2025.

About me

I am James Dreben. I studied computer science and machine learning at Harvard. In 2017, my senior year, I took Cynthia Dwork’s graduate seminar on cryptography and privacy, and wrote a final paper on a public-data version of differentially private mobility modeling. I’ve spent the years since as a software engineer across AI, site reliability, and full-stack web work. I am re-engaging with differential privacy after about a decade away from the field, so a lot of this work is me catching up on what shipped while I was busy elsewhere.

The framing matters more to me after the generative-AI boom than it did when I first encountered it. Powerful data-driven systems are now routine, and the default deployment story is uncomfortable: ship the system as a black box and accept that it will quietly leak details about the people in its training data, or refuse to ship it and forgo the capability. Differential privacy is one of the few defenses that comes with a quantitative bound on what an outsider can learn from a release. dpdap is an attempt to make that bound checkable outside the building, rather than something the publisher asks you to take on faith. That is the whole project.

Make privacy promises you can actually check.