dpdap — receipts for differential privacy

The 60‑second version

You read in the news that “1 in 5 people in your county has diabetes.” That number probably came from a public release. The agency that published it likely added a tiny amount of statistical noise — the true count might have been 18,213 cases and the published number 18,247 — so no individual person could be singled out from the data. That’s differential privacy.

Today, you have no way to check whether the protection was applied correctly, or how strong it was. The institution publishes a number; you trust them.

dpdap attaches a small, signed “privacy receipt” to every release, recording who published it, what protection was used, and how strong it was. Anyone can verify the signature in seconds. For releases that publish a fresh draw every reporting interval (streaming telemetry, daily-refresh statistics), the receipt’s noise claim can also be tested against the data. dpdap reads disclosures written in the OpenDP Deployment Registry schema. The registry and its companion open-source library are maintained by Harvard’s OpenDP project, and the library is used by the IRS, the Wikimedia Foundation, and UNHCR. dpdap also aligns with the federal evaluation guidance in NIST SP 800-226.^[1]

What changes, in one picture

[Figure with two panels: “Today” and “With dpdap”.]

Differential privacy in 90 seconds

Differential privacy was formalized in 2006 by Cynthia Dwork and collaborators.^[2] The idea is direct: when you publish a statistic, add a small amount of carefully calibrated random noise to it. The noise comes from a known distribution, typically Laplace or Gaussian, both symmetric curves centered on zero, so the published count might be a few units higher or lower than the true count, with the size of the wobble pinned to a known number. Done right, the noise is enough to hide any one person’s contribution while leaving the overall pattern intact.
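To make the mechanics concrete, here is a minimal Python sketch of adding Laplace noise to a count. It assumes the standard calibration from the Dwork, McSherry, Nissim, and Smith paper, where the noise scale is the query's sensitivity divided by ε. The function names and the fixed seed are illustrative only and are not part of dpdap.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a zero-centered Laplace distribution.

    The difference of two independent exponential draws, each with mean
    `scale`, is Laplace-distributed with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float, sensitivity: float,
                rng: random.Random) -> float:
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(2006)  # fixed seed so the sketch is repeatable
released = noisy_count(18_213, epsilon=1.0, sensitivity=1.0, rng=rng)
print(f"published count: {released:.0f}")
```

With ε = 1 and sensitivity 1 the noise scale is 1, so most draws land within a few units of the true count. Halving ε doubles the scale.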

The strength of the protection is measured by a small Greek letter, ε (epsilon). Smaller ε means more noise and stronger privacy; larger ε means less noise and weaker privacy. The total ε available for a dataset is called a privacy budget because every query an institution publishes from that dataset spends some of it; once the total is exhausted the institution must stop publishing or accept that further releases erode the guarantee. There is no universal “right” value, but the literature has rough reference points: academic work that calls a release “strongly private” usually means ε ≤ 1; mainstream production deployments sit somewhere between ε ≈ 1 and ε ≈ 10; the 2020 U.S. Census release went above that range, at ε ≈ 19.61, which several academics criticized as too weak.^[3] Apple reports single-digit per-event ε for its on-device keyboard analytics, with per-event costs accumulated into a per-day budget.^[4]
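The budget idea can be sketched as a small ledger. This version assumes basic sequential composition, where the ε values of successive releases simply add up; production accountants often use tighter composition results. The class and method names are hypothetical, chosen for illustration.

```python
class PrivacyBudget:
    """Tracks epsilon spent under basic sequential composition."""

    def __init__(self, total_epsilon: float) -> None:
        self.total = total_epsilon
        self.spent = 0.0

    def remaining(self) -> float:
        return self.total - self.spent

    def spend(self, epsilon: float) -> None:
        """Record one release, refusing it if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining() + 1e-12:  # small tolerance for float rounding
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


budget = PrivacyBudget(total_epsilon=1.0)
budget.spend(0.5)   # first release
budget.spend(0.25)  # second release
print(f"remaining epsilon: {budget.remaining()}")
```

A third release at ε = 0.5 would be refused here, which is the "stop publishing" branch described above.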

Differential privacy is rigorous in a way that older approaches — like “remove the names and date of birth” — are not. It comes with a mathematical proof, not a hopeful intuition. That proof is the thing dpdap is built to let you verify.

How dpdap actually checks the math

The premise is straightforward. An institution’s release arrives accompanied by a signed receipt that says, for example, “I added Laplace noise calibrated to ε = 1.0.” dpdap’s job is to check whether the noise actually applied matches that claim.
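As a rough illustration of what such a claim pins down, the sketch below models a receipt as a small record and derives the Laplace scale it implies. The field names are hypothetical and do not reproduce dpdap's receipt format or the OpenDP Deployment Registry schema. The signing step is omitted, since Ed25519 is not in the Python standard library; the sketch only shows the deterministic bytes a signature would cover.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Receipt:
    # Hypothetical fields, for illustration only.
    publisher: str
    mechanism: str
    epsilon: float
    sensitivity: float

    def canonical_bytes(self) -> bytes:
        """Serialize with sorted keys and no whitespace so that every
        verifier sees identical bytes for the same receipt."""
        text = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return text.encode("utf-8")

    def claimed_scale(self) -> float:
        """Laplace scale implied by the claim: sensitivity / epsilon."""
        return self.sensitivity / self.epsilon


receipt = Receipt("Example Agency", "laplace", epsilon=1.0, sensitivity=1.0)
print(receipt.canonical_bytes().decode("utf-8"))
print("claimed noise scale:", receipt.claimed_scale())
```

The claimed scale is the number a conformance check compares the observed noise against.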

It does this by running the underlying release process many independent times (in synthetic mode, against either the real aggregator or a controlled simulation), collecting the noise samples, and running standard statistical tests:

Mean check. Is the noise centered around zero, as the claimed mechanism predicts? Catches gross bias — a one-sided RNG, a missing sign flip.
Variance check. Is the noise spread the right amount? Catches scale errors — an off-by-two sensitivity bound, the wrong epsilon plugged into the formula.
Distribution-shape check (Kolmogorov–Smirnov). Does the shape of the noise distribution match the claim? This is the strongest of the three; it catches “wrong distribution” failures — for instance, an aggregator that draws from a Gaussian while advertising Laplace.
Empirical-ε estimate. Using the noise samples themselves, dpdap fits the noise scale by maximum likelihood and reports an estimated ε with a 95% confidence interval. If that interval sits well above the claimed ε, the aggregator is adding less noise than disclosed.
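The four checks above can be sketched in a few lines of standard-library Python for the Laplace case. This is an illustration of the statistics, not dpdap's implementation: the function names are hypothetical, the variance check is shown as a plain ratio rather than a formal test, and the confidence interval uses the large-sample Wald approximation, in which the standard error of the ε estimate is ε/√n.

```python
import math
import random


def sample_laplace(scale, n, rng):
    """Laplace(0, scale) draws as the difference of two exponentials."""
    return [rng.expovariate(1 / scale) - rng.expovariate(1 / scale) for _ in range(n)]


def laplace_cdf(x, scale):
    return 0.5 * math.exp(x / scale) if x < 0 else 1.0 - 0.5 * math.exp(-x / scale)


def mean_z(samples, scale):
    """Mean check: z-score of the sample mean (Laplace variance is 2 * scale**2)."""
    n = len(samples)
    return (sum(samples) / n) / (math.sqrt(2.0 / n) * scale)


def variance_ratio(samples, scale):
    """Variance check: observed second moment over the claimed variance."""
    return (sum(x * x for x in samples) / len(samples)) / (2.0 * scale ** 2)


def ks_distance(samples, scale):
    """Shape check: largest gap between the empirical and claimed CDFs."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = laplace_cdf(x, scale)
        gap = max(gap, (i + 1) / n - f, f - i / n)
    return gap


def empirical_epsilon(samples, sensitivity=1.0):
    """MLE of the Laplace scale is mean |x|; returns (estimate, low, high)
    with an approximate 95% Wald interval on epsilon."""
    n = len(samples)
    eps_hat = sensitivity / (sum(abs(x) for x in samples) / n)
    half_width = 1.96 * eps_hat / math.sqrt(n)
    return eps_hat, eps_hat - half_width, eps_hat + half_width


rng = random.Random(17)
claimed_scale = 1.0                             # receipt: epsilon = 1.0, sensitivity 1
honest = sample_laplace(1.0, 4000, rng)         # noise matches the claim
under_noised = sample_laplace(0.5, 4000, rng)   # half the disclosed noise

for label, data in (("honest", honest), ("under-noised", under_noised)):
    print(label, round(ks_distance(data, claimed_scale), 4), empirical_epsilon(data))
```

For large n the KS distance is usually compared against roughly 1.36/√n at the 5% level. In the under-noised run the whole ε interval sits above the claimed 1.0, which is the pattern the empirical-ε estimate is meant to expose.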

Every probe run also publishes a sample-adequacy note: the smallest noise-scale drift the run could have caught with 80% power at this sample size. If the test could only have caught a 2× under-noising and the data came back consistent with the claim, the report says so — no false confidence about what the data resolution actually allows.
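One way to arrive at such a note, sketched under stated assumptions: for Laplace noise the scale estimate has a relative standard error of about 1/√n, so on a log scale the smallest detectable drift at a two-sided 5% level and 80% power is roughly (z₀.₉₇₅ + z₀.₈₀)/√n. This is a textbook normal approximation, not necessarily the calculation dpdap performs, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def min_detectable_ratio(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest multiplicative drift in the noise scale that a run of n
    samples would flag with the given power (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.exp(z / math.sqrt(n))


def samples_needed(ratio: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Inverse question: how many samples to catch a drift of `ratio`."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil((z / math.log(ratio)) ** 2)


ratio_400 = min_detectable_ratio(400)  # roughly 1.15 for a 400-sample run
print(f"400 samples resolve a scale drift of about {ratio_400:.2f}x")
print("samples needed to catch a 1.1x drift:", samples_needed(1.1))
```

The approximation is rough for very small n, so it is best read as an order-of-magnitude guide to what a run could and could not have caught.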

What the probe tests is a parametric claim: “this output is a draw from a known noise distribution at a stated scale.” The probe does not certify that the underlying mechanism is differentially private. It tests whether the data looks like the declaration. Anything more would be an overstatement of what a black-box check can do.

The probe never reports “Pass.” It reports Inconclusive (the data is consistent with the claim), Failed (the data is not), or Skipped (an output is declared as a public invariant, or a post-processing step has altered the noise distribution so the tests don’t apply). This is not pedantic: a probe can falsify a privacy claim, but cannot prove one. Anyone telling you their tool proves DP correctness is selling you something.
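The three-outcome rule is easy to encode so that a "Pass" cannot be produced by accident. The sketch below is illustrative only: the names and the 0.01 significance threshold are assumptions, not dpdap's actual types or defaults.

```python
from enum import Enum
from typing import Optional


class Verdict(Enum):
    """Deliberately has no PASS member: a probe can falsify, not prove."""
    INCONCLUSIVE = "Inconclusive"
    FAILED = "Failed"
    SKIPPED = "Skipped"


def decide(p_value: Optional[float], alpha: float = 0.01) -> Verdict:
    """Map a test result to a verdict.

    p_value is None when the tests do not apply, for example a declared
    public invariant or post-processed output.
    """
    if p_value is None:
        return Verdict.SKIPPED
    if p_value < alpha:
        return Verdict.FAILED
    return Verdict.INCONCLUSIVE


print(decide(0.42).value)    # data consistent with the claim
print(decide(0.0001).value)  # claim falsified
print(decide(None).value)    # tests do not apply
```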

In practice, the probe is naturally suited to streaming aggregates and telemetry — DAP/Prio3 pipelines, daily-refresh sketches, anything that publishes a fresh draw every reporting interval. For one-shot statistical releases such as a decennial census file, only the signed receipt applies; the probe needs samples the release does not provide.

The probe in action

400 batches of synthetic reports run through dpdap’s probe — the same routine an auditor would run against a live release.

The aggregator does what it claims: residuals track the Laplace curve, the empirical-ε estimate brackets the claim. The probe returns Inconclusive — the data is consistent with the claim.

The aggregator claims Laplace noise and adds none at all: every residual lands on zero, KS distance pegs at 0.5. The probe returns Failed — the claim is falsified.
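The 0.5 figure in the no-noise case follows directly from the definition of the KS distance, whatever Laplace scale b is claimed. With every residual at zero, the empirical CDF is a single step at 0, while the claimed Laplace CDF passes through 1/2 there:

```latex
D_n = \sup_x \lvert F_n(x) - F(x) \rvert,
\qquad
F_n(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases},
\qquad
F(x) = \begin{cases} \tfrac12 e^{x/b}, & x < 0 \\ 1 - \tfrac12 e^{-x/b}, & x \ge 0 \end{cases}

\lvert F_n(x) - F(x) \rvert = \tfrac12 e^{-\lvert x \rvert / b} \le \tfrac12,
\quad \text{with equality at } x = 0
\;\Longrightarrow\; D_n = \tfrac12 .
```

The same argument holds for any claimed distribution that is continuous and symmetric about zero, including Gaussian.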

Why now

This stopped being academic a while ago. In the last few years:

The Census Bureau built its 2020 release around differential privacy and committed to using it for the next decade.
Wikipedia, the IRS, and the UN’s refugee agency are doing the same with the open-source OpenDP toolkit.
The EU and US have rolled out new rules that either reward or require provable privacy in data sharing: the EU Data Act (in force January 2024), the EU AI Act (in force August 2024), and the United States executive order on bulk sensitive personal data (EO 14117, February 2024).
Generative AI has made data leakage — the property of a system letting an outsider recover details about specific people in its training data, by querying the model — concrete and easy to demonstrate. Differential privacy is one of the few defenses that comes with a quantitative bound on how much a curious adversary can learn.

The math is being deployed. The receipts are missing.

The TLS analogy

In 1995, every website was http://. There was no lock icon. You had no way to know if your bank’s login page was actually your bank’s login page. People knew the encryption math worked — but it was invisible at the boundary where a normal person made a decision.

Today every site you visit shows you a small lock. You probably haven’t thought about it in years. That happened because TLS got standardized at the IETF, certificate authorities like Let’s Encrypt made the certificates free, and browsers wired the result into the address bar. The math, the protocol, and the user-visible signal all had to ship together.

Differential privacy is in 1995’s position. dpdap is the lock icon.

The vision

For publishers

A one-line addition to an existing publishing workflow that attaches a signed receipt to a release.

No pipeline rewrite. No new infrastructure.

For everyone else

A one-line check (or a click in your browser) that validates a published statistic against its receipt.

Same check, in a web page, for non-technical readers.

Who this is for

Civic agencies publishing population, health, transit, or education data who want to demonstrate — not just claim — that their releases protect individuals.
Journalists and researchers citing official statistics, who want to know the privacy strength behind the numbers they print.
Privacy regulators who currently have no machine-readable artifact to audit.
Software developers building usage analytics or industry benchmarks who want to count things across a population without storing per-user records they’d later have to defend — and want a number their users can verify rather than a privacy policy users have to take on faith.
Citizens who’d like to actually check, the same way you check the lock icon on a banking site.

Where this fits in the broader DP world

Differential privacy is in production. The U.S. Census Bureau built its 2020 release on it and committed to using it for the next decade. The Wikimedia Foundation publishes reader analytics under it. The IRS uses it for Statistics of Income releases. The UN refugee agency is piloting it on microdata. Apple, Google, Mozilla, and Cloudflare all use it for usage telemetry — the measurements an app or browser sends back to its maker about how it is being used. Most of these institutions build on the open-source OpenDP toolkit, which absorbed Tumult Analytics and Tumult Core in October 2025.^[5] The mathematics layer is solved.

Several active research and standards efforts are converging on the missing disclosure layer:

The OpenDP Deployment Registry (registry.opendp.org), launched September 2025, is the first live machine-readable index of real DP deployments. Its three-tier schema (Nanayakkara, Ghazi, Vadhan, arXiv 2509.13509) is what dpdap consumes. The paper explicitly flags in §7.1 that publication metadata alone cannot verify a privacy claim; the conformance check has to come from somewhere else. The probe is what comes from somewhere else.
A 2026 PETS paper, “We Need a Standard”: Toward an Expert-Informed Privacy Label for Differential Privacy (Dibia, Lu, Bhattacharjee, Near, Feng), proposes a human-readable privacy label aimed at end-user comprehension, with a working demo at privacylabel4dp.github.io.
The IETF Privacy Preserving Measurement working group is standardizing protocol-level DP aggregation in DAP, including extensions for binding releases to budget expenditure (draft-thomson-ppm-dap-dp-ext).
Recent academic work (PETS 2025) uses zero-knowledge proofs to make DP application cryptographically verifiable — heavyweight but rigorous, and complementary to empirical conformance probing.

dpdap is the signature, verification, and empirical conformance layer on top of these. The receipt format borrows from the disclosure-label work. The probe makes those claims testable. A consumer-side verifier — eventually compiled to WebAssembly — makes the testing accessible to anyone with a browser.

Under the hood

dpdap v0.2 is a Rust workspace of seven crates with 122 tests and continuous integration on Linux and macOS. Licensed Apache-2.0.

Conformance probe: KS test, mean z-test, variance χ², empirical-ε MLE with Wald CI for Laplace mechanisms, Gaussian σ estimator. A JanusAdapter speaks draft-ietf-ppm-dap-17 with HPKE and Prio3Sum, enabling probing against real DAP deployments over HTTP.
Receipt layer: Ed25519 signing and verification of OpenDP Deployment Registry disclosures (arXiv 2509.13509), with a second adapter for the Dibia privacy label (PETS 2026). CBOR and JSON encodings. Verifies in the CLI and in any browser via WebAssembly.
Python binding: PyO3/maturin wheel (abi3-py39): probe_mock(), verify_receipt(), generate_keypair(), sign_receipt().

Three draft IETF issue write-ups (underspecified receipt format, absent conformance-test guidance, budget-binding ambiguity in draft-thomson-ppm-dap-dp-ext) are ready to file with working test cases attached.

Sibling project: modelreceipt

dpdap is about differentially-private aggregate releases: public statistics, DAP-style measurements, census-like tables, and other scalar or tabular outputs where the receipt names a noise mechanism and the probe can test the distribution of repeated releases.

modelreceipt carries the same public-verifiability idea to DP model releases: DP-SGD trained LLMs, synthetic datasets sampled from those models, and downstream models trained on the synthetic data. The core relationship is the same — signed receipts plus empirical probes — but the technical surface is different: privacy units, accounting assumptions, model artifacts, canary audits, extraction probes, and synthetic-data composition.

Get in touch

If you cover privacy, AI, or civic technology — or you work on a release pipeline that publishes differentially private statistics and have an opinion about what a verifiable receipt should look like — I’d like to talk. The differential-privacy era is here, the field is still running on faith, and the design choices in front of us will outlast a lot of louder news cycles.

jamesdreben@gmail.com

Notes

[1] Joseph Near, David Darais, Naomi Lefkovitz, and Gary Howarth, Guidelines for Evaluating Differential Privacy Guarantees, NIST SP 800-226, March 2025. §2.7 describes auditing for data release systems as “running the algorithm being tested many times to determine if the distribution of results satisfy the differential privacy definition”, which is the operating principle of dpdap’s probe.
[2] Cynthia Dwork, Differential Privacy, ICALP 2006. The companion paper introducing the Laplace mechanism is Dwork, McSherry, Nissim, and Smith, Calibrating Noise to Sensitivity in Private Data Analysis, TCC 2006.
[3] The privacy-loss budget for the 2020 Decennial Census is documented by the U.S. Census Bureau’s Disclosure Avoidance publications. For the academic critique that ε ≈ 19.61 is too weak, see Hotz et al., Balancing data privacy and usability in the federal statistical system, PNAS 2022.
[4] Apple, Differential Privacy Overview. Per-event ε varies by feature (single-digit values for emoji and keyboard analytics), with per-event costs accumulated into a per-day budget.
[5] OpenDP, Welcoming Tumult Analytics and Tumult Core to OpenDP, October 2025. The open-source projects (Tumult Analytics and Tumult Core) joined OpenDP; the Tumult Labs commercial team announced separately that it was joining LinkedIn.

About me

I am James Dreben. I studied computer science and machine learning at Harvard. In 2017, my senior year, I took Professor Cynthia Dwork’s graduate seminar in cryptography and privacy — she co-invented differential privacy — and I wrote my final paper on a public-data version of differentially private mobility modeling. I’ve spent the years since as a software engineer across AI, site reliability, and full-stack web work.

The framing matters more after the generative-AI boom than it did when I first encountered it. Powerful data-driven systems are now routine, and the realistic options for deploying them have collapsed to two unattractive ones: ship the system as a black box and accept that it will quietly leak details about the people in its training data, or refuse to ship it and forgo the capability. Differential privacy is the third option. Receipts are what make the third option legible to everyone outside the building.

Make privacy promises you can actually check.