OPEN RESEARCH

Building Infrastructure for the Field

Alongside our commercial product work, Waymark Lab contributes to the broader public benefits research community through open infrastructure projects that serve the entire field.

EARLY CONCEPT — PENDING FUNDING

Synthetic Data Infrastructure for the Public Benefits Ecosystem

Meaningful AI research in public benefits delivery requires realistic case data, and that data is largely inaccessible to the research community. Real benefits case data is protected by federal SNAP confidentiality rules, HIPAA-aligned Medicaid privacy requirements, and state privacy laws. As a result, academic teams build AI tools on simplified datasets, those tools fail to generalize to real agencies, and the field lacks shared benchmarks for evaluating AI tools objectively.

This project proposes to build and openly release the foundational layer of a national multi-program synthetic case file generator spanning SNAP, Medicaid, and TANF, with a public extensibility framework that allows the research community to expand coverage over time. Each generated case represents a complete synthetic household with cross-program enrollment modeled for internal consistency. All deliverables will be released under permissive open licenses.
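
As a rough illustration of what cross-program consistency could mean in practice, the Python sketch below draws a household's size and income once and derives every program record from those shared facts. Everything in it is hypothetical: the class and function names, the uniform sampling, and the income limits are placeholders invented for this example. They are not the project's design and not actual SNAP, Medicaid, or TANF eligibility rules.

```python
import random
from dataclasses import dataclass, field
from typing import List

# Hypothetical income ceilings per household member, in dollars per month.
# Placeholders for illustration only; NOT real SNAP, Medicaid, or TANF rules.
ILLUSTRATIVE_LIMIT_PER_MEMBER = {"SNAP": 1600, "Medicaid": 1700, "TANF": 600}


@dataclass
class ProgramRecord:
    """One program's view of a household (hypothetical schema)."""
    program: str
    household_size: int
    monthly_income: int


@dataclass
class SyntheticHousehold:
    """Shared household facts plus the program records derived from them."""
    household_id: str
    size: int
    monthly_income: int
    records: List[ProgramRecord] = field(default_factory=list)


def generate_household(rng: random.Random, index: int) -> SyntheticHousehold:
    """Draw household facts once, then derive every program record from them."""
    size = rng.randint(1, 6)
    monthly_income = rng.randint(0, 4000)
    household = SyntheticHousehold(f"HH-{index:06d}", size, monthly_income)
    for program, limit in ILLUSTRATIVE_LIMIT_PER_MEMBER.items():
        # Enroll only when income falls under the placeholder ceiling.
        if monthly_income <= limit * size:
            household.records.append(ProgramRecord(program, size, monthly_income))
    return household


def is_consistent(household: SyntheticHousehold) -> bool:
    """Check that every program record agrees with the shared household facts."""
    return all(
        r.household_size == household.size
        and r.monthly_income == household.monthly_income
        for r in household.records
    )


rng = random.Random(42)  # fixed seed so the synthetic output is reproducible
households = [generate_household(rng, i) for i in range(100)]
```

Because each program record is copied from one set of household facts rather than sampled separately, a check such as is_consistent passes by construction. A real generator would presumably replace the uniform draws with distributions calibrated to public statistics.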

Multi-Program Coverage

Native generation for SNAP, Medicaid, and TANF with cross-program household consistency. Extensible framework for community-contributed programs.

Privacy by Design

No real client records used at any stage. Calibrated entirely from public federal statistics. Formal re-identification testing and privacy attestation built into the project plan.

Fully Open

Generator software, extensibility framework, baseline dataset, fidelity methodology, and benchmark suite all released under permissive open licenses.

Why This Matters

Although SNAP serves more than 28 million households and Medicaid covers over 90 million people, public benefits AI operates on research infrastructure that would be considered inadequate in any adjacent field. Medical AI and financial AI both have well-established public benchmarks and synthetic data initiatives. Public benefits AI does not. This project is designed to change that.

28M+
Households on SNAP nationally
90M+
People on Medicaid nationally

Collaboration

Interested in This Research?

We welcome collaboration with researchers, civic technology organizations, and government agencies working on public benefits AI.