Research Data Lakes

RIPL solutions improve outcomes for those accessing government services. 

Our secure, cloud-based approach unlocks siloed data both for evidence-based policymaking and for delivering data-driven technology solutions. We design and develop replicable, science-based solutions across the unemployment, workforce development, and education sectors. Each of these solutions is live and being used by jobseekers and students in multiple states.


Imagine the endless benefits of breaking down data silos between state agencies.

The Research Data Lake (RDL) gives state governments the ability to organize disparate sources of data across numerous state agencies and policy domains. An RDL integrates and anonymizes data for 360-degree insights for strategic planning efforts, operational decision making, and policymaking.
By combining labor, education, health, and social services data, government is able to make informed, data-driven decisions and enact effective, cross-departmental policies.


Ensure your data are always protected.

RIPL deploys the RDL in a secure, cloud-based enclave, leveraging the security, scale, and cost-efficiency of cloud services. Its cloud infrastructure is FedRAMP-approved and designed to be FISMA, HIPAA, and FERPA compliant.

We have built in best-in-class firewalls, encryption, and regular, automatic auditing tools. The RDL is owned and managed by government, securing data in the government’s sole custody.  RDL owners control user access to their data and approve all data exports. 


Safeguard your customers’ identities in the data you collect.

The RDL hosts the data in a cloud-based environment and runs an automated pipeline that integrates and anonymizes it.
Automatic algorithms separate Personally Identifiable Information (PII) at data ingest and generate a single global ID that joins an individual’s records across multiple datasets. The PII is then permanently shredded, blocking RDL users from viewing, accessing, or using PII.
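To make the idea concrete, here is a minimal sketch of separating PII at ingest and attaching a single ID that links one person's records across datasets. It is an illustration only, not RIPL's actual method: the keyed hash (HMAC-SHA256), the `SECRET_KEY`, the `PII_FIELDS` list, and the sample rows are all assumptions made for this example, and real record linkage often needs fuzzier matching than exact normalization.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data owner (an assumption for this sketch).
SECRET_KEY = b"replace-with-a-securely-stored-key"

# Hypothetical set of fields treated as PII in this illustration.
PII_FIELDS = ("ssn", "first_name", "last_name", "date_of_birth")


def normalize(value: str) -> str:
    """Trim whitespace and lowercase so formatting differences do not change the ID."""
    return value.strip().lower()


def global_id(record: dict) -> str:
    """Derive a stable pseudonymous ID from the record's PII with a keyed hash."""
    material = "|".join(normalize(str(record.get(field, ""))) for field in PII_FIELDS)
    return hmac.new(SECRET_KEY, material.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict) -> dict:
    """Return a copy of the record with PII fields dropped and a global ID attached."""
    cleaned = {key: value for key, value in record.items() if key not in PII_FIELDS}
    cleaned["global_id"] = global_id(record)
    return cleaned


# Made-up rows from two hypothetical agency datasets describing the same person.
wage_row = {"ssn": "123-45-6789", "first_name": "Ada", "last_name": "Lovelace",
            "date_of_birth": "1990-01-01", "quarterly_wage": 12500}
school_row = {"ssn": "123-45-6789", "first_name": " ada ", "last_name": "LOVELACE",
              "date_of_birth": "1990-01-01", "enrolled": True}

wage_clean = deidentify(wage_row)
school_clean = deidentify(school_row)
```

In this sketch both cleaned rows carry the same `global_id` and no PII fields, so analysts could join them without ever seeing names or identifiers. Discarding the original rows after this step would correspond to the shredding described above.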


Begin producing results at your pace.

The RDL organizes and classifies de-identified data for research at the speed of policy. By transforming the data into standardized, usable formats, you are able to begin using the data immediately.  

A codebook is generated for each dataset, containing descriptive statistics that are automatically updated and monitored for data quality. These codebooks accelerate and support consistent, robust, and reliable analysis – taking science into production for quick, meaningful policy impact. 
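As a rough illustration of what an automatically generated codebook might contain, the sketch below summarizes each column of a small de-identified dataset. The function name `build_codebook`, the chosen statistics, and the sample rows are assumptions made for this example, not a description of the RDL's actual codebook format.

```python
import statistics


def build_codebook(rows: list[dict]) -> dict:
    """Summarize each column: row count, missing count, and simple statistics."""
    columns = sorted({key for row in rows for key in row})
    codebook = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [v for v in values if v is not None]
        entry = {"n": len(values), "missing": len(values) - len(present)}
        # Booleans are excluded so flags are not summarized as numbers.
        numeric = [v for v in present
                   if isinstance(v, (int, float)) and not isinstance(v, bool)]
        if present and len(numeric) == len(present):
            entry.update(type="numeric", min=min(numeric), max=max(numeric),
                         mean=statistics.mean(numeric))
        else:
            entry.update(type="categorical", distinct=len(set(present)))
        codebook[column] = entry
    return codebook


# Made-up, already de-identified rows for illustration.
rows = [
    {"global_id": "a1", "quarterly_wage": 12000, "industry": "retail"},
    {"global_id": "b2", "quarterly_wage": 15000, "industry": "health"},
    {"global_id": "c3", "quarterly_wage": None, "industry": "retail"},
]
codebook = build_codebook(rows)
```

Rerunning a summary like this after each data refresh and comparing it with the previous version is one simple way to monitor data quality, for example by flagging a sudden jump in a column's missing count.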


In Rhode Island, we developed an integrated database of administrative records from multiple agencies with over 800 tables and 2.7 billion records related to over 4 million anonymous individuals.

These data support econometric and machine-learning research into policies that promise to deliver higher impact per dollar and better serve individuals and families.

RIPL insights


Oversee and facilitate partnerships with those to whom you grant access to your data.

The RDL is owned and managed by government, empowering government to partner safely and securely with other agencies, internal and external researchers, and vendors. The RDL deployment comes with policies, best-practice manuals, and templates to help foster partnership. These materials include legal templates, a universal governance agreement template, data-sharing and data-use agreement templates, and more.



Harness the power of the RDL.

Policymakers and researchers are able to deploy policy application modules that analyze data and produce dashboards to display research insights. RIPL can implement off-the-shelf analyses to test the impact of policy interventions, as well as create custom policy applications to match your state’s particular needs. 

Government can respond in real time to its most challenging policy issues. For example, Rhode Island used the RDL to support the creation of a Pandemic Unemployment Assistance claims processing system.


Administrative data can provide new facts to guide policymakers. However, understanding the quality of administrative records and then integrating, transforming, and optimizing them for policy insights present many challenges.

We are experts in understanding and building databases for use by policymakers.


Administrative Data

Wage Records

Employer Records


Short-Term Disability


Higher Education

K-12 Records

Workforce Training

Adult Education


Social Benefit & Insurance


Child Welfare


Building Data Infrastructure

Data from multiple sources, siloed in different agencies and departments across government, can be collected and synthesized in the Research Data Lake.

Ingest and Structure Data for Insights

Research Data

Anonymized, Versioned Releases

Optimized for Research

Standardized Variables

Reproducible Results

The Secure Data Enclave

Once the Research Data Lake has been created, new data continue to flow into it, providing a resource that is optimized for real-time analysis and research.

Secure Cloud Architecture owned and managed by government

Data never leave government custody

Government has the option to shred PII once ingest is complete

Approved Access

Documented De-Identified Exports

“RIPL is helping us bring together agencies across Virginia with innovative solutions that leverage our data assets to improve policy now, while helping us build technology and data resources to sustain and expand data-driven policy going forward.”

Carlos Rivero

Chief Data Officer, Commonwealth of Virginia

“Through partnership, insights, and integrated data, RIPL gave our team the jump start we needed to improve how our state serves whole people, families and communities. RIPL’s ability to quickly gather, clean, connect, and curate data for analysis—and then apply machine learning in collaboration with our teams’ context—gave us a new sense of the possible for data-driven policy-making, academic partnerships, and continuous improvement. 

Because of this jump start and the credibility it helped us build through our first major integrated data project, we have now built the capacity to conduct our own integration and analysis in the future, expanding our ability to serve our communities and help families in need.”

Kimberly Paull

Director of Data and Analytics for the Rhode Island Executive Office of Health and Human Services