Research Data Lakes

Research Data Lake solutions drive measurable, improved policy impact.

We help policymakers access and use their administrative data to improve policy. Our secure, cloud-based Research Data Lake (RDL) solution unlocks siloed data to support evidence-based policy-making and deliver data-driven technology solutions such as web and mobile applications, nudges, email and text message campaigns, and more.

Our solutions put valuable information and public policy programs at users’ fingertips, engaging them and helping them succeed.


An RDL joins administrative data for 360-degree insights.

The RIPL RDL gives our partners the ability to securely store, organize, and combine disparate data from numerous agencies, departments, and third-party sources.

Our software integrates and anonymizes data for 360-degree insights to guide strategic planning efforts, drive operational decision making, and make policy measurably effective. By combining data across domains, such as workforce, education, health, and social services, our partners can make informed decisions and implement policies that alleviate poverty and increase economic opportunity.

Learn more about how RIPL builds these insights to solve pressing policy challenges here.


RDLs are built on secure cloud technology.

We have built in best-in-class firewalls, encryption, and regular, automatic auditing tools. The RDL is owned and managed by the government partner. Our partners own their account platform, control user access to their data, and approve any data exports. They have full control over and transparency into how their data are used for public policy good.

Learn here how the RIPL-developed RDL and secure cloud-based applications, such as the Rhode Island COVID-19 Pandemic Unemployment Assistance application, can be delivered quickly and efficiently to improve lives.


Data are anonymized for research insights.

Sensitive data are automatically anonymized. Algorithms separate Personally Identifiable Information (PII) at data ingest and generate a single global ID that joins an individual’s records across multiple datasets. The PII is then permanently shredded, allowing approved users to analyze data for important insights to improve public policy while preserving confidentiality.
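To make the idea concrete, here is a minimal Python sketch of separating PII at ingest and replacing it with a single global ID. It is an illustration only, not the RIPL pipeline: the field names, the `global_id` and `anonymize` helpers, and the choice of a keyed HMAC-SHA256 hash with exact matching on normalized fields are all assumptions that do not come from the description above.

```python
import hashlib
import hmac

# Hypothetical PII columns; the source does not list which fields are treated as PII.
PII_FIELDS = ("first_name", "last_name", "date_of_birth", "ssn")


def global_id(record: dict, secret_key: bytes) -> str:
    """Derive a stable pseudonymous ID from a record's normalized PII fields.

    Assumption: a keyed hash (HMAC-SHA256) over trimmed, lowercased values.
    """
    normalized = "|".join(str(record.get(f, "")).strip().lower() for f in PII_FIELDS)
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with PII fields dropped and a global ID attached."""
    clean = {k: v for k, v in record.items() if k not in PII_FIELDS}
    clean["global_id"] = global_id(record, secret_key)
    return clean


# Illustrative key and made-up rows from two hypothetical agency datasets.
key = b"demo-key-for-illustration-only"
wage_row = {
    "first_name": "Ada", "last_name": "Lovelace",
    "date_of_birth": "1815-12-10", "ssn": "000-00-0000",
    "quarterly_wage": 12500,
}
school_row = {
    "first_name": " ada", "last_name": "LOVELACE ",
    "date_of_birth": "1815-12-10", "ssn": "000-00-0000",
    "credential": "certificate",
}

wage_anon = anonymize(wage_row, key)
school_anon = anonymize(school_row, key)

# Both rows now carry the same global ID and no PII, so they can be joined safely.
print(wage_anon["global_id"] == school_anon["global_id"])
```

In this sketch the two de-identified rows can be joined on `global_id` even though the original names differed in case and spacing. A production system would also need secure key handling and a strategy for records whose identifying fields do not match exactly.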

Learn more about the RIPL-developed anonymization algorithms and code pipeline here.


Automate results, build analytic capacity, and manage data quality.

The RDL organizes and classifies de-identified data to support rapid and reliable policy insights. Because the data are transformed into standardized, usable formats, our partners can immediately begin using their data to gain insights and guide decisions.

Codebooks are generated automatically. They contain feature definitions and descriptive statistics, and are automatically updated and monitored for data quality. These codebooks accelerate and support consistent, robust, and reliable analysis – taking science into production for quick, meaningful policy impact.
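As a rough illustration of what an automatically generated codebook might contain, the Python sketch below pairs each feature's definition with simple descriptive statistics and a missing-value count that could serve as a basic data-quality signal. The `build_codebook` function, the column names, and the sample rows are hypothetical and are not taken from RIPL's implementation.

```python
import statistics


def build_codebook(rows, definitions):
    """Build one codebook entry per feature: definition plus descriptive statistics."""
    codebook = {}
    for name, definition in definitions.items():
        # Keep only non-missing values for this feature.
        values = [row[name] for row in rows if row.get(name) is not None]
        entry = {
            "definition": definition,
            "count": len(values),
            "missing": len(rows) - len(values),  # simple data-quality indicator
        }
        if values and all(isinstance(v, (int, float)) for v in values):
            # Numeric feature: report range and mean.
            entry.update(min=min(values), max=max(values), mean=statistics.mean(values))
        else:
            # Non-numeric feature: report the number of distinct values.
            entry["distinct"] = len(set(values))
        codebook[name] = entry
    return codebook


# Made-up de-identified rows and feature definitions, for illustration only.
rows = [
    {"global_id": "a1", "quarterly_wage": 12000, "industry": "retail"},
    {"global_id": "b2", "quarterly_wage": 18000, "industry": "health"},
    {"global_id": "c3", "quarterly_wage": None, "industry": "retail"},
]
definitions = {
    "quarterly_wage": "Wages paid in the quarter, in dollars (hypothetical feature)",
    "industry": "Employer industry category (hypothetical feature)",
}

codebook = build_codebook(rows, definitions)
for feature, entry in codebook.items():
    print(feature, entry)
```

Re-running a function like this whenever new data arrive, and comparing the results with the previous run, is one simple way a codebook could be kept current and monitored for shifts in data quality.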

Learn more about RIPL’s automated codebooks and derived tables here.


“RIPL is helping us bring together agencies across Virginia with innovative solutions that leverage our data assets to improve policy now, while helping us build technology and data resources to sustain and expand data-driven policy going forward.”

Carlos Rivero

Chief Data Officer, Commonwealth of Virginia


In Rhode Island, we developed an integrated database of administrative records from multiple agencies with over 800 tables and 2.7 billion records related to over 4 million anonymous individuals.

These data support econometric and machine-learning research into policies that promise to deliver higher impact per dollar and better serve individuals and families.

RIPL insights


Oversee and facilitate partnerships with those to whom you grant access to your data.

The RDL is owned and managed by government, empowering government to partner safely and securely with other agencies, internal and external researchers, and vendors. The RDL deployment comes with policies, best practice manuals, and templates to help foster partnership. These materials include legal templates, a universal governance agreement template, data-sharing and data-use agreement templates, and more.



Harness the power of the RDL.

Policymakers and researchers are able to deploy policy application modules that analyze data and produce dashboards to display research insights. RIPL can implement off-the-shelf analyses to test the impact of policy interventions, as well as create custom policy applications to match your state’s particular needs. 

Government can respond in real time to its most challenging policy issues. For example, Rhode Island used the RDL to support the creation of a Pandemic Unemployment Assistance claims processing system.


Administrative data can provide new facts to guide policymakers.  However, understanding the quality of administrative records, and integrating, transforming, and optimizing them for policy insights present many challenges.  

We are experts in understanding and building databases for use by policymakers.



Through partnership, insights, and integrated data, RIPL gave our team the jump start we needed to improve how our state serves whole people, families and communities. RIPL’s ability to quickly gather, clean, connect, and curate data for analysis—and then apply machine learning in collaboration with our teams’ context—gave us a new sense of the possible for data-driven policy-making, academic partnerships, and continuous improvement. 

Because of this jump start and the credibility it helped us build through our first major integrated data project, we have now built the capacity to conduct our own integration and analysis in the future, expanding our ability to serve our communities and help families in need.

Kimberly Paull

Director of Data and Analytics for the Rhode Island Executive Office of Health and Human Services
