Goal: Develop a system to enable cross-network discovery of potential collaborators and data sources and querying of those sources while enforcing each network's governance rules.

The growing adoption of distributed health data networks to facilitate large-scale evidence generation studies (e.g., comparative safety and effectiveness), as well as other public health activities, provides an opportunity to leverage those investments to create a national resource that enables a true Learning Health System. FDA, PCORI, NIH, ONC, CDC and others are supporting various forms of distributed health data networks. Together, these networking infrastructure investments can be integrated to support needs across funding agencies and the broader public health community.

The PopMedNet (PMN) software application currently enables the creation, operation, and governance of distributed health data networks. It supports distributed within-network querying for Sentinel, PCORnet, MDPHnet, HCSRN, the HMO Cancer Research Network, and the NIH Health Care Systems Research Collaboratory.

The Cross-Network Directory Service (CNDS) extends PMN’s existing functionality to enable cross-network discovery of potential collaborators and data sources, and querying of those sources, while enforcing each network's governance rules.

Goal: Implement a method within the existing PopMedNet open-source platform to enable automated, iterative, privacy-protecting, and scientifically accurate distributed regression analyses in a real-world setting.

We developed and validated PMN’s ability to perform distributed regression analyses (DRA) using only intermediate statistics and to produce regression coefficient estimates that are statistically equivalent to those from a pooled individual-level data analysis. Iterative regressions are performed locally, based on starting parameter values sent from a central analysis center. The analysis center combines the intermediate statistics from all local data sources to produce a new set of starting parameter values for the next round of local regression estimates. The process iterates until all local results converge to a single set of coefficient estimates. Trigger files indicate the start or end of each step in the process, which is automated by PopMedNet. The workflow is currently being piloted in the Sentinel System, a national medical product safety surveillance system. Successful implementation of distributed regression functionality in Sentinel will likely lead to adoption of DRA in other distributed data networks.
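
The iteration described above can be illustrated with a minimal sketch. The text does not state which regression model or optimizer is used, so this example assumes logistic regression fitted by Newton-Raphson, with each data partner returning only its gradient and Hessian contributions as the intermediate statistics. All function names and the sample data are hypothetical, and the trigger files and PopMedNet transport are not modeled.

```python
import math

def local_statistics(X, y, beta):
    """Runs at a data partner: returns summed gradient and Hessian
    contributions only; individual rows never leave the site."""
    p = len(beta)
    grad = [0.0] * p
    hess = [[0.0] * p for _ in range(p)]
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        mu = 1.0 / (1.0 + math.exp(-eta))   # predicted probability
        w = mu * (1.0 - mu)                 # logistic variance weight
        for j in range(p):
            grad[j] += (yi - mu) * row[j]
            for k in range(p):
                hess[j][k] += w * row[j] * row[k]
    return grad, hess

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def distributed_logistic(sites, n_params, tol=1e-10, max_iter=50):
    """Runs at the analysis center: send beta out, sum the returned
    statistics, take a Newton step, repeat until the step is tiny."""
    beta = [0.0] * n_params                 # starting parameter values
    for _ in range(max_iter):
        stats = [local_statistics(X, y, beta) for X, y in sites]
        grad = [sum(g[j] for g, _ in stats) for j in range(n_params)]
        hess = [[sum(h[j][k] for _, h in stats) for k in range(n_params)]
                for j in range(n_params)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical data: each row is [intercept, covariate].
site_a = ([[1, 0.5], [1, 1.5], [1, 2.0], [1, 3.0]], [0, 0, 1, 1])
site_b = ([[1, 1.0], [1, 2.5], [1, 0.2], [1, 3.5]], [1, 0, 0, 1])

distributed = distributed_logistic([site_a, site_b], n_params=2)
pooled = distributed_logistic(
    [(site_a[0] + site_b[0], site_a[1] + site_b[1])], n_params=2)
```

Because the gradient and Hessian of the log-likelihood are sums over rows, adding the per-site contributions gives the same Newton step as a pooled analysis, so `distributed` and `pooled` should agree up to floating-point rounding.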

Goal: Use electronic health data across distinct healthcare institutions to support the vision of a Learning Health System.

Distributed data networks (DDNs) typically use a common data model (CDM) to standardize data across institutions (data partners) to facilitate rapid distributed analyses while keeping data behind local firewalls. However, data partners often have varying governance and data infrastructures, which can lead to substantial heterogeneity in technical ecosystems across partners, including the use of different relational database management systems. To enable rapid distributed querying across these heterogeneous ecosystems, we developed a novel tool within the open-source PopMedNet platform that bridges the platform and software differences. This tool, referred to as Menu-Driven Query (MDQ), enables investigators to compose and distribute custom queries via a point-and-click user interface in PMN. MDQ does not require knowledge of Structured Query Language (SQL) to construct a query, and it automatically produces syntactically correct SQL for Oracle, Microsoft SQL Server, and PostgreSQL databases.
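
To illustrate the kind of dialect handling this involves, the sketch below renders one structured query specification as SQL for each of the three database systems. It is not MDQ's actual implementation: the function names, the query specification format, and the table and column names are all hypothetical. It covers only two well-known dialect differences, identifier quoting and row limiting, and inlines literal values for readability where production code would normally use bound parameters.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}
_DIALECTS = {"oracle", "sqlserver", "postgresql"}

def quote_ident(name, dialect):
    """Quote an identifier: brackets for SQL Server, double quotes for
    Oracle and PostgreSQL (where quoted names are case-sensitive)."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f"[{name}]" if dialect == "sqlserver" else f'"{name}"'

def literal(value):
    """Render a number as-is, or a string with single quotes doubled."""
    if isinstance(value, bool):
        raise TypeError("boolean literals are not handled in this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"

def build_query(table, columns, criteria, limit, dialect):
    """Turn menu selections (hypothetical format) into one SELECT.

    criteria is a list of (column, operator, value) tuples joined by AND.
    """
    if dialect not in _DIALECTS:
        raise ValueError(f"unsupported dialect: {dialect!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")

    cols = ", ".join(quote_ident(c, dialect) for c in columns)
    tbl = quote_ident(table, dialect)
    clauses = []
    for column, op, value in criteria:
        if op not in _OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        clauses.append(f"{quote_ident(column, dialect)} {op} {literal(value)}")

    # SQL Server limits rows with TOP, placed right after SELECT.
    if dialect == "sqlserver":
        sql = f"SELECT TOP ({limit}) {cols} FROM {tbl}"
    else:
        sql = f"SELECT {cols} FROM {tbl}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    # Oracle (12c and later) and PostgreSQL limit rows at the end.
    if dialect == "oracle":
        sql += f" FETCH FIRST {limit} ROWS ONLY"
    elif dialect == "postgresql":
        sql += f" LIMIT {limit}"
    return sql

# Hypothetical selection: up to 10 female patients aged 65 or older.
spec = dict(table="demographic", columns=["patid", "sex"],
            criteria=[("age", ">=", 65), ("sex", "=", "F")], limit=10)
for dialect in ("oracle", "sqlserver", "postgresql"):
    print(dialect, "->", build_query(dialect=dialect, **spec))
```

Keeping the query as structured data until the last step means the same selection can be sent to every data partner and rendered locally for whichever database that partner runs.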