Guiding Principles for Human Data Sharing and Use: Balancing Public and Scientific Values

In January 2023, the new NIH Data Management and Sharing (DMS) Policy came into effect. This policy applies to all research, funded or conducted in whole or in part by NIH, that results in the generation of scientific data. Scientific data to be shared include any data needed to validate and replicate research findings and must be of sufficient quality to do so.

DMS plans for projects involving human data must describe any applicable factors affecting subsequent access, distribution, or reuse of scientific data, including any informed consent or privacy/confidentiality limitations, and whether access to data will be controlled.

In 2018, in response to consent form requirements updated in the revised Common Rule, many research institutions broadened their informed consent template language to discuss more openly what data sharing is, the circumstances in which it may occur, and the controls that may be associated with sharing to best protect participant data. However, novel scientific techniques and tools (including artificial intelligence [AI] and machine learning), new expectations related to sharing (including the development of AI-ready datasets), and the changing nature of what may be considered identifiable data necessitate a shared understanding across research institutions of best practices for sharing both consented and unconsented data.

The increasing pressure to maximally broaden sharing and use to facilitate research comes at the same time as research participants, communities, and the public are reporting discomfort with broad consent, commercial access and use, and benefits that accrue only to researchers and companies. In general, members of the public are willing and often eager to participate in research, but they want to be asked, and they often articulate limits or safeguards necessary to maintain their trust. Increasingly, they also want value returned to themselves and their communities.

In this context, it is even more important that we be responsive to broadly held public values and that we develop models for data governance and sharing that facilitate cutting-edge research while respecting the values and interests of the people whose data make our research possible.

This consensus conference is the culmination of a project, funded by the Greenwall Foundation, designed to: conduct systematic data collection to develop a landscape of institutional responses to the 2023 NIH DMS Policy; synthesize existing research about public interests, needs, and values with regard to data collection, sharing, and use; and bring together scientists, regulatory experts, bioethics scholars, Indigenous scholars, librarians, IT professionals, and community members of IRBs to develop concrete, actionable consensus guidance for balancing scientific and public values in institutional and investigator responses to the increased demand for data sharing.