Researchers may need to share data collected in the field at different stages of a project’s lifecycle and in different ways, including: informal sharing of datasets among a closed network within an organization; sharing with partners and collaborators from other organizations; and formal sharing, for example, by signing data use agreements or by publishing datasets in a recognized open access repository.
Given increasingly strict regulations on data protection and privacy, and penalties for violating them, we discourage informal data sharing. Researchers should exercise caution when sharing data, especially when they contain identifying or sensitive personal information. We recommend the following best practices for sharing data.
1. Sharing openly
Publishing datasets with accompanying metadata and documentation through the repository is the preferred method of data sharing. Sharing data through the repository allows researchers to set access conditions, terms of use, and citation requirements for data users. Different levels of open licenses can be adopted for the datasets. IFPRI uses Creative Commons Attribution 4.0 International (CC-BY 4.0) whenever possible.
2. Sharing, but not immediately
When data cannot be shared immediately for whatever reason (for example, when more time is needed for further research), sharing should be done through the repository. In such cases, the metadata for the dataset will be openly accessible, but the actual data files will be made available only after a certain period of time. This method of sharing allows researchers to set access conditions, terms of use, and citation requirements, and gives a permanent identifier to the dataset that can be used for citations in research papers and in some cases for complying with donor requirements.
3. Sharing, but not openly
Data cannot always be shared openly. There are a variety of reasons for this—for example, if there is personally identifying or sensitive information contained in the data, if the dataset is being considered for a patent, or if raw data are not fully cleaned. Researchers may need to share data with collaborators or partner organizations before making them publicly available. In such cases, data use agreements are the best tool for protecting the researchers and the data. Data use agreements must be made when sharing data containing identifying or sensitive information.
There are two types of data use agreements: nonconfidential and confidential.
- Nonconfidential data use agreements are used when there is no personally identifying or sensitive information in the data to be shared. Such agreements help safeguard research ideas by setting certain terms and limits on the use of the data. These agreements are also useful when more time is needed to do consistency or quality checks on raw data received from the field.
- Confidential data use agreements, also called nondisclosure data use agreements, are mostly used when sharing data that contain personally identifying or sensitive information. Such agreements lay out the requirements for the storage, security, and handling of datasets. We recommend always putting such an agreement in place before sharing datasets containing personally identifying or sensitive information.
IFPRI has standard templates for nonconfidential and confidential data use agreements (for IFPRI staff only). IFPRI staff should develop data use agreements based on these templates.