What is a data management plan?
A data management plan (DMP) is a written document outlining how you are planning to manage your research data both during and after your research project. The plan should address what types of data will be collected and how the data will be documented, stored, shared and preserved.
Why create a data management plan?
Data management plans ensure a project’s research data is created, managed, documented, shared, and preserved in a way that enables easy verification and reuse. They set out a roadmap for your data from planning to preservation, providing the backbone of good research data practices.
Most funders and universities require you to create a plan, and so does the University of Surrey Research Data Policy (PDF). Your funder may have specific guidance or templates. You can include data management plans in your ethics applications and PhD confirmation documents, too.
How to write a data management plan
Depending on the project, plans can be very simple (a page or less) or highly detailed (multiple pages).
If you are writing a plan for a funding bid, you can budget costs to help improve the management, sharing, and preservation of research data. This could include staff time, software, technology, and resources to make your data more open. Check out this data management costing tool and checklist (PDF).
Below we provide guidance on the main topics your plan should address, introduce a helpful tool, and outline how we can help.
What types of files are you creating in your project? Will you be transforming the data for analysis? Outlining all the types, sources, and estimated size of data in your project will help you identify potential issues relating to storage, sharing, and preservation:
- List the characteristics of the data to be collected (e.g. quantitative, text, audio, video, code, etc.)
- Include the file formats/software and if they are open or proprietary. List relevant physical formats like lab notebooks here, too
- Consider outlining the file types you’ll be creating or transforming during collection and analysis
- Include the anticipated size of data, especially if it will require additional resources beyond the standard cloud storage allocation (currently 1 TB).
For more on file management see managing your data.
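To estimate the types and size of data for the points above, a short script can summarise an existing or pilot data folder. The following is a minimal sketch, not a University tool: the function names are illustrative, and it assumes that file extensions are a reasonable proxy for file formats and that storage is measured in decimal units (1 TB = 1000^4 bytes).

```python
from collections import defaultdict
from pathlib import Path


def summarise_data_folder(root):
    """Return {extension: (file_count, total_bytes)} for every file under root."""
    totals = defaultdict(lambda: [0, 0])
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group by lower-cased extension so ".CSV" and ".csv" count together.
            ext = path.suffix.lower() or "(no extension)"
            totals[ext][0] += 1
            totals[ext][1] += path.stat().st_size
    return {ext: (count, size) for ext, (count, size) in totals.items()}


def format_size(num_bytes):
    """Format a byte count using decimal units (assumption: 1 TB = 1000**4 bytes)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"
```

Calling `summarise_data_folder("pilot_data")` on a folder of your own (the folder name here is a placeholder) gives per-format counts and sizes that you can scale up to the full project and compare against your storage allocation.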
Good file organisation and documentation is good scholarship. Anyone should be able to understand your project, data collection, analysis, and files just by looking at your documentation. Remember, you may want to or be expected to share your data, and someone may want to verify, replicate, or reuse your data.
Describe the documentation and quality assurance strategies for each type of data during collection and analysis. Consider using a file naming convention and built-in documentation capabilities, such as comments in code scripts.
- Outline what documentation you will create
- Describe workflows for systematic capture of study information
- Think about how you will add, update, and maintain the data and documentation
- Decide how you will track multiple files or versions
- Choose how non-digital documentation will be handled
- Establish whether there is a relevant disciplinary standard for documentation and metadata* you could use
- Consider what documentation will be needed for shared/preserved data
- Consider creating a README document for shared/preserved data you’ll use during collection and analysis.
*’Metadata’ sometimes refers to a specific type of disciplinary metadata, which is a community agreed upon specification for structuring data and documentation. Digital Curation Centre has a list of disciplinary metadata plans.
For more on documentation, see managing your data.
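A file naming convention is easier to follow when names are generated rather than typed by hand. The sketch below shows one possible convention (project, description, ISO date, zero-padded version); the element order and the function name are assumptions for illustration, not a University or disciplinary standard.

```python
import re
from datetime import date


def build_file_name(project, description, collected_on, version, extension):
    """Build a name such as 'surveyx_interview-notes_2024-03-05_v02.txt'.

    The element order used here is one possible convention, not a required one.
    """
    def slug(text):
        # Lower-case the text and replace runs of other characters with hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".").lower()
    return (
        f"{slug(project)}_{slug(description)}_"
        f"{collected_on.isoformat()}_v{version:02d}.{ext}"
    )


print(build_file_name("SurveyX", "Interview notes", date(2024, 3, 5), 2, ".TXT"))
# prints: surveyx_interview-notes_2024-03-05_v02.txt
```

Putting the date in ISO format (year-month-day) and padding the version number means files sort chronologically and by version in an ordinary file browser, which helps when tracking multiple versions.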
- Describe where you will store your data during collection and analysis
- Be specific about the journey your data will take
- Describe how you will keep your data safe from accidental loss and unauthorised access
- Decide if you will transfer data from a collection tool to do your analysis, e.g. voice recorder, field measurements, or online survey
- How and when will you do this? Every week? After data collection ends?
In almost all cases, research data should be kept on University storage. Avoid using local hard drives, portable storage devices, laptops, and tablets for storage to reduce the risk of accidental loss. Do not use third-party storage like Dropbox, Google Drive, etc. They offer less protection and are less secure than University storage. If your project has special requirements like high performance computing, highly sensitive data, or commercially owned data then consult IT Services, Ethics, or your sponsor for an appropriate set-up.
- Outline where you will store the data at every stage of collection and analysis
- Describe how and when you will transfer data if necessary, including deleting data off collection tools/storage
- Identify any ethical, legal or commercial issues with your data, e.g. identifiable data, copyrighted materials, patents, etc. How will you protect the data? (This could include transforming, de-identifying, or anonymising the data.)
- Identify who will have access to the data. How will collaborators have access to the data?
- Identify any special storage or computing requirements you may have
- Describe how you will securely store and maintain any non-digital data.
For more on storage, see managing your data.
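When moving data from a collection tool to its storage location, it helps to confirm that the copy is intact before deleting the original. The following is a minimal sketch using SHA-256 checksums; the function names and paths are hypothetical, and it assumes the destination is a folder you can already write to (for example, a mounted University drive). Note that a plain delete does not securely erase a device, so sensitive data may need further steps agreed with IT Services.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_file(source, destination_dir, delete_source=False):
    """Copy a file, verify the copy by checksum, then optionally delete the original."""
    source = Path(source)
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    if target.exists():
        # Refuse to overwrite so an earlier transfer is never silently replaced.
        raise FileExistsError(f"{target} already exists; not overwriting")
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Checksum mismatch after copying {source}")
    if delete_source:
        # Only reached when the checksums match.
        source.unlink()
    return target
```

A call such as `transfer_file("recorder/interview01.wav", "project_store/raw", delete_source=True)` (both paths are placeholders) removes the original only after the copy has been verified.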
Data sharing for verification and reuse is an increasingly important marker of academic integrity. Researchers are encouraged to make their data as open as possible. Your plan should identify what data will or will not be shared from the project. For data that can’t be shared, you should include a justification for why not.
- Make sure your consent forms don’t prohibit sharing/retention, and even better, ensure that they mention that de-identified data will be shared in an open repository
- Outline what parts of your data can and cannot be shared
- Describe and justify any restrictions or terms of access (restricted, NDA, etc.)
- When will the data be released?
- Is there non-digital data that needs to be made available? How will people request access (e.g. a publicly discoverable metadata record)?
- If the data cannot be shared, explain why (e.g. don’t own, national security, copyrighted)
- Will you transform the data? (e.g. de-identify or convert to an open format)
- Identify how you will share your data, such as depositing in a repository
- Consider applying a Creative Commons license to your shared data or code
- Check out the FAIR principles of data sharing.
Best practice is to deposit the data into a data repository. Repositories provide the best visibility, tracking, and safe keeping for your data:
- Identify a suitable repository. Consider a discipline specific repository that is most appropriate for your data. Check out PLOS’ list of recommended repositories or Scientific Data’s recommended data repositories
- You can also use the University’s Open Research repository.
For more information on sharing your data, please see open data.
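Transforming data before sharing often starts with removing direct identifiers. The sketch below drops named columns from tabular records and replaces a participant ID with a keyed hash. It is illustrative only: the field names and function name are hypothetical, and this is pseudonymisation rather than full anonymisation, since indirect identifiers (such as rare combinations of age and location) may remain. Check your approach against your ethics approval before sharing.

```python
import hashlib
import hmac


def pseudonymise_rows(rows, id_field, drop_fields, secret_key):
    """Yield copies of rows without drop_fields and with id_field replaced by a keyed hash.

    secret_key is a bytes value that must be stored securely and never shared
    with the data; without it the pseudonyms cannot be regenerated.
    """
    drop = set(drop_fields) - {id_field}
    for row in rows:
        cleaned = {key: value for key, value in row.items() if key not in drop}
        digest = hmac.new(secret_key, row[id_field].encode("utf-8"), hashlib.sha256)
        # The same ID always maps to the same 12-character pseudonym,
        # so records from one participant stay linked.
        cleaned[id_field] = digest.hexdigest()[:12]
        yield cleaned


# Hypothetical example records; real data would typically come from csv.DictReader.
records = [
    {"participant_id": "P001", "name": "Example Person", "score": "5"},
    {"participant_id": "P001", "name": "Example Person", "score": "7"},
]
shared = list(pseudonymise_rows(records, "participant_id", ["name"], b"replace-with-a-secret"))
```

Using a keyed hash rather than a plain hash means someone who knows the ID format cannot simply hash candidate IDs to reverse the pseudonyms, provided the key is kept private.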
As open as possible, as closed as necessary
For those new to data sharing there can be a dizzying number of things to consider. Remember that the what, where, when, how, and why of data sharing should be motivated by two goals: verification of your findings and reuse of your data.
Sharing data isn’t all or nothing. You can still engage in a culture of openness and transparency while appropriately protecting your data:
- Releasing some data publicly and restricting access to other parts
- Transforming the data to share it more openly
- Restricting access to bona fide researchers or on a case-by-case basis
- Outlining terms of access and/or applying a copyright licence
- Only allowing access for verification of findings and subject to a non-disclosure agreement
- Creating a public metadata record outlining what data is held and why it cannot be shared
- When sharing your data, send a link to email@example.com. We will create an official university record for your data.
For more on sharing, see open data.
Retaining data is an important part of the academic process. Even if data cannot be shared now, it may still have important historical value for future researchers, especially for non-replicable observational data. You should identify what data will be retained, where it will be stored, and who will oversee its safe keeping.
Consider the long-term viability of your file formats. Open formats (e.g. rich text or CSV files) are more likely to survive over time than proprietary formats (e.g. Word doc or Excel files). If your data is small enough, you may want to preserve a copy of your data in its original format alongside an open format. Be sure to include adequate documentation alongside preserved data.
- Identify which data should be preserved. This should be anything that underpins the conclusions of your project and any published works
- Identify what documentation you will include with the data to facilitate verification/reuse
- Consider transforming your data to an open format for preservation
- Identify where the data will be preserved, who will preserve it, and for how long
- If you shared your data in a repository, it may have a preservation policy you can link/refer to.
For more on preservation, see open data.
Writing your first data management plan can be daunting, but if you continue in academia it will be the first of many! And there’s help available if you get stuck (see below).
There are some real benefits to writing a plan. It provides a more structured way to think through your research process and your project’s outputs. It also gives you an opportunity to discover Open Research practices in your field and to chat to your supervisor about professional expectations. You should check the following:
- Documentation: Are there any community standards for creating, documenting, and analysing data? This can facilitate verification and reuse
- Storage: Check with your supervisor or sponsor if your data has any special storage or handling requirements
- Permissions: You may need to secure permission to share your data. Be sure to chat to your supervisor before sharing any data
- Sharing: Are there any best practices for research data in your field? For example, is data usually shared alongside a publication or in a specific repository?
- Preservation: Discuss what happens to your data when you’re done. Will you be depositing it or handing it back to the supervisor or sponsor? Something else?
Training on data management plans and research data management is available through the Doctoral College.
DMPOnline is an online tool designed to help researchers write their data management plans. It has templates for all the major funders and specific guidance based on funders’ policies. If your funder hasn’t provided a template, or there is no funder, use the generic template, which also has plenty of advice.
If you’d like to use DMPOnline, all you need to do is:
- Sign in to DMPOnline using your University of Surrey credentials
- Click on 'create plan' and select your funder on the dropdown (if your funder isn’t listed, or you don’t have one, tick “no funder” and use the generic template).
Once you’ve set up your plan, you can share it with collaborators to edit or view. When it’s finished, you can export it to use in your grant application, confirmation, or for your records.
How we can help
If you have any questions about data management plans, your funder’s requirements, etc. please get in touch with us at firstname.lastname@example.org.
If you are submitting a plan as part of a funding bid, we can provide you feedback on your plan. Please allow ten working days for our review. Send your plan to email@example.com.