Making data accessible
The majority of funding bodies view publicly funded research data as a public good produced in the public interest. UKRI (formerly RCUK), Wellcome, and Universities UK published the Concordat on Open Research Data (PDF), which describes 10 key principles for ensuring that your research data is “openly discoverable, accessible, intelligible, assessable, and usable” by others.
Making research data open and accessible brings significant benefits, both to the research community and the society which hosts and funds it. The Open Research team is here to support research data management as an essential building block of good research practice and a pathway to increasing the exposure and impact of our research. From planning to preservation, we help researchers make their data more open and transparent throughout the research lifecycle.
Create robust data management plans
A data management plan (DMP) ensures that a project’s research data is created, managed, documented, shared, and preserved in a way that enables easy replication and reuse. It is the roadmap from planning to preservation for the data produced by a project.
Read our Research Data Management Policy (PDF) to find out more about managing your research data.
We provide advice, training and guidance to ensure our researchers develop data management plans that address the following:
- What data will be collected/created? (Types, formats, size)
- What existing data can be reused?
- How will data be organised and documented for future use?
- How will data security be managed? (Storage, backup, access for collaborators, and access controls)
- Who will be responsible for each of these elements/activities?
- What data will be shared? Where and when will it be released?
- Who will have access to the data? Under what licence?
- What data will need to be preserved? Where and for how long?
- Is there any physical data that needs to be preserved? How will this be done?
- Does any equipment, staff time, or software need to be costed to help manage, share, or preserve the data?
Writing your data management plan
When you are ready to start writing your DMP, you can set one up using DMPOnline, a tool with templates for all the major funders and specific guidance based on funders’ policies. If your funder hasn’t provided a template, or there is no funder, use the generic template, which also has plenty of advice.
All you need to do is:
- Sign in to DMPOnline using your University of Surrey credentials
- Click on 'Create plan' and select your funder from the dropdown (if your funder isn’t listed, or you don’t have one, please use the generic template)
Once you’ve set up your plan, you can share it with collaborators. When it’s finished, you can export it to use in your grant application.
Things to consider
Not all research data needs to be shared. It is important to take the time to consider which and how much of your data will need to be preserved and made openly accessible. Funding bodies and the University agree that the best person to make this decision is you, the researcher, either on your own or together with your project group as appropriate.
You’ll need to keep any data that underpin your publications and any data which might be of use to other researchers in the future.
For more detailed guidance, see the Digital Curation Centre’s five steps to decide what data to keep.
In summary, these describe:
- Identifying purposes that the data can fulfil (e.g. verification, learning & teaching, etc.)
- Identifying data that must be kept (i.e. for policy or legal reasons)
- Identifying data that should be kept (i.e. because there is or could be a demand for the data, because it would be difficult or costly to replicate the data, or because it is unique or valuable in some other way)
- Identifying and weighing up the costs of keeping the data (long-term storage, curation, etc.)
- Completing your data appraisal (weighing the value of the data against the costs of preservation, including time and money already invested, and risks of loss, etc.; listing data to be preserved, reasons for not preserving data, and summarising actions)
If you have used data supplied by a third party or collaborator, there is no need to preserve or share it unless you have made significant changes to it. Simply cite the data creator and the location of the original data in your publications.
Be sure to include sufficient documentation alongside your data. You may want to consider transforming your data into more preservation-friendly open formats. For more information, see our guidance on how to make your data more open (below).
There are a number of things you can do to make your research data as open as possible: make sure it is understandable, findable, and reusable.
Make sure your data is understandable
Organise and describe your data
Good organisation and documentation are the best way to ensure that your data remains easily accessible and understandable for the duration of your project and beyond.
Think carefully about how to organise, name, and document your research at the start of a project. For tools and tips, see the Center for Open Science.
Develop a file naming convention and use a hierarchical file structure that keeps files together in a logical way based on data collection, analysis, and documentation.
- Make file names short, meaningful, and unique
- Use underscores rather than spaces between words, e.g. 20150210_Interview_EH
- Use dates in the format YYYYMMDD so files can be ordered chronologically
- Use agreed abbreviations only – and use them consistently
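- To allow machine sorting of personal names, put the family name first, followed by the initial or given name (familyname_initial or familyname_name)
- Use two-digit numbers, e.g. 01 to 99 (unless the value is a larger number or a date)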
- File names of records relating to recurring events should include the date and a description of the event
- Put the date of recurring event at the start of the file name to allow machine sorting
- Indicate the version number of a record in the file name by including ‘V’ followed by the version number and, where applicable, ‘Draft’ or ‘Final’. Use a decimal for minor revisions (V1, V1.1) and a whole number for major revisions (V1, V2)
- Consistency is key! Choose a style/naming convention that suits both you and the types of file you are keeping, and stick to it
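The conventions above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function names `build_file_name` and `numbered`, the parameter names, and the order of the name parts are hypothetical choices, not a University standard.

```python
import re
from datetime import date


def build_file_name(event_date, description, initials="", version=None,
                    status="", extension="txt"):
    """Build a name such as 20150210_Interview_EH_V1.1_Draft.txt.

    The order of the parts is an assumption made for this sketch.
    """
    # YYYYMMDD at the start lets files sort chronologically.
    parts = [event_date.strftime("%Y%m%d")]
    # Replace each run of whitespace with a single underscore.
    parts.append(re.sub(r"\s+", "_", description.strip()))
    if initials:
        parts.append(initials)
    if version is not None:
        parts.append(f"V{version}")
    if status:
        parts.append(status)  # e.g. "Draft" or "Final"
    return "_".join(parts) + "." + extension


def numbered(n, width=2):
    """Zero-pad a sequence number so that 02 sorts before 10."""
    return str(n).zfill(width)


if __name__ == "__main__":
    print(build_file_name(date(2015, 2, 10), "Interview", "EH"))
    print(build_file_name(date(2015, 2, 10), "Interview", "EH",
                          version="1.1", status="Draft"))
```

Generating names through one function, rather than typing them by hand, is one way to keep the convention consistent across a project team.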
Make sure your data is findable
Depositing your data into a publicly accessible data repository is the easiest way to make your data more discoverable. See our advice on getting your research discovered.
Create a data statement
A data statement describes how and on what terms your research data may be accessed. You will need to include one in your research papers.
What to include:
- Name(s) of data repositories, if used, along with the DOI
- Any access or licensing conditions/constraints
- Any legal or ethical reasons why data cannot be made available
Example data statements:
- Full access: ‘The authors confirm that all data underlying the findings are fully available without restriction. Details of the data and how to request access are available from the University of Surrey: [DOI link]’
- Non-disclosure agreement: ‘Owing to confidentiality agreements with research collaborators, supporting data can only be made available to bona fide researchers subject to a non-disclosure agreement. Details of the data and how to request access are available from the University of Surrey: [DOI link]’
- No consent from participants: ‘Owing to the [commercially, politically, ethically] sensitive nature of the research, no interviewees consented to their data being retained or shared. Additional details relating to other aspects of the data are available from the University of Surrey at [DOI link]’
- Dataset under commercial embargo: ‘Supporting data will be available from the University of Surrey at [DOI link] after a 6-month embargo from the date of publication to allow for commercialisation of research findings.’
- No new data: ‘No new data were created during this study.’
Make sure your data is reusable
Apply a licence to your data
If you apply a licence to your data, it will be easy for people to know how they can reuse it. We recommend using a CC BY licence to enable the broadest use of your data. See the Copyright and licences section.
Documentation and embedded metadata
Good documentation is essential for other researchers to understand and reuse your data. Ask yourself: “What information would I need to find, understand and use this data in twenty years?” That’s what your documentation should include.
We recommend that each of your datasets be accompanied by a ReadMe file which describes what data is included, how it was created, and how to understand and use the files and documentation.
Embedded metadata is a useful way of adding vital information about a file or dataset, within the file itself. Without it, you and your collaborators may struggle to interpret your data when you come back to it in the future.
Embedded metadata includes:
- Code, field, and label descriptions
- Descriptive headers and summaries
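As a minimal sketch of the ReadMe recommendation above, the following Python function assembles a plain-text ReadMe that covers what data is included, how it was created, how to use it, and a description of each field. The section headings, the function name `make_readme`, and the sample values are assumptions made for illustration; they are not a required template.

```python
def make_readme(title, contents, methods, usage, fields):
    """Return ReadMe text.

    `fields` maps each field, code, or label name to its description.
    """
    lines = [
        f"Title: {title}",
        "",
        "What data is included:",
        contents,
        "",
        "How the data was created:",
        methods,
        "",
        "How to understand and use the files:",
        usage,
        "",
        "Field descriptions:",
    ]
    for name, description in fields.items():
        lines.append(f"- {name}: {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(make_readme(
        title="Interview study",
        contents="12 anonymised interview transcripts",
        methods="Semi-structured interviews, transcribed by the project team",
        usage="Plain-text files; open in any text editor",
        fields={"pid": "Participant identifier", "date": "Interview date (YYYYMMDD)"},
    ))
```

Saving the returned text as a file alongside the dataset keeps the documentation with the data it describes.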
Official University metadata record
We will create an official publicly discoverable metadata record of where your data is held, such as in an external repository. Just email email@example.com to let us know where you have deposited your data. We can also create metadata records for non-digital/physical data or data held at the University that cannot be deposited in a data repository.
Although funders do expect you to share your data, they also recognise that there are legitimate reasons why you may not be able to do so. These include:
|Reason for restricting access|Possible route to sharing|
|---|---|
|Commercial potential or interests and contractual terms|Sharing may still be possible under licence (e.g. CC-BY Non-Commercial) or subject to a Non-Disclosure Agreement|
|Data belongs to collaborators or a third party|Limited sharing may still be possible if subject to a Non-Disclosure Agreement|
|Personal or private information|Sharing may still be possible if explicit consent to do so is obtained; see the UK Data Service on consent and other ethical/legal issues|
|Sensitive information which could compromise unprotected intellectual property or, in the judgment of the security services, result in unacceptable risk to the citizens of the UK or its allies| |
If you think you may need to restrict access to your data, you must outline the reasons for doing so in your DMP. You should also include your reasons in any metadata (which will be publicly available), and the data access statement which accompanies any publications based on the data. See our advice on data access statements in the section Making your data more open.
Ensure the use of sensitive data is only restricted when truly necessary
Making data more open requires careful handling when there are legal, ethical and commercial considerations. Some things we consider include:
- Will the data being handled fall under the General Data Protection Regulation (GDPR)?
- Are any consent forms written to allow data sharing and reuse?
- Is a security standard required for storage?
Surrey offers several ways to share sensitive or commercial data that still provide transparency, allow reuse of data, and meet funder expectations.
Some options include:
- Applying a licence to restrict types of use
- Requiring a non-disclosure agreement
- Providing a de-identified or aggregated version of data or a subset of data
- Using a data repository with restricted access options
Unless funder requirements state otherwise, you need to ensure that your data are retained for a minimum of 10 years. Even if your data cannot be shared, they still need to be preserved in a secure environment.
Please note: USB sticks, personal laptops, project websites, and the hard drive of your computer do not count as ‘secure storage’ for long term preservation.
The job of disciplinary repositories is to store, maintain, and disseminate research data. They manage your data, facilitate discovery of it, and preserve it in a safe, secure environment.
We strongly recommend that you deposit your data in a recognised disciplinary repository. Your data are much more likely to be discovered, reused, and cited if they sit alongside other work in the same and similar fields.
Some funders and journals require that research data be deposited in a specific repository. For example:
- ESRC expects most grant holders to deposit their data with the UK Data Service
- NERC expects grant holders to deposit their data in one of the NERC data centres
- Nature Research journals expect authors to deposit their research data in one of the mandated repositories
Similarly, both Wellcome and the BBSRC provide lists of recommended data repositories for their grant holders.
Other funders, however, do not identify a particular repository, only requiring researchers to ensure their data is made available with the minimum of access restrictions. For example:
- The EPSRC requires grant holders to make research data available and to preserve it for a minimum of 10 years from the date of the last access request
- PLOS ONE requires authors to make data underpinning their publications available without restriction
- A number of journals have agreements with Dryad to enable their authors to store the data underpinning publications
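Because the EPSRC retention period above runs from the date of the last access request, each new request pushes the end of the period back. The sketch below illustrates that arithmetic in Python. The function names are hypothetical, and starting the clock at the deposit date when there have been no requests is an assumption made for this example, not something stated in the funder policy text quoted here.

```python
from datetime import date


def add_years(d, years):
    """Return the same calendar day `years` later; 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reached for 29 Feb when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def retention_end_date(deposit_date, access_requests=(), years=10):
    """Earliest date on which the minimum retention period has elapsed.

    Assumption: with no access requests, the period starts at deposit.
    """
    last_event = max([deposit_date, *access_requests])
    return add_years(last_event, years)


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(retention_end_date(date(2020, 1, 15)))
    print(retention_end_date(date(2020, 1, 15), [date(2024, 6, 1)]))
```

In practice this means a dataset that is still being requested has no fixed disposal date, so it is worth recording access requests with the dataset's metadata.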
Choosing the best repository for your data will depend largely on your field of research. For advice on where to look and what to look for, try the Digital Curation Centre’s Where to keep research data and OpenAire’s How to select a data repository.
It is also a good idea to speak to your colleagues: if a repository is well-known and well-used by researchers in your field, they will probably know about it.
If your colleagues cannot help, there are a number of curated lists:
- Scientific Data List of Recommended Data Repositories
- PLOS ONE List of Recommended Repositories
Once you’ve deposited your data into a repository please email the details to us at firstname.lastname@example.org so that we can create an official university record in our data catalogue.
For physical/non-digital data: the same funder and university requirements apply to physical/non-digital data. You will need to find appropriate, secure storage where the data can be accessed on request.
If you need help or advice with this please contact email@example.com.
- Check funder requirements for open data
- Check funder allowances for costing publications/open access, RDM, sharing, and preservation
- Create a DMP for your project
- Ensure consent forms allow sharing and preservation of data when appropriate
- Consider your IT requirements; contact your faculty IT team if needed
- Discuss roles/expectations about ownership/stewardship of data with collaborators
- Discuss roles/expectations about how data will be shared and preserved
- Create standardised workflows and protocols for collection and analysis
- Create a file structure and a naming convention for files
- Systematise documentation of findings and decisions (e.g. using metadata standards in your field)
- Ensure sensitive data is handled properly (on secure servers, encrypted, etc.)
- Store and regularly back up your data on university secure servers
- Update and revise your DMP to keep it current
- Include a data access statement on your publications (get a DOI for your supporting data)
- Prepare your data for sharing and preservation (anonymise, clean the data, etc.)
- Determine and address any IP/copyright issues around sharing your data
- Create open format versions of your data
- Create formal documentation and metadata (e.g. readme file, data dictionaries)
- Identify an external repository to house your data (N.B. make sure it will share AND preserve the data). No suitable external repository? Deposit it in the Surrey data repository
- Obtain a DOI for your data so it can be cited
- Choose a licence for the data and create any user agreements
If using an external repository, ask the Open Research team to create a record in the University’s data catalogue.