


Open data
Read through our guide on making research data open and accessible.
What is research data?
The University of Surrey considers research data to be any material/information collected, observed, measured, processed, or created for the purpose of analysis and on which research findings and outputs are based. This includes all data and documentation which is commonly accepted in the scholarly community as necessary for validation or replication of research findings. Research data may be in digital or non-digital formats. This could include:
- Audio, video, slides, images and photographs
- Text documents, databases and spreadsheets
- Code, scripts, algorithms, models, and software
- Interview schedules, transcripts, protocols and methodologies
- Specimens, samples and test responses
- Collections of digital or physical objects including implements, artefacts and tools
- Lab notebooks, field notes, and diaries
- Questionnaires, surveys, user guides, data dictionaries and codebooks
- Paintings, sculptures, costumes and art works
- Blogs, webinars, games, musical/theatrical/dance compositions and scores
Why share your data?
Sharing data that underpins research conclusions is at the heart of academic inquiry. Data sharing for verification and reuse can catch errors earlier, foster innovative uses of data, and push research forward faster and more transparently to the benefit of the field. Beyond academia, data can be used by many, including policy makers, entrepreneurs, and the public. There’s also evidence that sharing data leads to more citations, greater visibility of your work, and potential collaborations and opportunities. For more, check out these five selfish reasons to work reproducibly.
Of course, not all data is suitable to share openly, in which case it can be shared with a range of appropriate restrictions. Be sure you have consent or permission from your participants, collaborators, partners, or supervisor before sharing any data. Once you have identified which data is shareable, you should consider if it is worthy of sharing and then apply appropriate safeguards. If your data cannot be shared openly, but has long-term value, then it should be archived with Restricted Access.
Sharing your research data
The University of Surrey expects all data that underpins publications, substantiates research findings or is of long-term value to be shared by archiving/uploading it in a repository. Please refer to Points 4.3 and 4.6 of the University of Surrey Research Data Management Procedure (PDF) for further guidance.
Data sharing should strive to be “as open as possible, as closed as necessary.” Ask yourself: what data is necessary for verifying or reproducing your research findings, and which data could be reused? Creating data that is easily verifiable or reusable will require some planning and preparation. It’s best to plan for data sharing and build it into your research project before you start. (A data management plan is a good way to do this.)
Of course, you will want to make sure you have permission to share data from your project so be sure to first check any requirements specified by your funder/grant provider and to include data sharing in your consent forms. Check out the UK Data Service’s advice on consent forms for data sharing. If you have an industry partner or other collaborators, you should jointly agree on any data sharing before the project begins. Sharing may be constrained by ethical, commercial/IP and/or legal reasons.
In most cases, the best place to share data is through a data repository. These are online platforms designed to hold and disseminate research data. Some are discipline specific and others take all types of data. Repositories provide several advantages over trying to share data yourself. They can:
- Rank highly in search engine results
- Provide a persistent identifier e.g. Digital Object Identifier (DOI) for your data for use in publications and citations
- Track view/download counts
- Allow versioning
- Facilitate access requests
- Provide long-term storage of your data.
Option 1: Identify a suitable external repository
- Does your funder require or recommend a particular repository? Some funders have their own platforms or recommend certain repositories, like Wellcome Open Research, Gates Open Research, and ESRC’s UK Data Service
- Is there a repository typically used in your research discipline? Public platforms like Zenodo and Open Science Framework accept all types of data. Some publishers may recommend certain repositories.
Please note: When you share your data in an external repository, you will still need to register/create an official record in Surrey’s Open Research repository of where the data is held.
Option 2: Use Surrey’s Open Research repository
If an external repository is not recommended, use Surrey's Open Research repository, which accepts a wide variety of research outputs. Please use our Research data deposit guide (PDF).
Whether you are creating a university record indicating the external location of the data or uploading your datasets in the University repository, follow the steps below:
- Visit the Open Research repository
- On the top right corner, select Surrey Researchers sign in (use your university username and password)
- Once logged in, select the 'add content' button (top right corner) and choose ‘Output’
- Select “asset type”. By 'asset’ the system means the type of research output (for example, article, book, etc). Select ‘Dataset’
- If you are registering/creating a record of your dataset, go to ‘Add links to files' to indicate where the dataset files have been archived, i.e. URL location
- If you are uploading your dataset files directly, drop or select the files to upload
- Remember to add the DOI if your dataset record already has one, or reserve one in Surrey’s University repository if your data doesn’t have a DOI
- Please register/create a record of your data even in cases where the datasets cannot be shared openly and assign Restricted Access to the files
- If you have specific requirements for your data or would like more guidance, contact openresearch@surrey.ac.uk.
Data can be shared anytime! Some disciplines share data almost immediately. Others tend to do it alongside a publication. Some funders suggest specific timelines for sharing data usually tied to publications, project end dates, or norms within your discipline.
Your journal may stipulate a timeframe for data sharing as a condition for publication. Surrey’s own Research Data Management Procedure (PDF) requires sharing data that underpins publication within 12 months (or sooner if required by funders).
If you don’t have a funder or your funder doesn’t specify a timeline, then follow Surrey’s policy. Exceptions to funder expectations and Surrey’s policy should be outlined and justified in the project’s data management plan.
We recommend the following best practices when sharing your data to make it easier to find and reuse. Of course, your data should be well organised, labelled, and accompanied by sufficient documentation so that others can understand and reuse it. In addition:
- Create a README file for shared data
- Use an appropriate data repository or Surrey’s repository
- Get a DOI for your data (available from repositories as part of the deposition process)
- Include a Data Access Statement (DAS) citing your data's DOI (see below) in your publications
- Apply a licence to your data. Some funders recommend specific licences.
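As an illustration of the first item above, the sketch below assembles a plain-text README from a few descriptive fields. The function name `build_readme`, the field names, and the example dataset are all hypothetical. Repositories and disciplines may expect different or additional fields, so treat this as a starting point rather than a required format.

```python
def build_readme(title, creators, description, licence, files, doi=None):
    """Return plain-text README content describing a shared dataset.

    All field names here are illustrative assumptions, not a required schema.
    """
    lines = [
        f"Title: {title}",
        f"Creators: {', '.join(creators)}",
        # A DOI is usually issued by the repository during deposit.
        f"DOI: {doi if doi else 'to be assigned on deposit'}",
        f"Licence: {licence}",
        "",
        "Description:",
        description,
        "",
        "Files:",
    ]
    # One line per file so a reader can tell what each file contains.
    for name, summary in files.items():
        lines.append(f"- {name}: {summary}")
    return "\n".join(lines) + "\n"


# Hypothetical example dataset.
readme_text = build_readme(
    title="Example survey dataset",
    creators=["A. Researcher", "B. Researcher"],
    description="Anonymised responses to an example survey.",
    licence="CC BY 4.0",
    files={
        "survey_responses.csv": "one row per respondent",
        "codebook.pdf": "definitions of every column in the CSV file",
    },
)
print(readme_text)
```

The resulting text can be saved as README.txt and uploaded alongside the data files.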
Check your data is “FAIR” i.e. Findable, Accessible, Interoperable, and Reusable. The FAIR principles outline best practices for how to share data. The CARE Principles for Indigenous Data Governance provide a complementary set of people-focused best practices.
‘Open data’ encompasses a wide range of sharing practices allowing researchers the flexibility to balance transparency and appropriate protections for their data. Data repositories have a range of access controls that can be applied to sensitive data. Some data repositories can even handle very sensitive and personal data, like the UK Data Service, which accepts clinical trial data. Funders also recognise that data may need to be restricted for commercial reasons.
Depending on the sensitivities of your data your open data practices might include:
- Sharing some files as openly available and others as restricted data
- Transforming the data to make it more shareable, e.g. de-identification or aggregation
- Restricting access and setting terms of access, e.g. only by registered researchers or after signing a confidentiality agreement
- Creating synthetic data with the same characteristics as your data
- Sharing for verification purposes only, subject to signing a non-disclosure agreement
- Only creating an openly available metadata record outlining the study, summarising what data is held and why it is not accessible.
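To make the de-identification and aggregation options above more concrete, here is a minimal sketch using only the Python standard library. The column names (`name`, `email`, `age`, `response`) and the 10-year age bands are assumptions chosen for illustration. Removing direct identifiers does not by itself guarantee anonymity, since combinations of remaining fields can still identify people, so seek guidance before relying on an approach like this.

```python
from collections import Counter

# Hypothetical columns that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def deidentify(record):
    """Drop direct identifiers and replace exact age with a 10-year band."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    age = cleaned.pop("age")
    lower = (age // 10) * 10
    cleaned["age_band"] = f"{lower}-{lower + 9}"
    return cleaned


# Made-up records for illustration only.
records = [
    {"name": "A. Example", "email": "a@example.org", "age": 34, "response": "yes"},
    {"name": "B. Example", "email": "b@example.org", "age": 37, "response": "no"},
    {"name": "C. Example", "email": "c@example.org", "age": 52, "response": "yes"},
]

# De-identified rows that could be considered for sharing.
shareable = [deidentify(r) for r in records]

# Aggregation: share only counts per age band instead of individual rows.
counts_by_band = Counter(r["age_band"] for r in shareable)

print(shareable)
print(dict(counts_by_band))
```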
The Research Integrity and Governance Office and Data Protection team can provide guidance if you are unsure.
If your data has commercial potential, please ensure that you have read and followed the University’s Intellectual Property Code, and contact the Technology Transfer Office (techtransferteam@surrey.ac.uk).
While not all data may be suitable for sharing immediately, any data with long-term value should be archived and ideally preserved. Most repositories are able to archive/store uploaded data files for a set period of time (e.g. at least 10 years in Surrey’s Open Research repository) but a repository with a long-term preservation capability is most desirable. Unfortunately, that is very costly as all data files are then routinely downloaded and checked to ensure the data is still present, there has been no deterioration or corruption in the data and any hardware/software needed for access has not since become obsolete. Preservation involves taking all necessary steps to guarantee the data remains accessible and usable indefinitely.
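The routine check for corruption described above is often done by comparing checksums. The sketch below records a SHA-256 checksum for every file in a folder and later reports any file whose content has changed or gone missing. The function names and the demo file `results.csv` are hypothetical, and this is an illustration of the general idea rather than a description of how any particular repository performs its checks.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        str(p.relative_to(folder)): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demo in a temporary folder with a made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "results.csv"
    data_file.write_text("id,score\n1,0.5\n", encoding="utf-8")
    manifest = build_manifest(tmp)

    unchanged = changed_files(tmp, manifest)   # nothing has been modified yet
    data_file.write_text("id,score\n1,0.9\n", encoding="utf-8")
    altered = changed_files(tmp, manifest)     # the edit is now detected

print(unchanged, altered)
```

Keeping a manifest like this alongside archived files gives you a simple way to confirm, at any later date, that the files are still intact.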
To increase the likelihood of data survival and reusability, you should:
- Consider the cost of data storage and only archive data that underpins your research findings or is of long-term value and potentially reusable
- Organise your files/folders and generate accompanying documentation that fully explains your data
- Make sure your data files are in a widely used, stable, non-proprietary format
- Check your grant details as your data may be subject to statutory or funder requirements for preservation.
Please note: USB sticks, external storage, personal laptops, project websites, and local hard drives are not suitable for archiving or long-term preservation.
Physical data with long-term value should also be preserved. If you can’t make a digital surrogate of the physical data, then you can create a metadata record in Surrey’s repository indicating what physical objects are held, where they are stored and how they can be accessed (including the custodian’s contact details).
The Digital Curation Centre has a useful guide for preservation, Five steps to decide what data to keep, and Jisc’s Research Data Management Toolkit includes a section on preservation. The Software Sustainability Institute offers guidance on software preservation.
Data access statements (DAS)
A data access statement (also referred to as a 'data availability' statement) is a short statement added to a research paper to inform the reader:
- Whether there is research data associated with the paper
- Whether that research data is available, and if so, where and under what terms it can be accessed
- Whether that research data is restricted, and if so, the reasons why.
The University's Research Data Management Procedure expects you to include a data access statement in your publications. This is in line with requirements set by some research funders, including UKRI, whose OA policy (Appendix 1) requires "in-scope research articles to include a Data Access Statement, even where there are no data associated with the article or the data are inaccessible".
Many journals support the inclusion of data access statements and provide relevant guidance. See examples from Springer Nature, Taylor & Francis, and PLOS.
If a journal does not provide its own guidance, you can use the examples provided below.
Data access statements should include:
1. Terms of access (if any)
2. Persistent identifier (e.g. DOI) linking to the data in a repository; or where the data can be found (e.g. a third party); or Reference Number and location/contact details where the physical data/item can be found
3. If the data is restricted, a statement justifying why
4. If there is no data, a statement saying that
5. If all the data required to verify the findings appears within the publication, a statement saying that.
If the data is openly available:
- The data underlying this article are available in [repository name, e.g. the xxxx Repository], at https://dx.doi.org/[doi] [or give the URL], under [state the access conditions and licence, e.g. Open Access under CC BY or CC BY-NC-ND, etc.]
- The data underlying this article were derived from sources in the public domain: [list sources, including the URL/DOI]
- This publication is supported by multiple datasets that are openly available at locations referenced in this paper.
If the data is already included in the paper:
- The data underlying this article are available in the article / in the online supplementary material.
If the data is restricted or embargoed:
- The data underlying this article are subject to an embargo of [period of embargo of X months from the publication date of the article] to allow for commercialisation of the results. Once the embargo expires the data will be available [give details of availability, e.g. in a repository plus embargoed link; upon reasonable request, to the corresponding author, etc.]
- The data underlying this article cannot be shared publicly due to [briefly describe why the data cannot be shared, e.g. for the privacy of individuals that participated in the study, absence of consent or other ethical/commercial/legal reasons]
- The data underlying this article were provided by [third party] under licence / by permission. Data will be shared upon request to the corresponding author with permission of [third party].
If there is no data:
- No data were created, collected, measured or analysed in this study.
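If you prepare statements for several papers, the example wording above can be filled in programmatically. The sketch below is one hypothetical way to do that: the function name, its parameters, the repository name and the DOI `10.0000/example` are all placeholders, and the phrase "under a ... licence" is an assumed way of stating the licence. Always check the final wording against your journal's and funder's guidance.

```python
def data_access_statement(repository=None, doi=None, licence=None,
                          restriction_reason=None):
    """Return a data access statement based on the example wording above."""
    if restriction_reason:
        # Restricted data: state briefly why it cannot be shared.
        return (
            "The data underlying this article cannot be shared publicly "
            f"due to {restriction_reason}."
        )
    if repository and doi:
        # Open data: name the repository and give the persistent identifier.
        statement = (
            f"The data underlying this article are available in {repository}, "
            f"at https://dx.doi.org/{doi}"
        )
        if licence:
            statement += f", under a {licence} licence"
        return statement + "."
    # No repository, DOI or restriction given: treat as a study with no data.
    return "No data were created, collected, measured or analysed in this study."


# Placeholder values for illustration only.
open_statement = data_access_statement(
    repository="the Example Repository", doi="10.0000/example", licence="CC BY"
)
restricted_statement = data_access_statement(
    restriction_reason="the privacy of individuals that participated in the study"
)
no_data_statement = data_access_statement()

print(open_statement)
print(restricted_statement)
print(no_data_statement)
```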
Resources
- Data Sharing - a UKRN animated primer
- Top Ten Tips for Doing Open Science
- Qualitative Data Archive’s Sharing Qualitative Data module
- Opening up and Sharing Data from Qualitative Research: A Primer
- Making data meaningful: guidelines for good quality open data
- Data sharing practices and data availability upon request differ across scientific disciplines
- The Qualitative Transparency Deliberations: Insights and Implications
- Qualitative Data Sharing: Participant Understanding, Motivation, and Consent.