

Managing and storing data
Discover how to describe, document and store your research data.
What is research data management?
Research data management (RDM) refers to how you will generate, collect, organise, document, format, store, publish and archive your research data throughout your research project, in ways that support its discoverability, potential sharing and re-use, and preservation. It also covers how you will ensure that good-quality data is generated, recorded and securely held. These practices are informed by legal, statutory, ethical and funder requirements. This management is applied from the planning/bid stages of a project through to data collection and analysis, day-to-day management and finally long-term preservation and sharing, even beyond the project end date. Consider who is going to be responsible for these activities and plan a schedule.
What we consider to be research data
The University of Surrey considers research data to be any material collected, observed, measured, processed, or created for the purpose of analysis upon which research findings and outputs are based. This includes data and documentation which are commonly accepted in the scholarly community as necessary for validation or replication of research findings. Research data may be in digital or non-digital formats (a comprehensive list of different data types is given in Sharing and preserving your data).
The University of Surrey expects all data that underpins publications, substantiates research findings or is of long-term value to be shared by archiving or uploading it in a repository. Please refer to Points 4.3 and 4.6 of the Research Data Management Procedure (University of Surrey Research Data Policy (PDF)) for further guidance.
Why manage your research data?
At its heart, good research is a direct outcome of good research data management. Having a strategy or plan for how you are going to manage your data and documentation during your project will make every stage of research easier, more transparent, more reproducible and more secure, especially when it comes to sharing, publishing and archiving your data for verification and potential reuse.
Documentation
Documentation is the foundation of good research and should be started early. It makes your research understandable, reproducible, verifiable, and reusable – first for you and then for others. Imagining what future users would need to understand your data can help you explain and assemble the best documentation for your project.
Documentation can be embedded within research files, for example in code, scripts, headers, summaries, label descriptions or built-in program documentation. One of the best ways to ensure the quality of your data is to automate your data creation or analysis as much as possible; the automation itself then becomes indispensable documentation. Take a look at an example from biology.
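As a minimal sketch of documentation embedded in an automated processing step, the Python script below carries its own description in a header docstring and returns a small log of what it did. The data, the function name `clean_readings` and the cleaning rule are hypothetical, chosen only for illustration.

```python
"""Clean raw temperature readings (hypothetical example).

Input : rows of (sample_id, temp_f), with temp_f in degrees Fahrenheit
Output: rows of (sample_id, temp_c), in degrees Celsius, rounded to 1 d.p.
Rule  : rows with a missing temperature are dropped and counted
"""


def clean_readings(rows):
    """Return (cleaned_rows, log) so every processing decision is recorded."""
    cleaned = []
    dropped = 0
    for sample_id, temp_f in rows:
        if temp_f is None:
            # Missing reading: drop the row, but keep a count for the log.
            dropped += 1
            continue
        temp_c = round((temp_f - 32) * 5 / 9, 1)
        cleaned.append((sample_id, temp_c))
    log = {"rows_in": len(rows), "rows_out": len(cleaned), "rows_dropped": dropped}
    return cleaned, log


# Hypothetical raw data: one reading is missing.
raw = [("S1", 98.6), ("S2", None), ("S3", 32.0)]
cleaned, log = clean_readings(raw)
print(cleaned)
print(log)
```

Because the rule for missing values lives in the script and the log is produced on every run, a later reader can see both what was intended and what actually happened to the data.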
Documentation exists at several levels, for example:
- Project or study level: project description/abstract, research questions, methods, reports, protocols, lab books, consent forms, questionnaires, instrument instructions, and the background/context of data collection and analysis
- File or data level: what each file contains, how files relate to each other, the components, structure and logic of data files
- Variable level: user guides, code books or data dictionaries with definitions of variables, ranges, units and abbreviations
- Metadata level: structured descriptions of a study or dataset consisting of defined elements to facilitate discovery and reuse, usually created as part of a data repository deposit; sometimes discipline specific (all of Surrey’s shared and preserved data must have a metadata record in our repository).
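Variable-level documentation of the kind listed above can be kept in a simple, machine-readable data dictionary. The following Python sketch is one possible approach, not a prescribed format: the variables, ranges and the helper names `dictionary_to_csv` and `undocumented_columns` are all hypothetical.

```python
import csv
import io

# Hypothetical data dictionary: one entry per variable in the dataset.
DATA_DICTIONARY = [
    {"variable": "participant_id", "definition": "Anonymised participant identifier",
     "unit": "", "allowed_range": "P001-P999"},
    {"variable": "age", "definition": "Age at time of interview",
     "unit": "years", "allowed_range": "18-99"},
    {"variable": "sbp", "definition": "Systolic blood pressure (abbreviation: SBP)",
     "unit": "mmHg", "allowed_range": "70-250"},
]

FIELDS = ["variable", "definition", "unit", "allowed_range"]


def dictionary_to_csv(entries):
    """Serialise the data dictionary as CSV text that can sit alongside the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_columns(data_columns, entries):
    """List dataset columns that have no entry in the data dictionary."""
    documented = {entry["variable"] for entry in entries}
    return [column for column in data_columns if column not in documented]


print(dictionary_to_csv(DATA_DICTIONARY))
# Check a hypothetical dataset header against the dictionary.
print(undocumented_columns(["participant_id", "age", "sbp", "bmi"], DATA_DICTIONARY))
```

Running a check like `undocumented_columns` whenever the dataset changes is one way to keep the documentation in step with the data.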
Resources
The UK Data Service provides extensive advice on how to document your data, including data level documentation and study level documentation. More documentation to consider:
- Creating a README file for your project
- Using electronic lab notebooks (Cambridge’s guide and comparison table)
- Registered Reports
- Publishing your protocols.
Training on research data management is available through the Doctoral College or the Library Research Hub Workshop Descriptions.
Storage and collaboration
The biggest risks to research data are accidental loss/corruption and unauthorised access. We can mitigate those risks by adopting a few simple practices for storing our data.
Use University storage
During active research the best place to house your data is on University storage, which includes the University’s SharePoint or OneDrive. There it will be regularly backed up, subject to greater access controls behind sophisticated firewalls, and protected by up-to-date anti-virus and anti-malware software.
All researchers are reminded to update their passwords regularly, lock their screens when they walk away from them and ensure their screens cannot be overlooked, for example through external windows.
Resources
- UK Data Service’s guide to managing and sharing data
- Qualitative Data Archive’s Managing Qualitative Data module
- JISC research data management toolkit
- Messy data? Try OpenRefine
- Data and software carpentries curricula
- Software Sustainability Institute's top tips
- PLOS best practices in research reporting.