8. Data preservation
Learning objectives
When you have completed this lesson, you will be able to:
- Explain the advantages of preserving data after project end.
- Describe what should be considered when determining what data will be preserved and how to preserve it.
- Draft your own data preservation plan.
____________________________________________________________
Why preserve data after your project?
When you reach the end of your project, you will have finalized the collection, processing and analysis of your data. You will likely capture your main conclusions in a document, such as a Bachelor's or Master's thesis. However, research data management does not end with the submission of your thesis. There is one last step to take: the preservation (also called ‘archiving’) of your digital data, physical material and relevant documentation. Data preservation is defined as the actions you take to ensure that your research data are kept in such a way that they remain available and usable, possibly for several years after your project. Data preservation is important because it allows your future self, other project members and your supervisor to revisit the material and understand how you conducted the project. This could provide insights useful for similar studies, and possibly allow the research data from your project to be reused in new projects.
____________________________________________________________
What to preserve
You do not necessarily have to preserve everything you produced in your project. You will have to decide what to keep and what to discard. For example, you could consider keeping:
- The data sets on which you base the conclusions described in your thesis. Perhaps you need to consult them again in the future when questions arise about your thesis?
- Any data that may end up being the basis of a scientific publication.
- Data that you assess to be valuable to others. Perhaps they can be used in another project?
- Data that were based on rare samples or specimens. These data are not easily recreated.
There may also be some data that you cannot preserve, for example:
- Personal data, if the participants have not consented to you keeping their data after project end.
- Confidential data, such as business information, if the contract with the company you worked with requires data destruction or a return of the data to the company.
- Very large data files or physical material, if there is insufficient storage space or funding to keep them.
- Any physical material that will deteriorate in quality over time so that it cannot be used for new projects.
Always discuss your plan for data preservation with your supervisor. What data do they want to be kept after the project? Are there any requirements or rules for data preservation in your research group, at your institute, or the place where you conducted your internship? Is it costly to preserve the digital data and material, and if so, are there sufficient resources available to cover these costs? Don’t limit your discussion to digital data only; also think about what to do with code and codebooks, (lab)books, specimens, samples, artifacts and any documentation in your project.
____________________________________________________________
Making a data preservation plan
It is a really good idea to make a plan for the preservation of the data, materials and documentation that you decide to keep after your project. You could describe your plans in your project’s data management plan. Consider the following questions:
Where should I preserve my data and material?
Where you should preserve your data after project end depends on a number of decisions:
- Who should have access to the data after the project?
Is it only you, or should your supervisor or other project members also have access? If it is the latter, you will have to store the data in a location that remains accessible to others (and perhaps yourself) when your enrolment at the university ends. This will require moving your data, for example from your personal university network (T) drive to the research group’s common network drive. Please note, though, that you will not have access to this group drive yourself after the end of your enrolment.
- How long should your data be preserved?
The longer you want to keep your data, the more important it is to check whether that length of storage can be guaranteed. Your personal T-drive will close down three months after your enrolment ends, so it is not a good place to preserve data after your project. DeiC Storage (see lesson 6) guarantees 10 years of storage, and that may be sufficient for your data. If you end up publishing your data in a data repository (see lesson 7), this could also be a place to preserve data, but you will need to check the policy of the data repository to see how long it guarantees to store the data. If you work with normal data, that is, not personal or confidential data (see lesson 6), you could also consider keeping a copy of the data on a portable hard drive that you keep at home.
Data repository vs. data archive
In Lesson 7 you were introduced to the concept ‘data repository’.
This is reflected in how the databases are set up and what functions they offer. Some data repositories also guarantee a minimum storage period (e.g. 10 years), whereas some data archives also offer sharing functionalities. In other words, you may also be able to use a data repository to preserve your data. In the end, your choice of data repository/archive depends on your needs. Always check the policies of the data repositories/archives to see what services they provide.
What data format should my data be preserved in?
One thing to consider is the format in which to save your digital data. Some file formats require specific software to be opened. Is it reasonable to expect that this software will still be accessible 5 years from now? If the answer is ‘no’, it is best to preserve the data in a different format that can be opened and viewed on any operating system using any kind of software (‘open format’). Saving your data in open, unencrypted and uncompressed formats will make your data more usable in the future. If you can’t save your data in an open format, then make sure to include the name of the software needed to open the file in your project’s documentation (for example in a ReadMe file saved with your data, see Lesson 5).
Here are some examples of common open formats:
- Text
- Tables, spreadsheets
- Images
- Audio
- Video
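As a rough illustration of the format check described above, the following Python sketch lists the files in a project folder whose extensions are not in a set of open formats. The extension set and the function name `files_needing_attention` are assumptions made for this example, not the lesson's own list; adjust the set to the formats that are common in your field.

```python
from pathlib import Path

# Assumption: an illustrative set of open formats, not an official list.
OPEN_EXTENSIONS = {".txt", ".csv", ".json", ".xml", ".png", ".flac"}


def files_needing_attention(folder, open_extensions=OPEN_EXTENSIONS):
    """Return relative paths of files whose extension is not in open_extensions.

    Files without any extension are also returned, because their
    format cannot be judged from the name alone.
    """
    root = Path(folder)
    flagged = []
    for path in sorted(root.rglob("*")):
        # Compare extensions case-insensitively, so ".PNG" counts as ".png".
        if path.is_file() and path.suffix.lower() not in open_extensions:
            flagged.append(path.relative_to(root).as_posix())
    return flagged
```

Each file that this returns is a candidate for conversion to an open format or, if that is not possible, for a note in your documentation naming the software needed to open it.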
What documentation should accompany my preserved data and material?
Preservation of data and material only makes sense if they can be found and understood in the future, by yourself and others. Therefore, make sure that documentation about content of data files and physical material is easily findable and accessible. Ensure that:
- You properly label any physical objects with a date, name and keywords describing the project, and with information about any digital data associated with the material.
- You use informative filenames for the files you store.
- You preserve information and metadata describing how you collected and processed the data and material along with the data, such as a project plan or protocol, your data management plan, or ReadMe file.
- You tell your supervisor where the documentation can be found.
Read lesson 5 for more tips on data documentation.
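If you want a starting point for such documentation, the sketch below builds the text of a simple ReadMe file that lists every file in a folder together with the software needed to open it. The function name `build_readme`, the field names and the layout are hypothetical choices for this example and do not reproduce the UCPH ReadMe template; treat it as a draft that you extend by hand.

```python
from pathlib import Path


def build_readme(folder, project_title, contact, software_notes):
    """Return ReadMe text with project details and a file inventory.

    software_notes maps a file type (e.g. ".sav") to the software
    needed to open it. All field names here are hypothetical.
    """
    root = Path(folder)
    lines = [
        f"Project: {project_title}",
        f"Contact: {contact}",
        "",
        "Software needed to open files:",
    ]
    lines += [f"- {file_type}: {software}"
              for file_type, software in software_notes.items()]
    lines += ["", "Files:"]
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            lines.append(f"- {path.relative_to(root).as_posix()} ({size} bytes)")
    return "\n".join(lines) + "\n"
```

You could save the result next to your data with, for example, `Path(folder, "README.txt").write_text(build_readme(...))`, and then add a description of how the data were collected and processed.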
Example demonstrating the importance of proper documentation for data preservation
The skeleton of an enormous dinosaur could go unnoticed in in-house storage for decades, because the dinosaur bones hadn’t been catalogued, there were no real records of them, and some bones had been used in other displays with no indication that they were part of a whole.
Image: A part of the long-lost Barosaurus skeleton found in the archives of the Royal Ontario Museum.
____________________________________________________________
Data preservation in practice
Morten Arendt Rasmussen, supervisor at the Faculty of Science, explains which types of data he believes should be preserved after project end.
Nicole Schmitt, supervisor at the Faculty of Health and Medical Sciences, addresses the long-term preservation of data from student projects.
____________________________________________________________
Finish your DMP
Complete your data management plan (DMP) by filling in the last questions under section 8. Preservation:
8.a Describe what data/material/project documentation should be kept once your project is over.
8.b Describe where the data/material/project documentation will be stored after project end, and how a copy of the data will be made available to your supervisor(s).
If you haven't begun filling out your DMP yet, you can find the DMP template here: UCPH DMP Template for Students
Remember to discuss the data management plan with your supervisor at the start of your project. Keep the DMP stored along with your data.
____________________________________________________________
Practical tips for preserving data
- Always start by checking whether there are any rules for the preservation or destruction of your data. Some hints for where to find these rules are in the following tips.
- Check UCPH’s general rules for data preservation in UCPH’s Policy for Research Data Management. For example, according to the policy, a copy of all digital datasets underlying research publications must be kept at UCPH for a minimum of 5 years after the end of the project, or the date of publication, whichever comes last. Bachelor and Master theses are not strictly considered research publications. However, it is recommended that you, by default, preserve a copy of the data underlying the results presented in Bachelor and Master theses for a minimum of 5 years after the end of your project, unless otherwise determined by your supervisor.
- If you work with personal data, check the guidelines for preserving personal data on the study information pages of your study programme under Planning your studies > Rules and dispensations > How to collect and process personal data.
- If a contract has been set up in your project, for example because you work with a company, check the terms and conditions for data preservation in the contract.
- Determine what IT infrastructure you should use to preserve your digital data. Here are some suggestions for data classified as normal data (see lesson 6) where a maximum of 10 years of storage after the project is sufficient:
| | UCPH Group Drive | DeiC Storage | A data repository (see lesson 7) |
| You will need access to data after enrolment ends | Not suitable | Suitable | Suitable* |
| Other UCPH students/employees need access | Suitable | Suitable | Suitable* |
| Externals can be given access | Not suitable | Suitable | Suitable* |
* You should still check the conditions of the repository you pick, to ensure, for example, that it guarantees at least 5 years of storage.
Please note that other solutions are necessary for personal data and confidential data. Other solutions may also be necessary for data classified as normal data, for example when dealing with very large data volumes, specific file formats or other specific requirements. Lastly, research groups may have their own databases in which data are preserved. Always ask your supervisor first and consult KU-IT if you need help.
- Preserve your digital data along with a ReadMe file that explains the data. Here is a template that you could use: ReadMe template
- Look up terms related to research data management in the RDM Glossary.
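The 5-year minimum mentioned in the tips above can be worked out with a few lines of Python. This is a sketch of the arithmetic only, assuming the period is counted from the project end date or the publication date, whichever comes last; the function name `earliest_deletion_date` is hypothetical, and your supervisor or a contract may set a different period.

```python
from datetime import date


def earliest_deletion_date(project_end, publication_date=None, years=5):
    """Return the first date on which the minimum retention period has passed.

    The period starts at the later of project_end and publication_date.
    The default of 5 years follows the policy example in the tips above.
    """
    dates = [d for d in (project_end, publication_date) if d is not None]
    start = max(dates)
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year: use 28 February.
        return start.replace(year=start.year + years, day=28)
```

For example, a project that ends on 30 June 2024 and leads to a publication on 15 January 2025 gives `earliest_deletion_date(date(2024, 6, 30), date(2025, 1, 15))`, which returns 15 January 2030.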
____________________________________________________________
Published in 2024