RDM Glossary
Backup
A copy of data that is stored separately from the original data source, typically created to protect against data loss or file corruption. Backups are essential for ensuring data availability and integrity in case of hardware failures, accidental deletions, malware attacks, or other unforeseen events.
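As an illustration of the definition above, a backup can be sketched as a copy to a separate location followed by a check that the copy is identical to the original. This is only a minimal sketch: the function names `sha256_of` and `back_up` are hypothetical, and a real backup would normally go to a different device or service than the one holding the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` and confirm the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    # Matching checksums indicate the copy was not corrupted on the way.
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target
```

Comparing checksums again at a later date is one way to detect file corruption in either the original or the backup.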
Biobank
A structured collection of human biological material accessible according to certain criteria, and where the information bound in the biological material can be attributed to individuals. The term ‘biobank’ does not refer to the physical place where the samples are stored, but to the samples as such. According to Danish legal practice, biobanks are regarded as manual registers and are as such included in the data protection regulation of Denmark and the EU.
Collaboration Agreement
A formal document that outlines the terms, conditions, and expectations in a collaboration between multiple project parties, such as between universities, or between universities and companies. It typically addresses issues related to project goals, intellectual property rights, data sharing, funding allocation, publication rights, and dispute resolution mechanisms to ensure clarity, transparency, and fairness in the collaboration.
Confidential Data
Data other than personal data to which only a limited number of people should have access, and where accidental or deliberate exposure of the data can have considerable consequences. Examples are company data, data that have commercial potential, classified government data, and sensitive biological data.
Copyright
A legal right that grants the creator the exclusive rights over their work, including making copies, publishing, distributing, reproducing, modifying, adapting, transforming, publicly displaying and performing the work. To obtain copyright protection, the work must be original and in a fixed form. Examples of research outputs that can be protected by copyright:
- Writings and texts such as articles, monographs, contributions to books and anthologies
- Images and visuals such as figures, graphs, diagrams, drawings, photographs, maps, PowerPoint presentations, software, movies
- Audio and sound such as music, recordings of interviews, sound recordings
Data Archive
Storage facility for the long-term storage of data, dedicated to preserving, managing, and providing access to research data for future use.
Data Classification
Data classification is the process of categorizing the data and materials in your project based on their sensitivity or confidentiality, and on their importance. The purpose of data classification is to determine the appropriate level of protection and the procedures you need to have in place to prevent breaches of confidentiality and data loss.
Data Documentation
The creation and maintenance of detailed records describing the characteristics and context of research data, including metadata. Good documentation will help others, including your future self, understand the data, and will make it more likely that data are reused.
Data Format
File type of the data. This may include spreadsheets, video, images, or proprietary data types of equipment.
Data Licence
A legal instrument that communicates the terms and conditions for the reuse of data by others. Examples are Creative Commons licences and Open Source Software licences.
Data Lifecycle
The data lifecycle refers to the various stages that research data go through from their initial creation or collection to their eventual preservation or disposal. This lifecycle typically encompasses several key phases, each with specific activities and considerations.
Data Management Plan (DMP)
A plan that is typically drafted at project start and that describes the actions to be taken in order to collect, process, store, secure, share, preserve, and possibly reuse research data in a research project. DMPs are a good tool to align expectations between project members. Students can draft their own plan or use existing templates, such as those provided by UCPH.
Danish National Archives
A Danish national institution that aims to collect original physical documents and digital files of historical value and preserve them indefinitely, so that continued availability of this information is guaranteed.
Data Provider
A data provider is an individual or organization that supplies research data to users for various purposes, such as research, analysis, decision-making, or information dissemination.
Data Repository
A storage facility where researchers and students can deposit (upload) digital data sets and other digital research objects, as well as metadata associated with their project, for the purpose of sharing data with others.
Data Set
A structured collection of research data.
Data Size
The size of data generated, as individual units or in aggregate. Data may be analog, such as lab notebooks or specimens, or electronic, with size described in MB, GB and the like.
Derived Data
Data created by combining and processing existing data, for example through text mining of literature or data mining of datasets.
Electronic Research Data Archive (ERDA)
Facility at UCPH for storing, sharing, analysing and archiving research data that are not sensitive. ERDA provides safe central storage space for your own and shared files and interactive analysis tools, in addition to archiving for safe-keeping and publishing.
Encryption
Encryption is the process of converting data into an unreadable code using encryption software. Encrypted data can only be opened by persons who have the relevant decryption key or password.
Ethical Approval
Approvals granted by ethics committees or boards who assess a project’s proposed measures to safeguard the rights, well-being and dignity of human participants and the welfare of animals involved in the project.
Experimental Data
Data collected under controlled conditions, often by manipulating a variable in a study and measuring the outcome. Examples are plant growth data under various light treatments or a parameter in a search query.
File Naming Convention
A systematic and standardized approach to naming files consistently.
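One possible convention, shown here only as a sketch, combines a date, a project name, a short description and a version number. The pattern `YYYYMMDD_project_description_vNN.ext` and the function name `build_file_name` are assumptions made for illustration, not a convention prescribed by this course.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, version: int,
                    extension: str, day: date) -> str:
    """Build a name such as 20240815_thesis_interview-notes_v02.txt."""

    def clean(text: str) -> str:
        # Lowercase the text and replace every run of characters other
        # than a-z and 0-9 (spaces, punctuation) with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # A leading year-month-day date makes files sort chronologically.
    return (f"{day:%Y%m%d}_{clean(project)}_{clean(description)}"
            f"_v{version:02d}.{extension.lstrip('.').lower()}")


print(build_file_name("Thesis", "Interview notes", 2, ".TXT",
                      date(2024, 8, 15)))
# prints 20240815_thesis_interview-notes_v02.txt
```

Generating names with a small helper like this, rather than typing them by hand, is one way to keep them consistent across a project.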
General Data Protection Regulation (GDPR)
Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC. The GDPR sets rules for whether, when and how personal data can be collected, processed, preserved, and shared with others.
High Performance Computing (HPC)
Calculations performed using many computers, namely compute servers (sometimes called nodes), linked together in parallel within a high-speed network. HPC systems enable handling large scale sets of data, and solving problems that require large-scale computation. HPC is used in many research fields, ranging from life sciences, physical sciences and mathematics, to medicine, linguistics and social sciences.
Information Security
A collective term for the actions taken to protect information (such as the data or material you work with in your Bachelor's or Master's project) from unauthorised access, (mis)use, disclosure, modification, destruction, or loss.
Informed Consent
The voluntary agreement of individuals to participate in a research study after being provided with comprehensive information about the study's purpose, procedures, risks, benefits, and their rights as participants. Informed consent ensures that participants understand the nature of the research, make an autonomous decision to participate, and provide their consent willingly, based on sufficient information provided by the researcher or student.
Intellectual Property Rights
Legal rights existing or granted with the intention of safeguarding creations of the intellect. Among other things, this includes copyright, patent rights, design rights and trademark rights.
Interview Data
Data collected by asking questions to (groups of) individuals to gather quantitative or qualitative information, for example when studying cultural identity or when measuring user satisfaction with a particular service.
Legal Approval
Approvals granted by regulatory bodies, government agencies, or institutions who assess whether the project’s proposed actions adhere to relevant laws and regulations.
Metadata
Information describing the attributes of an item or data set, such as sample name, units of measure, dates, and contact information, which enables identification, retrieval and management of that item or data set in the future. Metadata can take many forms, from free text to structured machine-readable content. Some disciplines or data repositories may have specific requirements for the format and content of metadata, possibly using a formal standard.
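To illustrate what structured machine-readable metadata can look like, the sketch below stores the attributes named above as a JSON record and checks that none are missing. The field names and all values are invented for illustration and do not follow any formal metadata standard.

```python
import json

# Hypothetical set of fields a project might decide to require.
REQUIRED_FIELDS = ("sample_name", "units_of_measure", "date_collected", "contact")

# Invented example record.
metadata = {
    "sample_name": "leaf_sample_01",
    "units_of_measure": "cm",
    "date_collected": "2024-08-15",
    "contact": "student@example.org",
}


def missing_fields(record: dict) -> list:
    """List the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def to_json(record: dict) -> str:
    """Serialise the record as indented JSON text."""
    return json.dumps(record, indent=2, sort_keys=True)


print(missing_fields(metadata))  # an empty list: nothing is missing
print(to_json(metadata))
```

The JSON text can be saved next to the data set so that both people and software can read it.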
Metadata Standard
A set of established guidelines, specifications, or conventions for describing and organizing metadata, to standardize how data are reported. This can help ensure consistency, interoperability, and usability across different research datasets and disciplines.
Observational Data
Data collected through the observation of an activity, for example sensor readings or observations of animal or human behaviour.
Open Access
Free, unrestricted online access to research outputs such as journal articles, books and data sets. In this course, Open Access refers to data sets only (Open Data).
Open Data
Data sets that can be freely used, re-used and redistributed by anyone. Open Data are typically deposited in online data repositories where they can be accessed without restrictions on reuse, possibly subject only to requirements to attribute (cite/provide credit to the data set creators) or share alike.
Open Formats
Digital file formats that are publicly documented, non-proprietary, and free from restrictions or limitations imposed by proprietary software vendors. These formats facilitate data exchange, interoperability, and long-term accessibility of research data by ensuring compatibility with a wide range of software applications and platforms.
Persistent Identifier (PID)
A long-lasting reference to a document, file, web page, or other object. In this course, PID refers to an unbreakable and actionable link associated with a digital object on the internet. Examples of persistent identifiers are Digital Object Identifiers (DOIs) typically used for journal articles and data sets, and Open Researcher and Contributor IDs (ORCIDs) to identify authors of scholarly work.
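The "actionable" part of the definition above can be sketched in a few lines: a DOI or ORCID iD becomes a clickable link when it is appended to the resolver address `https://doi.org/` or `https://orcid.org/`. The function names and the example DOI below are placeholders chosen for illustration.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI such as '10.1234/example-dataset' into a resolvable link."""
    doi = doi.strip()
    # Some sources write the identifier with a 'doi:' prefix; drop it.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    return f"https://doi.org/{doi}"


def orcid_to_url(orcid: str) -> str:
    """Turn an ORCID iD (four groups of four characters) into a profile link."""
    return f"https://orcid.org/{orcid.strip()}"


# Placeholder DOI, not a real data set.
print(doi_to_url("doi:10.1234/example-dataset"))
# prints https://doi.org/10.1234/example-dataset
```

Because the link goes through the resolver rather than directly to a web page, it can keep working even if the object it points to is moved.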
Personal Data
Data that can directly or indirectly identify a person. Personal data can be divided into:
- Non-sensitive personal data, such as CV, address, date of birth, tax records.
- Information about criminal matters, such as convictions or an address in prison.
- Sensitive personal data, such as race, political views, health information, sexual orientation.
Preservation
Data preservation refers to the long-term storage of data, typically beyond the end date of the project. Data preservation includes deciding on activities and strategies to ensure the long-term accessibility, integrity, and usability of data.
Policy
A policy is a formal statement or document that outlines principles, rules, guidelines, procedures, or directives established by an institution. Policies define expectations, standards, and requirements for individuals and groups at that institution. At UCPH, the UCPH Policy for Research Data Management defines what is expected of students and employees when managing research data.
Project Members
Researchers and students who contribute to the research conducted in a project.
ReadMe File
A simple text document (often named ReadMe.txt, or ReadMe.md) that is associated with a dataset, software project, or any collection of files. The purpose of a ReadMe file is to provide essential information about the contents, usage, and context of the data or project. It serves as a quick reference guide for your future self and for others, helping them understand and navigate the dataset or software.
Researcher
Anyone conducting or supporting research activities at UCPH, including scientific staff, PhD students, and visiting and affiliate researchers, among others.
Research Data
Physical material and digital data collected, observed, generated, created or reused as part of research activities conducted at UCPH. This includes any material and data that form the basis of the research, such as specimens, laboratory notebooks, interviews, texts and literature, digital raw data, audio/video recordings and computer code, as well as the detailed records of these materials and data that comprise the basis for the analysis underlying the results, such as clinical records, sequence data, spreadsheets, interview files etc.
Research Data Management (RDM)
A collective term for the planning, collecting, processing, storing, securing, sharing and archiving of primary material and research data.
Research Manager
In the UCPH Policy for Research Data Management, a research manager is defined as a researcher who is the lead researcher on a research project (principal investigator) and/or heads a research unit and/or has been given similar responsibilities by delegation.
Research Project
A project in which a researcher/student or a team of researchers/students pursues answers to research questions by collecting information, analysing it, and drawing conclusions from the processed information.
Research Results
Conclusions made from research data.
Risk Assessment
An analysis to assess risks to data confidentiality, integrity and accessibility. The risk assessment can be used to map which safety requirements must be complied with and which precautions must be taken to prevent breaches in confidentiality and loss of data (integrity).
Sharing
Identifying whether data are shared internally such as with other members of the research team, externally with other researchers, externally to meet funder requirements, openly to the public, or something in between.
Simulation Data
Data generated by computer models to simulate real conditions, such as economic or meteorological models.
Storage
Locations and IT infrastructure where the digital files or analog specimens are kept during and after the project. Examples are a local computer, a department shared drive and cloud-based storage.
Supervisor
An experienced researcher providing guidance to a less experienced researcher or student.
Third Party
An individual, company or public body who is not employed at UCPH, and who has not entered into a collaboration agreement in which UCPH takes part.
Versioning
A process of recording and organising changes to documents, papers, books, catalogues, computer programmes, code, websites and much more. Version control helps you go back in time to see exactly who wrote what, on which day and at which time, and what changes were made.
Qualitative Data
Data that are non-numerical and describe qualities or characteristics. Examples are interview transcripts, responses to open-ended questionnaires, photographs or audio files.
Quantitative Data
Data that can be counted or compared on a numerical scale, such as measurements made by laboratory equipment, counts of the daily number of visitors to an exhibition, and survey data on income and spending.
This Glossary was partly compiled from the text in this course and from:
- Goben, A. and Griffin, T. (2019) In Aggregate: Trends, Needs, and Opportunities from Research Data Management Surveys. College & Research Libraries: 903-924.
- n.n. (2021) Glossary. Research Management and GDPR. https://kunet.ku.dk/work-areas/research/data/glossary/Pages/default.aspx. Accessed August 2024.
- University of Copenhagen. (2022). Policy for Research Data Management (Version 1). Copenhagen: https://kunet.ku.dk/work-areas/research/data/Documents/UCPHPolicyforResearchDataManagement2022-EN.pdf. Accessed August 2024.
____________________________________________________________
Published in 2024