

  • Archive: organization, place, or collection that stores information for long-term preservation so that it can be accessed and reused by a designated community.


  • Backup: practice of keeping additional copies of your research data in separate physical or cloud locations, apart from the files in your working storage.


  • Code-based data processing & analysis: giving instructions to a computer to convert raw data into machine-readable form and subsequently reusing the data to discover useful information, inform conclusions, and support decision-making.
  • Controlled vocabulary: providing a consistent way to describe data, where the terms consist of standardized and organized arrangements of words and phrases to describe domain-specific concepts. The terms are usually displayed in an alphabetical list of terms and can include subject headings, thesauri or glossaries, ontologies, and taxonomies.
  • Custom code & scripts: text-based commands used via a command-line interface.
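
As an illustration of the controlled-vocabulary entry above, the sketch below validates a metadata field against a small vocabulary; the terms and the function name are hypothetical examples, not a real standard:

```python
# Minimal sketch: validating a metadata field against a controlled vocabulary.
# The vocabulary terms below are invented examples, not a published standard.
ORGANISM_VOCABULARY = {
    "Homo sapiens",
    "Mus musculus",
    "Drosophila melanogaster",
}

def validate_term(term: str, vocabulary: set) -> bool:
    """Return True if the term exactly matches an entry in the vocabulary."""
    return term in vocabulary

print(validate_term("Mus musculus", ORGANISM_VOCABULARY))  # standardized term: accepted
print(validate_term("mouse", ORGANISM_VOCABULARY))         # free-text synonym: rejected
```

Requiring exact matches is what makes descriptions consistent across datasets; synonyms are handled by mapping them to the preferred term, not by accepting them directly.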


  • Data anonymization: processing of personal data where direct and indirect personal identifiers are completely and irreversibly removed.
  • Data interoperability: ability of a dataset to work with other datasets or systems without special effort on the part of the user.
  • Data Management Plan (DMP): formal and living document to describe the data, their generation and processing during the project, as well as how the data and research results will be archived afterwards to remain available, usable and comprehensible.
  • Data pseudonymization: processing of personal data where the majority of identifying fields are replaced by pseudonyms (i.e. artificial identifiers); unlike anonymization, the original identities can be restored using separately kept additional information.
  • Data repository: location where digital (and physical) objects are stored and documented, and which enables the separate publication and archiving of these objects. Data access can be either open or restricted to a group of users.
  • Digital preservation: act of ensuring continued findability and access to digital material and maintaining it independently understandable and reusable by a designated community, and with evidence supporting its authenticity, for as long as necessary.
  • Dublin Core: domain-agnostic, basic, widely used metadata standard.
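
The difference between the anonymization and pseudonymization entries above can be sketched in code; the records and the pseudonym scheme here are hypothetical:

```python
# Minimal sketch: pseudonymization keeps a separate key table that can
# re-identify subjects; anonymization would discard identifiers irreversibly.
records = [
    {"name": "Alice Example", "age": 34},  # hypothetical personal data
    {"name": "Bob Example", "age": 41},
]

key_table = {}      # pseudonym -> original identifier, to be stored separately
pseudonymized = []
for i, record in enumerate(records, start=1):
    pseudonym = f"SUBJ-{i:03d}"            # artificial identifier
    key_table[pseudonym] = record["name"]
    pseudonymized.append({"subject_id": pseudonym, "age": record["age"]})

print(pseudonymized)
```

As long as the key table exists, the data remain personal data in the legal sense; deleting it (together with any indirect identifiers) is what moves the dataset toward anonymization.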


  • Electronic Lab Notebook (ELN): software meant to document experiments and research data.


  • FAIR data principles: set of guiding principles stating that research data should be Findable, Accessible, Interoperable, and Reusable, by humans and machines alike.


  • git-annex: distributed file synchronization system that allows managing large files with git without storing the file contents in the git repository itself.
  • Good Scientific Practice (GSP): principles, values and standards of behavior and practice that must be achieved and maintained in the delivery of work activities, the provision of care and personal conduct.


  • Identifiable natural person: person who can be identified, directly or indirectly, with the help of an identifier (e.g. an online identifier).
  • Informed consent: process by which a subject voluntarily confirms their willingness to participate in a particular trial, after having been informed of all aspects of the trial that are relevant to the subject’s decision to participate. Informed consent is documented by means of a written, signed and dated informed consent form.


  • Literate programming: code intermingled with a narrative description of the scientific analysis.


  • Metadata: data about data, i.e. structured information that describes a resource and makes it easier to find, understand, and reuse.
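
As a concrete illustration of "data about data", the sketch below attaches Dublin Core-style descriptive fields to a hypothetical dataset; the element names follow the Dublin Core element set, but all values are invented:

```python
# Minimal sketch: a metadata record describing a hypothetical data file.
# Field names follow the Dublin Core element set; the values are examples only.
metadata = {
    "title": "Example measurement series",
    "creator": "Jane Doe",
    "date": "2024-01-15",
    "format": "text/csv",
    "identifier": "local-dataset-001",  # ideally a PID such as a DOI
}

for element, value in metadata.items():
    print(f"{element}: {value}")
```

Recording such fields alongside the data, ideally with terms from a controlled vocabulary and a persistent identifier, is what makes a dataset findable and reusable by others.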


  • Narrative description: detailed, written description of computational analyses.


  • Ontology: a list of terms with curated textual definitions and persistent identifiers. The terms are arranged hierarchically from general to more specific and have defined relationships to other terms within the ontology and to external resources (e.g., synonyms, cross-references).
  • Open Science: practice of making publicly funded research results (e.g. publications, research data, processes) publicly and freely accessible in digital formats, under terms that enable reuse, redistribution, and reproduction of the research and its underlying data and methods.
  • Open source software: software with source code that anyone can inspect, modify, and enhance.


  • Persistent Identifier (PID): globally unique, actionable, and machine-resolvable string that acts as a long-lasting reference to a digital object (e.g. a dataset).
  • Personal data: any information associated with an identified or identifiable natural person (i.e. ‘data subject’).
  • Preregistration: practice of documenting your research plan (i.e. research question and study design) before conducting a scientific investigation, and depositing that plan in a read-only public repository.
  • PUBLISSO – Repository Finder: ZB MED’s curated selection of repositories from re3data.


  • RDMO4Life: dedicated version of RDMO for research institutions working in the life sciences, with the possibility of customizing questionnaires to subject- or project-specific needs.
  • Research Data Management Organiser (RDMO): open-source web application that supports the structured and collective planning and implementation of research data management and additionally enables the textual output of a data management plan.


  • Storage: act of keeping your research data in a secure location that you can access readily.


  • Version Control Systems (VCSs): software tools that help teams manage changes to file(s) over time.


  • Workflow management system (WMS): software tool designed to help streamline routine processes for optimal efficiency.

Further resources