
Digital Preservation



Digital preservation means taking certain measures to ensure that digital material can be found and can be accessed in the long term (“long-term accessibility of data”). It aims to preserve information in a way that is understandable and reusable for a specific community and to prove its authenticity.

Digital preservation for researchers

The sustainable handling of data by researchers naturally facilitates the long-term accessibility of data. Best practice methods are:

  • Cleaning data / data structures - see also: Data Organisation
  • Validating data - see also: Data Quality Control
  • Documenting data with metadata and context information to ensure reusability: commenting, adding descriptive, administrative and technical metadata, and assigning a user license.
  • Using well-known open file formats during the project phase (see below) or transferring data into reusable file formats (document whether each file is the original or a derivative)
  • Storing data following the 3-2-1 rule:
    • Keeping 3 copies of any important file
    • Storing files on 2 different media types
    • Keeping at least 1 copy off site.
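
The 3-2-1 rule can be checked mechanically once the copies of a file are listed. The following is a minimal sketch in Python; the `Copy` class, its fields and the example locations are hypothetical and only serve as illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a file (hypothetical record, values are illustrative)."""
    location: str    # where the copy lives, e.g. "laptop"
    media_type: str  # kind of storage medium, e.g. "ssd", "hdd", "cloud"
    off_site: bool   # True if the copy is kept away from the primary site


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                          # 3 copies
    enough_media = len({c.media_type for c in copies}) >= 2   # 2 media types
    has_off_site = any(c.off_site for c in copies)            # 1 copy off site
    return enough_copies and enough_media and has_off_site


copies = [
    Copy("laptop", "ssd", off_site=False),
    Copy("office NAS", "hdd", off_site=False),
    Copy("institutional cloud storage", "cloud", off_site=True),
]
print(satisfies_3_2_1(copies))  # True
```

Removing any one of the three example copies makes the check fail, because fewer than three copies remain.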

Data selection

To make a well-founded decision on which data to keep, we recommend reading the how-to guide of the Digital Curation Centre in Edinburgh (DCC, 2014). The suggested steps are:

  • Step 1: Identify purposes that the data could fulfill: consider the purpose or ‘reuse case’ of your data, including reuse outside your research group.
  • Step 2: Identify data that must be kept: consider legal or policy compliance risks, as well as funder requirements.
  • Step 3: Identify data that should be kept because it may have long-term value.
  • Step 4: Weigh up the costs and identify any need for external advice in case of a budget shortfall.
  • Step 5: Complete the data appraisal, i.e. list what data must, should or could be kept to fulfill which potential reuse purposes. Summarize any actions needed to prepare the data for deposit, or the justification for not keeping it.

Making your research data available in recommended file formats, in addition to the original software format, greatly supports the reusability and long-term accessibility of your data. Recommended file formats are:

  • Open rather than proprietary (examples of open file formats)
  • Well-documented
  • In widespread use
  • Simple (e.g. csv rather than xlsx)
  • Text-based (i.e. any file you can open with a text editor and read) rather than binary (e.g. txt files rather than doc files)
  • Exportable to / unpackable into an open format (e.g. xlsx, docx, etc. can be unpacked into folders of xml files)
  • Machine-readable

For biomaterial data, recommended formats are CSV, TXT and XML.
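
The point about unpacking can be illustrated with Python's standard `zipfile` module, since xlsx and docx files are ZIP containers holding XML files. This is a minimal sketch, not a recommended workflow; the function name `unpack_office_file` is hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_office_file(source: Path, target_dir: Path) -> list[str]:
    """Unpack a ZIP-based office file (xlsx, docx, ...) into target_dir.

    Returns the names of the archive members, which are mostly XML files.
    """
    with zipfile.ZipFile(source) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

Calling `unpack_office_file(Path("results.xlsx"), Path("results_unpacked"))` on a hypothetical spreadsheet would write its XML parts into the folder `results_unpacked`. The unpacked folder is a derivative, so the original file should be documented alongside it.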

Digital preservation for repository operators

Specific preservation measures depend on the digital objects, needs of the user community, and various other conditions. Repositories usually contain publications as files, making file format identification and validation relevant.

Bitstream preservation

Preservation at the bitstream level is the basis of digital preservation. It covers, for example:

  • Checking checksums of transferred files upon receiving them (or generating file checksums) and conducting regular fixity checks
  • Redundant storage of data
  • Generating backups (e.g. offline backups of the underlying repository database)
  • Strategies for updating storage media (according to e.g. server lifetime)
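
Checksum generation and regular fixity checks can be sketched with Python's standard `hashlib` module. The example assumes SHA-256 and a simple manifest that maps relative file paths to expected checksums; both choices, and the function names, are illustrative assumptions rather than a prescribed setup.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_check(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed.

    manifest maps relative file paths to the checksums recorded at ingest.
    """
    failures = []
    for relative_path, expected in manifest.items():
        file_path = root / relative_path
        if not file_path.is_file() or sha256_of(file_path) != expected:
            failures.append(relative_path)
    return failures
```

A manifest would be created once when files are received, for instance with `{name: sha256_of(root / name) for name in names}`, and `fixity_check` would then be run on a schedule. An empty result means that all files are present and unchanged.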

Preservation beyond bitstream

Preserving file content, that is, being able to open a file and render it correctly in software, is part of logical (Lindlar et al., 2020) or technical preservation, also called digital curation. Semantic preservation is concerned with, for example, semantic drift that affects metadata. Measures include:

  • Obtaining sufficient rights to allow e.g. format migrations, file repairs and long-term re-use such as re-publication in other infrastructures
  • File format identification, based on format-specific bit patterns, e.g. via DROID during the publication process
  • File format validation, based on format specifications, e.g. using XML validators during the publication process (for validators see also COPTR)
  • Automated extraction of technical metadata from files (see also COPTR)
  • Virus scans
  • Replacing files with problems (e.g. invalid files) as early as possible
  • Obtaining sufficient metadata
    • storing unique file identifiers, machine-readable version information and relations between files
    • indexing rights information in a machine-usable way
    • indexing the output of identification, validation, metadata extraction and virus checks
    • preserving and updating descriptive metadata, according to user community needs
  • Migrating at-risk files, e.g. files with obsolete formats (see also migration tools)
  • Providing versioning of files and publications and the possibility to roll back to earlier versions
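
To illustrate identification by format-specific bit patterns and a basic validation step, the following Python sketch compares the first bytes of a file with a few well-known signatures and checks whether an XML file is well-formed. It is a deliberately simplified stand-in and not how DROID works internally; the signature table is a small hand-picked assumption, and well-formedness is a weaker test than validation against a schema or format specification.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hand-picked leading byte signatures (illustrative, far from complete).
SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"PK\x03\x04": "ZIP container (e.g. xlsx, docx)",
    b"<?xml": "XML",
}


def identify_format(path: Path) -> str:
    """Guess a file format from its leading bytes, ignoring the file extension."""
    with path.open("rb") as handle:
        header = handle.read(16)
    for signature, name in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return "unknown"


def is_well_formed_xml(path: Path) -> bool:
    """Return True if the file parses as XML (well-formedness check only)."""
    try:
        with path.open("rb") as handle:
            ET.parse(handle)
    except ET.ParseError:
        return False
    return True
```

XML files without an XML declaration, or with a byte order mark, are reported as "unknown" by this sketch, which is one reason why production workflows rely on maintained signature registries and dedicated tools.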

Many digital preservation criteria that apply to repositories also appear in the certification criteria of the CoreTrustSeal and the nestor seal (CoreTrustSeal Standards and Certification Board, 2022; Harmsen et al., 2013).


  1. DCC. (2014). Five steps to decide what data to keep: a checklist for appraising research data v.1. Edinburgh: Digital Curation Centre.
  2. Lindlar, M., Rudnik, P., Horton, L., & Jones, S. (2020). "You say potato, I say potato" - Mapping Digital Preservation and Research Data Management Concepts towards Collective Curation and Preservation Strategies. 15(1).
  3. CoreTrustSeal Standards and Certification Board. (2022). CoreTrustSeal Requirements 2023-2025 (Version V01.00). Zenodo.
  4. Harmsen, H., Keitel, C., Schmidt, C., Schoger, A., Schrimpf, S., Stürzlinger, M., & Wolf, S. (2013). Explanatory notes on the nestor Seal for Trustworthy Digital Archives. nestor Certification Working Group (Vol. 17).