Data Repositories
A data repository is a location where digital objects are stored and documented, and which enables these objects to be published and archived separately. Access to the data can be either open or restricted to a group of users.
Benefits of publishing data in a repository
- Back-up (i.e. the practice of keeping additional copies of your research data in physical or cloud locations separate from your files in storage).
- Increased data discoverability and accessibility, and thus potential data reuse.
- Increased visibility and citations, as repositories can be harvested by search engines.
Criteria to select a trusted repository
The following criteria can help you select a repository in which to publish your data:
- Is the repository certified according to CoreTrustSeal (CTS), nestor seal/DIN 31644 or ISO 16363? Find these repositories via re3data or the websites of the various certification initiatives:
- Find CTS-certified repositories on re3data
- Find nestor/DIN 31644-certified repositories on re3data
- Find ISO 16363-certified repositories on re3data
- Find a CTS-certified repository on the CTS-Website
- Find a nestor/DIN 31644-certified repository on the nestor-Website
- Find an ISO 16363-certified repository on the CTAB-Website
- Is the repository among those preferred by a funder and/or endorsed by the international research community? In general, community-endorsed repositories should be well known, widely used and have a good continuity plan. If they have existed for a long time and contain a lot of published data, their long-term sustainability is likely covered.
- Does the repository provide:
- Open Access to non-sensitive data?
- The terms of use and licenses of the data?
- A policy to help researchers determine whether their data needs will be met?
- Does the repository use a Persistent Identifier (PID) system such as assigning a Digital Object Identifier (DOI) to submitted datasets?
- Does the repository offer the possibility to integrate all metadata relevant for finding and specifying your publication?
- Further criteria: costs, repository size, data upload restriction(s), landing pages, guidance on data citation.
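To illustrate the persistent identifier criterion above: a DOI becomes a stable, clickable link when it is appended to the `https://doi.org/` resolver. The following is a minimal sketch, not an official tool. The function name `doi_to_url` and the loose pattern check are assumptions made for this example, and the pattern does not validate the full DOI syntax.

```python
import re

# Loose, illustrative pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an assumption for the sketch, not a complete DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ link for a DOI written in a common notation."""
    cleaned = doi.strip()
    # Accept "doi:10.x/y" and full resolver URLs as input.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# DOI taken from the reference list of this page.
print(doi_to_url("doi:10.5281/ZENODO.6674301"))
```

Because the link is built from the identifier rather than from the repository's own web address, it should keep working even if the landing page moves.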
Order of preference for selecting a repository
- A certified repository (CoreTrustSeal, nestor/DIN 31644, ISO 16363).
- A well-established repository in your discipline that offers a specific scope (e.g. supporting a specific metadata schema).
- One of the repositories recommended by your funding organization or the funding program.
- Your institution’s repository (if available).
- A cost-free interdisciplinary repository (e.g. Figshare, Zenodo).
- Another repository that you can search for using the above-mentioned criteria in a repository finder.
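The order of preference above can be read as a simple ranking rule: a candidate takes the rank of the first item in the list that it satisfies. The sketch below shows one way to express this. The `Repository` class, its field names and the two "hypothetical" candidates are invented for illustration; only the Zenodo entry reflects what this page states (a cost-free interdisciplinary repository).

```python
from dataclasses import dataclass


@dataclass
class Repository:
    name: str
    certified: bool = False               # CoreTrustSeal, nestor/DIN 31644 or ISO 16363
    discipline_specific: bool = False     # well established in your discipline
    funder_recommended: bool = False
    institutional: bool = False
    free_interdisciplinary: bool = False  # e.g. Figshare, Zenodo


def preference_rank(repo: Repository) -> int:
    """Return 1 (most preferred) to 6 (fallback), following the order above."""
    checks = [
        repo.certified,
        repo.discipline_specific,
        repo.funder_recommended,
        repo.institutional,
        repo.free_interdisciplinary,
    ]
    for rank, matched in enumerate(checks, start=1):
        if matched:
            return rank
    return 6  # any other repository found through a repository finder


def sort_by_preference(candidates):
    # sorted() is stable, so candidates with equal rank keep their input order.
    return sorted(candidates, key=preference_rank)


candidates = [
    Repository("Zenodo", free_interdisciplinary=True),
    Repository("Hypothetical institutional repository", institutional=True),
    Repository("Hypothetical certified archive", certified=True),
]
for repo in sort_by_preference(candidates):
    print(preference_rank(repo), repo.name)
```

A ranking like this is only a starting point. The selection criteria listed earlier (licenses, metadata support, costs) still need to be checked for the top candidates.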
Well-established repositories for data deposition in microbiology
The table below lists well-established repositories in microbiology. For each repository, the FAIRsharing and re3data pages are linked. On the FAIRsharing page, you will find information such as which journals endorse the repository (under “Collections & Recommendations” and then “In Policies”). On the re3data page, you will find information such as the above-mentioned criteria to select a trusted repository. A slash (/) means that no entry is available.
| Data type | Data repository | FAIRsharing | re3data |
|---|---|---|---|
| All research output | Zenodo | FAIRsharing | re3data |
| Crystallographic data | Cambridge Structural Database (CSD) | FAIRsharing | re3data |
| Geospatial data | Malaria Atlas Project (MAP) | / | re3data |
| Image data | BioImage Archive | FAIRsharing | re3data |
| | Cell Image Library | FAIRsharing | re3data |
| Linked genotype and phenotype data | European Genome-phenome Archive (EGA) | FAIRsharing | re3data |
| Macromolecular structures | Worldwide Protein Data Bank (wwPDB) | FAIRsharing | re3data |
| | RCSB Protein Data Bank (RCSB PDB) | FAIRsharing | re3data |
| | Protein Data Bank Japan (PDBj) | FAIRsharing | re3data |
| | Protein Data Bank in Europe (PDBe) | FAIRsharing | re3data |
| | Biological Magnetic Resonance Data Bank (BMRB) | FAIRsharing | re3data |
| Electron microscopy data | Electron Microscopy Data Bank (EMDB) | FAIRsharing | re3data |
| | Electron Microscopy Public Image Archive (EMPIAR) | FAIRsharing | re3data |
| Microbiome data | MGnify | FAIRsharing | re3data |
| Nucleic acid sequences | GenBank | FAIRsharing | re3data |
| | DNA Data Bank of Japan (DDBJ) | FAIRsharing | re3data |
| | European Nucleotide Archive (ENA) | FAIRsharing | re3data |
| | Sequence Read Archive (SRA) | FAIRsharing | re3data |
| | Genome Sequence Archive (GSA) | FAIRsharing | re3data |
| Genetic polymorphism | European Variation Archive (EVA) | FAIRsharing | re3data |
| Functional genomics | Gene Expression Omnibus (GEO) | FAIRsharing | re3data |
| | ArrayExpress | FAIRsharing | re3data |
| | GenomeRNAi | / | re3data |
| | European Genome-phenome Archive (EGA) | FAIRsharing | re3data |
| | Database of Interacting Proteins (DIP) | FAIRsharing | re3data |
| | IntAct | FAIRsharing | re3data |
| | Japanese Genotype-phenotype Archive (JGA) | FAIRsharing | re3data |
| | PubChem | FAIRsharing | re3data |
| | Genomic Expression Archive (GEA) | FAIRsharing | re3data |
| | Genome-Wide Association Studies (GWAS) Catalog | FAIRsharing | / |
| Protein sequences | UniProt | FAIRsharing | re3data |
| Proteomes | PRoteomics IDEntifications (PRIDE) Archive database | FAIRsharing | re3data |
| Quantitative and predictive food microbiology | ComBase | / | re3data |
| Scientific texts and data | PUBLISSO – Repository for Life Sciences | FAIRsharing | re3data |
| Species interaction data | Global Biotic Interactions (GloBI) | / | / |
| Standardized bacterial information | Bacterial Diversity Metadatabase (BacDive) | FAIRsharing | re3data |
| Vertebrate-virus network | VIRION | / | / |
Data publishing in the PUBLISSO – Repository for Life Sciences (FRL)
Repository details
- Data types: scientific texts and research data from the fields of medicine, health, nutritional, environmental and agricultural sciences
- Open Access: yes
- Terms of use & License: yes
- PID system: yes (DOI)
- Certification or repository standard: no
- Policy: yes (see “Data Policy”, only in German)
- Archiving: yes
- Costs: free of charge
- Data curation & quality control: no
- Data guarantee: availability for at least 10 years and transfer to a long-term archive
- Recommended publication formats for research data: see “Recommended preservation formats for research data” on this page
Steps to deposit data in the FRL
- Contact forschungsdaten@zbmed.de.
- You will receive:
- The terms of use that you need to sign and send back.
- A record sheet that you need to fill in with your metadata and send back.
- Your metadata will be checked, entered into the FRL and assigned a DOI.
- Send your data and supplementary materials to the FRL team.
- Your data will be added to the metadata record already created in the FRL.
- You will receive a publication notice with the DOI.
Benefits
- The FRL provides Open Access to its data (in specific cases, you can set an embargo period of up to 24 months).
- There are no charges to publish, archive or use scientific texts and research data from the FRL.
- The FRL supports the FAIR data principles (e.g. by assigning DOIs, by offering life science-specific metadata).
- You can publish your metadata in the FRL and receive a DOI in advance of publishing your data.
- You can publish large datasets in the FRL.
- The FRL is permanently accessible online.
- The FRL is indexed in re3data.
- The FRL is findable via Google search and indexed in BASE, DataCite Search and LIVIVO - ZB MED Search Portal for Life Sciences.
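The embargo option mentioned in the first benefit (up to 24 months) comes down to simple date arithmetic. The sketch below computes the latest possible embargo end date. It assumes that the embargo is counted in calendar months from the publication date, which this page does not specify; the function names are likewise made up for the example.

```python
import calendar
from datetime import date

MAX_EMBARGO_MONTHS = 24  # upper limit stated above


def add_months(start: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the last day of the month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def latest_embargo_end(publication_date: date) -> date:
    # Assumption: the embargo starts on the publication date.
    return add_months(publication_date, MAX_EMBARGO_MONTHS)


print(latest_embargo_end(date(2024, 9, 30)))
```

For the exact rules that apply to your dataset, ask the FRL team when you submit your record sheet.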
Drawbacks
- The FRL is not suitable for sensitive data.
- The FRL can be browsed only in German.
Data publishing in Zenodo
Repository details
- Data types: all research outputs, positive and negative results
- Open Access: yes
- Terms of use & License: yes
- PID system: yes (DOI)
- Certification or repository standard: no
- Policy: yes (see Zenodo's General Policies)
- Archiving: yes
- Costs: free of charge
- Data curation & quality control: by the repository itself or by communities
- Data guarantee: availability for the lifetime of the repository
- Recommended publication formats for research data: any file format
Steps to deposit data in Zenodo
- Upload files.
- Describe your content so others can find it.
- Publish your content.
For more details, see this guide.
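The three steps above can also be scripted. The sketch below only builds the HTTP requests and does not send anything. The endpoint paths, the bearer-token header and the metadata field names (`title`, `upload_type`, `description`, `creators`) are assumptions based on Zenodo's deposit REST API as commonly documented, so please check them against the current developer documentation before use. The helper function names are invented for this example.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumed endpoint of Zenodo's deposit API; verify against the current developer docs.
API_ROOT = "https://zenodo.org/api/deposit/depositions"


def _auth(token, content_type):
    # A personal access token is sent as a bearer token.
    return {"Authorization": f"Bearer {token}", "Content-Type": content_type}


def create_draft_request(token):
    """Open an empty draft; the JSON response is expected to hold its id and links.bucket."""
    return Request(API_ROOT, data=b"{}",
                   headers=_auth(token, "application/json"), method="POST")


def upload_file_request(token, bucket_url, filename, content):
    """Step 1, upload files: PUT the raw bytes into the draft's file bucket."""
    url = f"{bucket_url}/{quote(filename)}"
    return Request(url, data=content,
                   headers=_auth(token, "application/octet-stream"), method="PUT")


def describe_request(token, deposition_id, title, description, creators):
    """Step 2, describe your content: attach a minimal metadata record."""
    body = {
        "metadata": {
            "title": title,
            "upload_type": "dataset",
            "description": description,
            "creators": [{"name": name} for name in creators],
        }
    }
    return Request(f"{API_ROOT}/{deposition_id}", data=json.dumps(body).encode("utf-8"),
                   headers=_auth(token, "application/json"), method="PUT")


def publish_request(token, deposition_id):
    """Step 3, publish your content."""
    return Request(f"{API_ROOT}/{deposition_id}/actions/publish",
                   headers={"Authorization": f"Bearer {token}"}, method="POST")


# Nothing is sent here; pass a request to urllib.request.urlopen() to execute it.
req = publish_request("YOUR_TOKEN", 12345)
print(req.get_method(), req.full_url)
```

Since a published record receives a DOI, it is worth rehearsing such a script against Zenodo's sandbox instance (sandbox.zenodo.org) before running it on the production site.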
Benefits
- Zenodo provides Open Access to its data (files may be deposited under closed, open or embargoed access).
- There are no charges to publish, archive or use content.
- Zenodo supports the FAIR data principles (e.g. by assigning DOIs).
- Zenodo is permanently accessible online.
- Zenodo is indexed in re3data.
- Zenodo is findable via Google search and indexed in BASE and DataCite Search.
Drawbacks
- Zenodo may not be suitable for sensitive data that must remain in its country of origin.
- Zenodo has a size limit of 50 GB per record. Larger uploads must be requested from Zenodo.
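Before uploading, you can check whether a dataset stays within the 50 GB per-record limit mentioned above. This is a minimal sketch: it assumes that the limit is counted in decimal gigabytes (10^9 bytes), which this page does not specify, and the function names are made up for the example.

```python
from pathlib import Path

# 50 GB per record as stated above; decimal gigabytes are an assumption.
RECORD_LIMIT_BYTES = 50 * 10**9


def folder_size_bytes(folder):
    """Sum the sizes of all regular files below a folder."""
    return sum(f.stat().st_size for f in Path(folder).rglob("*") if f.is_file())


def fits_in_one_record(sizes_in_bytes, limit=RECORD_LIMIT_BYTES):
    """Return True if the files together stay within the per-record limit."""
    return sum(sizes_in_bytes) <= limit


# Two files of 30 GB and 25 GB exceed the limit together.
print(fits_in_one_record([30 * 10**9, 25 * 10**9]))
```

If the check fails, the options are to request a larger quota, as noted above, or to choose a repository designed for large datasets.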
Repository finders
- To find any repository:
- The DFG-funded registry of research data repositories (re3data.org, search results for “Microbiology”)
- The RDA-endorsed FAIRsharing.org (search results for Databases in Microbiology)
- To find Open Access repositories in the life sciences: ELIXIR Deposition Databases for Biomolecular Data
- To find a suitable interdisciplinary repository: Generalist Repository Comparison Chart
- To find Open Access repositories: OpenDOAR: Directory of Open Access Repositories
Further Resources
- Data Deposition and Standardization help page of the journal Nucleic Acids Research (NAR, Oxford Academic).
Get Help
If you have any further questions about the management and analysis of your microbial research data, please contact us: helpdesk@nfdi4microbiota.de (by emailing us you agree to the privacy policy on our website: Contact)
References
- Engelhardt, C., Biernacka, K., Coffey, A., Cornet, R., Danciu, A., Demchenko, Y., Downes, S., Erdmann, C., Garbuglia, F., Germer, K., Helbig, K., Hellström, M., Hettne, K., Hibbert, D., Jetten, M., Karimova, Y., Kryger Hansen, K., Kuusniemi, M. E., Letizia, V., … Zhou, B. (2022). D7.4 How to be FAIR with your data. A teaching and training handbook for higher education institutions (V1.2.1). Zenodo. https://doi.org/10.5281/ZENODO.6674301
- Lindlar, M., Rudnik, P., Horton, L., & Jones, S. (2020). “You say potato, I say potato” - Mapping Digital Preservation and Research Data Management Concepts towards Collective Curation and Preservation Strategies. https://doi.org/10.5281/ZENODO.3672773
- Rathmann, T., et al. (2021, October). Workshop on Research Data Management. FoDaKo and ZB MED - Information Centre for Life Sciences. Google Slides.
Citation
National Research Data Infrastructure for Microbiota Research (NFDI4Microbiota). (2024, September 30). Data Repositories. NFDI4Microbiota Knowledge Base.