Frequently Asked Questions
What types of data does Dryad accept?
Dryad accepts all research data. However, this service is intended for complete, re-usable, open research datasets.
Most types of files can be submitted (e.g., text, spreadsheets, video, photographs, code) including compressed archives of multiple files. View additional guidance on preservation-friendly file types.
- Dryad does not accept submissions that contain personally identifiable human subject information. Human subjects data must be properly anonymized and prepared under applicable legal and ethical guidelines. Please see additional guidance on human subjects data.
- Dryad does not accept any files with licensing terms that are incompatible with the Creative Commons Zero waiver. For more information, please see Why Does Dryad Use CC0?
- For software scripts and snapshots of software source code, files can be uploaded via Dryad and published at Zenodo, which allows public software deposits with version control for the ongoing maintenance of software packages. If you are only seeking to store code, software, and/or supplemental information please visit Zenodo.
What are the size limits?
There is a limit of 300GB per data publication uploaded through the web interface. We can accept larger submissions, but the submitter needs to contact us for assistance. We recommend that individual files should not exceed 10GB. This ensures files are easily accessed and downloaded by Dryad users.
How much does it cost?
Dryad is a nonprofit organization that provides long-term access to its contents at no cost to users. We are able to provide free access to data due to financial support from members and data submitters. Dryad's Data Publishing Charges (DPCs) are designed to recover the core costs of curating and preserving data.
Waivers are granted for submissions originating from researchers based in countries classified by the World Bank as low-income or lower-middle-income economies.
The base DPC per data submission is $120 USD. DPCs are invoiced upon curator approval/publication, unless the submitter is based at a member institution (determined by login credentials), an associated journal or publisher has an agreement with Dryad to sponsor the DPC (see list), or the submitter is based in a fee-waiver country (see above).
For submissions without a sponsor or waiver, Dryad charges excess storage fees for data totaling over 50GB. For data packages in excess of 50GB, submitters will be charged $50 for each additional 10GB, or part thereof (submissions between 50 and 60GB = $50 USD, between 60 and 70GB = $100 USD, and so on).
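The fee rule above can be worked through with a short calculation. The following is only a sketch for submissions without a sponsor or waiver: the function and constant names are hypothetical, and it assumes that a dataset of exactly 60GB falls in the first overage step, which the text does not state explicitly.

```python
import math

BASE_DPC_USD = 120      # base charge per data submission
FREE_STORAGE_GB = 50    # storage covered before excess fees apply
OVERAGE_STEP_GB = 10    # excess is billed per 10GB, or part thereof
OVERAGE_FEE_USD = 50    # charge for each overage step


def excess_storage_fee(size_gb):
    """Return the excess storage fee in USD for a dataset of size_gb."""
    if size_gb <= FREE_STORAGE_GB:
        return 0
    # Round partial steps up: 55GB counts as one step, 65GB as two.
    # Assumption: exactly 60GB is still one step.
    steps = math.ceil((size_gb - FREE_STORAGE_GB) / OVERAGE_STEP_GB)
    return steps * OVERAGE_FEE_USD


def total_charge(size_gb):
    """Return base DPC plus any excess storage fee, in USD."""
    return BASE_DPC_USD + excess_storage_fee(size_gb)
```

Under these assumptions, a 55GB submission gives an excess fee of $50 and a total of $170, and a 65GB submission gives an excess fee of $100 and a total of $220.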
How should I prepare my data files before submitting?
Assemble all data files together and create a README as a text file that describes your data files, including how to work with any files that are not in a standard file format. Where possible, data should be shared in an open file format so that proprietary software is not required to view or use the files.
- All files should be able to be opened without any passcode restrictions.
- All information needs to be in English.
- No Personal Health Information or sensitive data can be included. See tips for Human Subjects data or for sensitive species.
- Files must not contain any copyright restrictions.
- A README file that describes your data must be included.
We recommend the general use of good data practices, including descriptive names for columns and rows or file names and a logical file organization. See our recommendations for good data practices.
What should I include in my metadata?
Good metadata helps make a dataset more discoverable and reusable. The metadata should describe the data itself, rather than the study conclusions. For instance, this information should differ from that of an associated manuscript. A thorough description of the data file, the context in which the data were collected, the measurements that were made, and the quality of the data are all important. Also see our FAQ on preparing your data.
- Journal: If the dataset is associated with a manuscript, include the journal name and manuscript number.
- Title: Title should be descriptive of the dataset and relatively unique, i.e. not ‘Data files for study 12’.
- Author(s): Name, email address, institutional affiliation of main researcher(s) involved in producing the data.
- Affiliations are drawn from the Research Organization Registry (ROR)
- If you provide your co-authors' email addresses, when the dataset is published they will receive a message giving them the option to add their ORCID to the Dryad record
- Abstract: Short description of the dataset that would allow others to understand what the data is about.
- Usage notes: Information that helps others understand how to use and interpret the data, e.g. column names, units, explanations of missing data, context for the study. This information can be included in Usage Notes or as a README file uploaded with the data files. See the Cornell README template as guidance.
- Research domain: Primary research domain. Domains are drawn from the OECD Fields of Science and Technology classification.
- Keyword(s): Descriptive words that may help others discover your dataset. We recommend that you determine whether your discipline has an existing controlled vocabulary from which to choose your keywords. Please enter as many keywords as applicable.
- Methods: Any technical or methodological information that may help others to understand how the data were generated (e.g., equipment, tools, or reagents used, or procedures followed).
- Usage Notes: Any technical or methodological information that may help others determine how the data may be properly re-used, replicated, or re-analyzed.
- Funding Information: Name of the funding organization that supported creation of the resource, including applicable grant number(s).
- Related works: Use this field to indicate other resources that are associated with the data. Examples include publications, other datasets, code etc.
How do I upload my files?
Files can be uploaded from your local computer or from the cloud or remote servers via a URL. Up to 300GB can be uploaded per DOI. When using a URL, Google Drive links do not work, so please choose another mechanism. If using links from GitHub, link to the individual files rather than the repository as a whole. To confirm that files have uploaded successfully, check that all files have a size greater than 0 B.
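Before uploading from a local computer, the size checks mentioned in this FAQ can be run on a folder of files. The sketch below is not a Dryad tool: the function name is hypothetical, and it assumes that 1GB means 10**9 bytes, which this FAQ does not specify. It flags empty files, files above the recommended 10GB, and a total above the 300GB limit.

```python
from pathlib import Path

GB = 10**9                      # assumption: decimal gigabytes
MAX_FILE_BYTES = 10 * GB        # recommended maximum per file
MAX_TOTAL_BYTES = 300 * GB      # limit per DOI through the web interface


def check_upload_folder(folder):
    """Return a list of warnings for the files found under folder."""
    warnings = []
    total = 0
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        total += size
        if size == 0:
            warnings.append(f"{path.name}: file is empty (0 B)")
        elif size > MAX_FILE_BYTES:
            warnings.append(f"{path.name}: larger than the recommended 10GB")
    if total > MAX_TOTAL_BYTES:
        warnings.append("total size exceeds the 300GB web upload limit")
    return warnings
```

An empty list means none of these three conditions was found. It says nothing about whether the upload itself succeeded, so the file sizes shown on the upload page should still be checked.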
For all data files uploaded to Dryad in CSV, XLS, or XLSX format (25MB or less), a report will be automatically generated by our tabular data validator, an integration with the Frictionless Python tool. This integration allows for automated data validation, focused on the format and structure of tabular data files, prior to our curation services.
If any issues are identified, a window with instructions will appear on the "Upload Files" page and a link to a detailed report will be provided in the "Tabular Data Check" column. The report will help guide you in locating and evaluating errors in your tabular data. Any files flagged in the report will need to be removed, edited, and reuploaded prior to proceeding with the submission process.
If your files have been assessed and are acceptable, "Passed" will appear in the "Tabular Data Check" column and no report will be generated. If your files have not been checked by the validator due to size or type, the "Tabular Data Check" column will be empty. In either scenario, no changes will be required and you may proceed with the submission process.
If you have questions or require assistance, contact email@example.com.
For more information regarding the Frictionless Data project at Open Knowledge Foundation, visit this link: https://frictionlessdata.io/
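To give a sense of what format and structure checks on a tabular file can look like, here is a minimal sketch that uses only the Python standard library. It is not the Frictionless tool and does not reproduce the checks Dryad's validator runs. The function name and the three checks (blank headers, duplicate headers, rows whose cell count differs from the header) are assumptions chosen for illustration.

```python
import csv
import io


def check_csv_structure(text):
    """Return a list of structural problems found in CSV text."""
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file has no rows"]
    header = rows[0]
    seen = set()
    for position, name in enumerate(header, start=1):
        label = name.strip()
        if not label:
            problems.append(f"column {position}: blank header")
        elif label in seen:
            problems.append(f"column {position}: duplicate header '{label}'")
        seen.add(label)
    # Row numbers count the header as row 1.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: expected {len(header)} cells, found {len(row)}"
            )
    return problems
```

Running a check like this locally before upload can catch simple layout problems early, but the report generated on the "Upload Files" page remains the reference for what needs to be fixed.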
How does Dryad’s Private for Peer Review feature work?
On the final page of the submission process, we offer the option to make the dataset private during your related manuscript's peer review process. After selecting this option, you will be presented with a private, randomized URL that allows for a double-blind download of the dataset. This link can be used by the journal office to access the data files during the review period or shared with collaborators while the dataset is not yet published. When your manuscript has been accepted, you can take your dataset out of Private for Peer Review so that the Dryad team can begin the curation and publication processes. To do this, log in to Dryad and navigate to "My Datasets". Find the submission with the status "Private for Peer Review" and click 'Update'. Deselect the "Private for Peer Review" checkbox on the 'Review and Submit' page. At the bottom of this page, click 'Submit'.
When should I submit my data?
Data may be submitted and published at any time. However, if your data are associated with a journal article, there may be special considerations:
- Journals that are integrated with Dryad have specific requirements. Look up your journal to determine the proper workflow.
- If you received an invitation from a journal to submit data to Dryad, then that journal has integrated its submission process with Dryad. Please follow the instructions from the journal.
- If a delayed-release data embargo is allowed by your journal, you may request one by email (firstname.lastname@example.org) at the time of submission.
- Regardless of journal, you may choose to make your data temporarily Private for Peer Review.
What happens after I submit my data?
Dryad is a curated repository. We perform basic checks on each submission through curation. If our curators have questions or suggestions about your submission, they will contact you directly. Otherwise you will be notified when your dataset is approved.
If your data submission is private for peer review, it will not be processed by our curators until the associated manuscript is accepted.
Upon curator approval, the Dryad DOI is officially registered and, if applicable, the Data Publishing Charge (DPC) and any overage fees are invoiced.
After data publication, if you have edits, additional files, or subsequent related work, we recommend versioning your data by using the "update" link. All versions of a dataset will be accessible, but the dataset DOI will always resolve to the newest version.
What happens during curation?
Dryad has a team of curators who check every submission to ensure the validity of files and metadata. Once your data is submitted, Dryad curators perform basic checks. As an author, you can review these for your dataset. Making sure that your dataset meets all of our requirements for metadata and data files will keep the curation process as efficient and timely as possible.
If Dryad curators identify questions, problems, or areas for improvement, they will contact you directly via the email address associated with your submission. You may contact the curation team for questions or consultations at email@example.com.
How do Dryad & Zenodo partner and integrate?
Dryad formed a partnership with Zenodo, a multidisciplinary repository based at CERN, in 2019. This partnership leverages each organization's strengths: data curation at Dryad and software publication at Zenodo.
Through our integration, any software uploaded during the data submission process will be triaged and published at Zenodo. The software will not go through Dryad curation processes but it will be time-released with the publication of the Dryad dataset. Both the data and software packages will be linked and denoted on the Dryad landing page under “Related Works”.
Dryad stores a copy of all datasets in Zenodo for enhanced preservation services.
How are the datasets discoverable?
All datasets will be indexed by the Thomson-Reuters Data Citation Index, Scopus, and Google Dataset Search. Each dataset is given a unique Digital Object Identifier or DOI. Entering the DOI URL in any browser will take the user to the dataset's landing page. Dryad also provides a faceted search and browse capability for direct discovery.
Dryad has implemented the Make Data Count project recommendations. This means that views and downloads on each dataset landing page are standardized against the COUNTER Code of Practice for Research Data. Within this framework, Dryad also exposes all related citations to a dataset on the landing page. These are updated each time a new citation from an article or other source has been published.
Ways you can ensure your data publication has the broadest reach:
- Comprehensive documentation (i.e. metadata) is the key for discoverability as well as ensuring future researchers understand the data. Without thorough metadata (description of the context of the data file, the context in which the data were collected, the measurements that were made, and the quality of the data), the data cannot be found through internet searches or data indexing services, understood by fellow researchers, or effectively used. We require a few key pieces of metadata. Additional information can be included in the "Usage Notes" section of the description, or as a separate readme.txt file archived alongside the dataset files. The metadata entry form is based on fields from the DataCite schema and is broadly applicable to data from any field.
- Cite and publicize your data publication with your given DOI assigned upon submission. The recommended citation format appears on your dataset landing page.
How are the datasets preserved?
Data deposited are permanently archived and available through the California Digital Library's Merritt Repository. For a full description of the services provided by Merritt, see this document: UC3, Merritt, and Long-term preservation.
Preservation policy details include:
- Retention period: Items will be retained indefinitely
- Functional preservation: We make no promises of usability and understandability of deposited objects over time.
- File preservation: Data files are replicated with multiple copies in multiple geographic locations; metadata are backed up on a nightly basis.
- Fixity and authenticity: All data files are stored along with a SHA-256 checksum of the file content. Regular checks of files against their checksums are made. The audit process cycles continually, with a current cycle time of approximately two months.
- Succession plans: In case of closure of the repository, reasonable efforts will be made to integrate all content into suitable alternative institutional and/or subject based repositories.
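The fixity check described above compares a file against a stored SHA-256 checksum. The sketch below shows one way to compute and compare such a checksum with Python's standard hashlib module. It is not Merritt's audit code, and the function names and chunk size are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex):
    """Return True if the file content matches the recorded checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A depositor could use the same approach to record checksums before upload and compare them against files downloaded later.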
How can I update my data?
You can update your data at any time by clicking on the 'Update' link for your dataset. Any edits made will create a new version of your submission; however, the DOI will remain the same. Once the latest version has been approved by our curation team and published, only the most recent version of your dataset will be packaged and available for download via the ‘Download Dataset’ button. Prior versions can be accessed via the ‘Data Files’ section, which is organized by the date of publication.
Can I delete my data?
Data deposited in Dryad is intended to remain permanently archived and available. Removal of a deposited dataset is considered an exceptional action which should be requested and fully justified by the original contributor (e.g., if there are concerns over privacy or data ownership). To request the withdrawal of data from Dryad, contact firstname.lastname@example.org.
How can I best construct my search terms when exploring data at Dryad?
When searching in the Dryad user interface, the normal behavior is to treat each search term as being combined by AND. A search for cat dog will return only datasets that contain both cat and dog.
Search terms may have a wildcard * appended, which matches any term that begins with the characters before it.
Search terms may be negated with a minus sign. A search for cat -dog will return datasets that contain cat, but exclude any datasets that contain dog.
Phrases may be searched by using quotes. A search for "dog my cats" will only return datasets that contain this specific phrase, and not datasets that contain the individual terms.
Proximity searches may be performed, for example to find datasets containing dog within four words of cat.
Searches may also be further constrained by the filters displayed on the left side of the search results screen.
Why does Dryad use CC0?
All data in Dryad is released into the public domain under the terms of a Creative Commons Zero (CC0) waiver. CC0 was crafted specifically to reduce any legal and technical impediments, be they intentional or unintentional, to the reuse of data. Importantly, CC0 does not exempt those who reuse the data from following community norms for scholarly communication. It does not exempt researchers from the obligation to reuse the data in a way that is mindful of its limitations, nor from the obligation to cite the original data authors. CC0 facilitates the discovery, re-use, and citation of that data. For more information, see a post on Dryad’s blog as well as University of California’s Office of Scholarly Communications blog.
How do I cite my data?
As soon as you start a data submission a DOI is reserved for that dataset and is in the format https://doi.org/10.5061/dryad.XXXX. This, and the title and author information, is included in the Citation section of a published dataset and the notification emails you receive from Dryad. If you need the DOI before you submit your dataset, for instance to include in a manuscript submission, you can find the DOI on the ‘Review and Submit’ page under ‘Review Description’.