Survey data from WDS subcommittee on certifications and accreditations used by scientific data repositories
Data files
Mar 27, 2026 version files (63.15 KB total)
- README.md (6.06 KB)
- WDS_Standards_Certifications_AnswerChoices.csv (6.01 KB)
- WDS_Standards_Certifications_SurveyData.csv (51.08 KB)
Abstract
The World Data System (WDS) is committed to fostering and supporting a global network of members focused on scientific data repositories and data stewardship. As repositories navigate evolving technologies, growing data volumes and types, changing user expectations, and closer integration with scientific workflows—including linking data to scholarly publications, other data sources, and software to process the data—WDS works to address the challenges they face. A central role of WDS is to collaborate on developing best practices, including fostering policies, standards, and certifications that ensure the reliability and trustworthiness of scientific data repositories. WDS membership currently requires maintaining CoreTrustSeal (CTS) certification. However, many members (and prospective members) have pointed out that the certification process is complex and time-consuming.
In response to these concerns, the Scientific Committee of WDS formed a subcommittee to examine how extended certification pathways (alternative certification methods) can reduce barriers to entry and foster a more inclusive growth of the WDS membership. To this end, a survey was conducted to identify the certifications and accreditations currently used by data repositories. The survey was also intended to explore avenues for extended certification processes that could make WDS membership more inclusive by lowering entry barriers. The survey focused on international certifications, funder-mandated certifications, and disciplinary accreditations, emphasizing trustworthy digital repository standards rather than disciplinary interoperability standards.
The survey was broadly distributed and engaged a significant number of stakeholders worldwide. Results showed a general interest in alternative certification pathways for WDS membership and identified specific concerns about certification hurdles.
Dataset DOI: 10.5061/dryad.dz08kps8c
Description of the data and file structure
In November 2023, the Executive Committee of the World Data System (WDS) endorsed the creation of a dedicated subcommittee focused on standards and certifications. This initiative was aligned with WDS's commitment to enhancing the quality and governance of WDS member repositories. The subcommittee's task was to conduct a thorough evaluation of existing and emerging national and international standards, as well as certifications for data management practices. Surveying the WDS membership and experts in the broader data field was considered an important step in gathering detailed information about the range of standards and certifications implemented or required by governing entities and funding organizations globally. This dataset is the result of that online survey.
Files and variables
Survey data was collected between 12 June and 14 August 2024, using Qualtrics software. The data was exported in a comma-separated values (CSV) format, along with metadata provided by Qualtrics in a single data file: WDS_Standards_Certifications_SurveyData.csv.
Each row in the dataset corresponds to a survey entry. The first three rows at the top of the dataset are "Titles" as exported from Qualtrics:
- Row 1: Qualtrics numbering by survey block
- Row 2: Question wording
- Row 3: Qualtrics metadata
The data file includes several empty/blank cells; these correspond to questions intentionally not answered by the respondents or not applicable to specific respondents based on question dependencies.
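The three-row title layout described above can be handled when reading the file programmatically. The following is a minimal sketch using only the Python standard library; the function name split_qualtrics_export and the miniature sample export are hypothetical and are not part of the dataset. It also assumes that the Row 1 identifiers are unique, since they are used as dictionary keys.

```python
import csv
import io


def split_qualtrics_export(handle):
    """Separate the three title rows from the response rows.

    Returns (block_ids, wording, metadata, responses), where responses is a
    list of dicts keyed by the Row 1 identifiers (assumed to be unique).
    """
    rows = list(csv.reader(handle))
    block_ids, wording, metadata = rows[0], rows[1], rows[2]
    responses = [dict(zip(block_ids, row)) for row in rows[3:]]
    return block_ids, wording, metadata, responses


# Hypothetical miniature export, for illustration only (not real survey data).
sample = io.StringIO(
    "Q1.1,Q2.1\n"
    "Consent,Country\n"
    "metadata for Q1.1,metadata for Q2.1\n"
    "I agree,Canada\n"
    "I agree,\n"
)

block_ids, wording, metadata, responses = split_qualtrics_export(sample)

# Blank cells are read as empty strings rather than being dropped.
blank_cells = sum(1 for r in responses for value in r.values() if value == "")
print(len(responses), "responses;", blank_cells, "blank cell(s)")
```

To apply the same function to the published file, the in-memory sample could be replaced with a handle from open("WDS_Standards_Certifications_SurveyData.csv", newline=""); the appropriate text encoding is not stated in this description and may need to be specified.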
In addition to the primary survey data export, a separate file documents the fixed response options used in the survey. The dataset therefore comprises two data files:
WDS_Standards_Certifications_SurveyData.csv — primary survey response data export from Qualtrics.
WDS_Standards_Certifications_AnswerChoices.csv — codebook of questions with fixed answer choices, listing the exact response options used. This file supports consistent interpretation of responses in the main data file.
Data anonymization and privacy protections
To protect respondent privacy and to ensure the shared dataset does not contain information that could directly identify participants, the following steps were taken prior to publication:
Removal of date/time fields: Columns containing exact date/time values related to survey activity (such as survey start time, survey end time, and timestamps indicating when responses were recorded) were removed to support anonymization.
Repository name field de-identification: In the column where respondents were asked to provide the name of their repository, a small number of respondents entered personally identifying information. In those cases, the identifying text was replaced with “anonymized” in the shared dataset.
No respondent contact information is included in the public dataset.
Skip logic and display logic (Qualtrics branching)
Some questions were displayed conditionally based on responses to prior questions. As a result, blank cells may appear in the dataset when a respondent was not shown a question due to survey logic or when a question was not applicable.
The following skip/display logic applies (listed using the column letters referenced in the dataset export):
Column N — “Does your repository have CoreTrustSeal (CTS) certification?” (single choice)
If the response is “No, we never held a CTS certification”, the survey skips to Q3.5 (per survey logic).
Column O — “Will you renew your expired CoreTrustSeal (CTS) certification?” (single choice)
Displayed only if Column N = “No, our certification is currently expired”.
Column P — “Please comment on why you are not renewing CTS certification.” (text entry)
Displayed only if Column O = “No”.
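The display rules for Columns O and P can be used to tell a blank cell that results from branching apart from a question that was shown but left unanswered. The sketch below encodes only the two "Displayed only if" rules stated above; the function names and the returned labels are hypothetical, and the answer strings should be checked against the codebook file before use.

```python
# Answer text quoted from the display logic above; verify it against the codebook.
EXPIRED = "No, our certification is currently expired"


def was_displayed(column, n_answer, o_answer):
    """Return True if the question in Column O or P was shown to the respondent."""
    if column == "O":
        return n_answer == EXPIRED
    if column == "P":
        return o_answer == "No"
    raise ValueError("only Columns O and P are covered by this sketch")


def blank_reason(column, value, n_answer, o_answer):
    """Classify a cell as answered, shown but left blank, or not shown."""
    if value.strip():
        return "answered"
    if was_displayed(column, n_answer, o_answer):
        return "shown but left blank"
    return "not shown"


# A respondent who never held CTS certification is not shown Column O.
print(blank_reason("O", "", "No, we never held a CTS certification", ""))
```

Keeping the two kinds of blank separate matters when computing response rates, because respondents who were never shown a question should not be counted in the denominator.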
Questions with fixed answer choices (overview)
Fixed response options are documented in: WDS_Standards_Certifications_AnswerChoices.csv
As a brief reference, examples of fixed-choice questions include:
Column H — Consent (single choice): I agree / I do NOT agree
Column L — Country (dropdown): list as displayed in Qualtrics (see codebook)
Column M — WDS Member (single choice): Yes / No
Column N — CTS certification status (single choice): listed in codebook
Column Q — Issues experienced (multiple-select): listed in codebook
Column S — Government/funder-required certifications (single choice): Yes / No / Unsure
Column U — Extend WDS membership beyond CTS (single choice): Yes / No
Column AB — Willing to provide contact info for follow-up (single choice): Yes / No
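Because the questions above are referenced by spreadsheet column letters, a small conversion helper may be useful when the file is read with software that addresses columns by position. This is a sketch under the assumption that the letters refer to the columns of the published CSV as they appear when opened in a spreadsheet; the function name column_index is hypothetical.

```python
def column_index(letters):
    """Convert a spreadsheet column label (A, B, ..., Z, AA, AB, ...) to a zero-based index."""
    index = 0
    for char in letters.upper():
        if not "A" <= char <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        # Treat the label as a base-26 number with digits A=1 ... Z=26.
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


# Positions of some of the fixed-choice columns listed above.
for label in ("H", "L", "N", "AB"):
    print(label, column_index(label))
```

With rows read as lists by a CSV reader, row[column_index("N")] would then select the CTS certification status cell for that row.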
Code/software
Any free or open-source software that supports CSV files can be used to view and analyze the dataset.
Human subjects data
The distribution of this survey and procedures for data collection and analysis were reviewed and approved by the Institutional Review Board (IRB) at the University of Tennessee, Knoxville (UTK). The IRB approval number is UTK IRB-24-08222-XM, and the approval date is 06/28/2024, ensuring compliance with ethical standards and protection of human subjects. Consent was secured from all survey respondents, including the following statement: "As advocates for Open Science, we may share your research data with other researchers without asking for your consent again, but it will not contain information that could directly identify you. A long-term data sharing and preservation plan will be used to store and make the data publicly accessible beyond the life of the project. The data will be deposited into Dryad, an open data publishing platform committed to the open availability and routine re-use of all research data."
To protect respondents' privacy, contact information collected from willing respondents is not shared or made available. Survey questions containing information considered sensitive are not included in the shared dataset.
The survey call was posted on WDS's social media and newsletter and disseminated further through partnerships with organizations such as the Research Data Alliance (RDA), the International Science Council (ISC), and the ISC’s Committee on Data (CODATA). The survey was open for nine weeks, and data were collected using Qualtrics software.
The qualitative data obtained through Qualtrics was exported in comma-separated values (CSV) format, along with metadata provided by Qualtrics, to allow subsequent statistical analysis of patterns in the responses.
Contact information from respondents who provided consent was collected separately to maintain their privacy; this information is not shared or publicly disclosed. To further protect respondent confidentiality, sensitive questions (specifically questions 7, 8, 9, 13, and 14) are excluded from the shared dataset. All participants were informed about the use of their data and provided with options regarding their level of participation per IRB standards. Secondary use of the anonymized dataset faces minimal restrictions under the CC0 license.
