SciStarter: exploring project connections across the citizen science landscape: a social network analysis of shared volunteers
Data files
Jan 06, 2025 version files 324.20 KB
- README.md (3.63 KB)
- SciStarter_ProjectAttributeData_2020.csv (50.23 KB)
- SciStarter_SNAProjectEdgelist_2020.csv (270.34 KB)
Abstract
Research on citizen science volunteers has historically focused on single projects, but emerging research suggests many volunteers engage in multiple projects. Platforms that host thousands of projects, like SciStarter.org, enable the exploration of volunteer activity across multiple projects. To learn more about the phenomenon of multi-project engagement, we carried out a descriptive social network analysis using digital trace data depicting volunteer activity on SciStarter.org from 2017 to 2018. During this time period, our sample included 624 citizen science projects and 3,650 unique volunteers who engaged in these projects. We used these data to visualize and analyze project connection networks formed when volunteers join multiple projects. Volunteers joined an average of 2.93 projects spanning many different scientific disciplines (e.g., topics such as Health & Medicine, Ecology & Environment) and modes of participation (e.g., online, offline); 73% of volunteers joined 2 or more projects. Volunteer engagement in citizen science produced a complex network of project connections with low network centrality, low levels of homophily and clustering, and ample evidence of boundary spanning (e.g., based on topic or mode). The projects most central in the network, which were also the most popular, were those featured as affiliates on the website or in promotional email campaigns. By using a network approach to analyze digital trace data, our research illustrates the extent of multi-project, multi-disciplinary engagement on a third-party platform, laying the groundwork for researchers and platform managers to explore and facilitate multi-project engagement and its implications for the larger field of citizen science.
https://doi.org/10.5061/dryad.dfn2z34z3
Description of the data and file structure
This dataset consists of digital trace data extracted from SciStarter.org (which existed as SciStarter.com at the time data collection was initiated). On Dec 6th, 2018, our research team received anonymized digital trace data of SciStarter members’ activity on the website between Sept 19th, 2017, and Dec 3rd, 2018. Sept 19th, 2017 marked the launch of SciStarter 2.0, which introduced the added functionality of member accounts, dashboards, and profile pages. Use of these secondary data was approved by the NC State University Institutional Review Board (IRB Protocol # 20934) prior to analysis.
At the time of data collection, volunteers could click buttons to either “join” or “bookmark” projects of interest on the SciStarter website. Additionally, volunteers could check a box to indicate that they had previously joined a project. Clicking “join” sent the volunteer to the project’s website (unless that project was exclusively hosted on SciStarter), and it automatically added the project to a list of joined projects on the volunteer’s profile. Thus, the join function in SciStarter could be considered a “conversion”, or an expression of interest in joining a project. Clicking “bookmark” added the project to a list of bookmarked projects on the volunteer’s profile. Volunteer activities like “joins” and “bookmarks” were recorded in the digital trace data, along with an anonymized participant ID number.
Analysis of the contribution data showed that “joining” a project is a better predictor of subsequent contributions than “bookmarking” a project; we were able to confirm that at least 30% of volunteers who clicked “join” on an affiliate project later contributed to that project, while only 10% of volunteers who clicked “bookmark” ended up contributing. These estimates may be low, however, as volunteers would have to be logged in through SciStarter for their keystrokes to be accurately recorded. Thus, given the SciStarter metrics available through the API at the time of data collection, “join” was the best proxy for project participation.
Files and variables
File: Futch.2020_NSNA-ProjectsAttributes.xlsx
Description: Attributes associated with all citizen science projects in the dataset
Variables
- Project Name = Common name of the project
- Project ID = unique project ID number from SciStarter
- R_online = online or offline project (1 = online, 0 = offline)
- project_topic = topic of project (1 = Earth & Life sciences, 2 = Behavioral & Social Sciences, 3 = Engineering & Physical Sciences, 4 = Health & Medicine)
- event = event or project (1 = event, 0 = actual project)
- affiliate = SciStarter affiliate status (0 = not affiliate, 1 = part-time affiliate, 2 = affiliate)
- campaign = whether or not project was featured in marketing campaign (0 = never featured, 1 = featured before data collection, 2 = featured during data collection, 3 = featured both during and before data collection)
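The coded variables above can be turned into readable labels with a small lookup table. The following is a minimal sketch, not code distributed with the dataset: the labels are copied from the variable list, while the `decode_row` helper and the example row are hypothetical, and it assumes the column headers in the file match the variable names shown here.

```python
# Code labels copied from the variable list above.
CODEBOOK = {
    "R_online": {0: "offline", 1: "online"},
    "project_topic": {
        1: "Earth & Life sciences",
        2: "Behavioral & Social Sciences",
        3: "Engineering & Physical Sciences",
        4: "Health & Medicine",
    },
    "event": {0: "actual project", 1: "event"},
    "affiliate": {0: "not affiliate", 1: "part-time affiliate", 2: "affiliate"},
    "campaign": {
        0: "never featured",
        1: "featured before data collection",
        2: "featured during data collection",
        3: "featured both during and before data collection",
    },
}


def decode_row(row):
    """Return a copy of `row` with coded columns replaced by their labels."""
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        value = decoded.get(column)
        if value is None or value == "":
            continue  # leave missing values untouched
        code = int(value)
        decoded[column] = labels.get(code, f"unknown code {code}")
    return decoded


# Hypothetical row, invented for illustration (not taken from the dataset).
example = {
    "Project Name": "Example Project",
    "Project ID": "123",
    "R_online": "1",
    "project_topic": "4",
    "event": "0",
    "affiliate": "2",
    "campaign": "3",
}
decoded_example = decode_row(example)
```

Rows read with `csv.DictReader` arrive as strings, which is why the helper converts each value with `int` before looking it up.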
File: Futch.2020_SNA-Edgelist.csv
Description: Participants associated with each unique citizen science project
Variables
- participant = participant ID number
- project_name = Common name of project
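Because each row of the edgelist pairs one participant with one project, a project-to-project network can be derived by linking any two projects that share a volunteer. The sketch below illustrates one way to do this with the Python standard library; it is an assumption about how the file might be used, not necessarily the procedure the authors followed. The toy edgelist is invented for illustration, and the header names `participant` and `project_name` are assumed to match the variable list above.

```python
import csv
import io
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy edgelist, invented for illustration (not real data).
TOY_EDGELIST = """participant,project_name
1,Project A
1,Project B
2,Project A
2,Project B
2,Project C
3,Project C
"""


def load_memberships(handle):
    """Map each participant ID to the set of projects they joined."""
    memberships = defaultdict(set)
    for row in csv.DictReader(handle):
        memberships[row["participant"]].add(row["project_name"])
    return memberships


def project_network(memberships):
    """Count shared volunteers for every pair of projects."""
    weights = Counter()
    for projects in memberships.values():
        # Sorting gives each undirected pair a single canonical key.
        for pair in combinations(sorted(projects), 2):
            weights[pair] += 1
    return weights


memberships = load_memberships(io.StringIO(TOY_EDGELIST))
weights = project_network(memberships)

# Descriptive statistics analogous to those reported in the abstract.
mean_joined = sum(len(p) for p in memberships.values()) / len(memberships)
share_multi = sum(len(p) >= 2 for p in memberships.values()) / len(memberships)
```

To run this on the deposited file, the `io.StringIO` handle could be replaced with `open("SciStarter_SNAProjectEdgelist_2020.csv", newline="")`, provided the headers match. The resulting pair counts can then be passed to a network library as weighted edges.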
Code/software
All data can be viewed in MS Excel.
Access information
Data was derived from the following sources:
- SciStarter.org
All data in these files were provided as anonymized digital trace data from SciStarter.org. See the description of the project coding protocol in Futch et al. (in review) for more details about the project codes.
All data in these two files are cleaned and ready to use. See the CodeBook tab on the Project Attribute file for more details about variables on that spreadsheet.