Scientific and public spider observation records from Sweden
Data files
Sep 06, 2025 version files (48.18 KB total)
- comparison_SMTP_ARTP_fig_2-4.csv (20.12 KB)
- completeness_recognition_SMTP_INAT_fig_5.csv (1.98 KB)
- correlation_analysis_dens_obs.csv (5.13 KB)
- correlation_dens_obs_fig_6.csv (466 B)
- R_scripts.txt (11.71 KB)
- README.md (8.76 KB)
Abstract
This project compares spider biodiversity records from two public citizen science platforms, iNaturalist and the Swedish biodiversity database Artportalen, and data from a national scientific survey, the Swedish Malaise Trap Project (SMTP). The analysis highlights the strengths and biases of each source in terms of spatial coverage and species representation. Citizen science platforms provide cost-effective and large-scale data collection but tend to overrepresent common, easily observed species in populated areas. Traditional scientific surveys contribute valuable records from rural areas and are especially effective at detecting rare and cryptic species. Combining these approaches improves our understanding of spider distributions across Sweden and supports more comprehensive biodiversity monitoring.
This dataset is part of the article “The value of public databases for our knowledge of national spider biodiversity compared to a long-term scientific monitoring project,” currently under review in the journal Insect Conservation and Diversity.
Data sources and collection:
SMTP spider data were collected using Malaise traps as part of a nationwide survey conducted between 2003 and 2006.
Artportalen spider observations were retrieved via Analysportalen in May 2024, prior to its replacement by Fynddata. All available spider records from Sweden at the time of access were included.
iNaturalist spider observations were manually extracted through a direct search in May 2024, filtered to include only verifiable records from Sweden. All such observations available at the time were included.
The dataset includes all the data and R scripts necessary to reproduce the analyses and figures presented in the study; Figures 1 and 6 were created using QGIS. It contains spider observations, species lists, the number of regions (e.g., provinces or grid cells) in which each species was recorded, species dominance rankings, red list statuses, and measures of recognition and completeness of spider families across the three data sources. In addition, it includes data on spider observation numbers (from Artportalen) and human population density (from Statistics Sweden) for Swedish municipalities and counties.
The following includes information about all the files in the dataset. Please see the descriptions below for details on the contents and purpose of each file.
Abbreviations:
SMTP: Swedish Malaise Trap Project
ARTP: Artportalen, the Swedish national biodiversity database
INAT: iNaturalist
Explanation of terms:
Observations: An observation indicates that a species has been recorded in a collection event, regardless of the number of individuals of that species that were observed.
Grid cells: A standardized spatial unit used for mapping and analyzing species observations. In this dataset, each grid cell represents a 50 × 50 km area covering part of Sweden. The entire country is divided into these equal-sized cells to facilitate geographic comparisons of species distributions.
Occurrence: Denotes the presence of a species within a specific grid cell or province.
New occurrence: Refers to the observation of a species in a grid cell or province where it had not been previously recorded.
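The distinction between an occurrence and a new occurrence can be illustrated with a small sketch. It is written in Python for illustration only (the analyses in this dataset were carried out in R), and the records and grid cell IDs (`G01` etc.) are hypothetical, not taken from the data files.

```python
# Hypothetical (species, grid cell) occurrences; the cell IDs are made up.
artp_known = {
    ("Pardosa amentata", "G01"),
    ("Pardosa amentata", "G02"),
    ("Araneus diadematus", "G01"),
}
smtp_records = {
    ("Pardosa amentata", "G02"),    # cell already known: not a new occurrence
    ("Pardosa amentata", "G03"),    # new occurrence
    ("Araneus diadematus", "G04"),  # new occurrence
}


def new_occurrences(known, candidate):
    """Count, per species, the occurrences in `candidate` absent from `known`."""
    counts = {}
    for species, _cell in candidate - known:  # set difference keeps unseen pairs
        counts[species] = counts.get(species, 0) + 1
    return counts


new_by_species = new_occurrences(artp_known, smtp_records)
print(new_by_species)
```

Because occurrences are stored as sets of (species, cell) pairs, repeated observations of the same species in the same cell count only once, which matches the definition of an occurrence given above.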
README: comparison_SMTP_ARTP_fig_2-4.csv
This data was used to produce Figures 2–4, which compare SMTP and ARTP data and illustrate SMTP’s contribution to the ARTP database, in the following order: “Dominance ranks,” “Observation correlation to new occurrences,” and “Percentage contributions of spider families.” Data visualizations were carried out using the R programming language (version 4.4.0). The file “R_scripts.txt” contains the code used to generate these figures, with additional explanations provided for each step.
Column Descriptions:
Species:
Scientific name of each spider species collected through the SMTP project.
Family:
Taxonomic family to which each species belongs.
SMTP_observations:
The total number of observations per species in the SMTP project.
ARTP_observations:
The total number of observations per species in the ARTP database.
SMTP_rank:
Relative frequency rank of each species within the SMTP dataset. A higher rank indicates more frequent observations.
Artportalen_rank:
Relative frequency rank of each species within the ARTP dataset. A higher rank indicates more frequent observations.
SMTP_new_occurrences_grid_cells:
Number of new grid-cell occurrences per species in ARTP that were contributed by the SMTP project.
ARTP_occurrences_grid_cells:
Total number of grid cells in which each species was recorded in ARTP.
perc_total_records:
Percentage of a species’ grid-cell occurrences in ARTP that were contributed by the SMTP project.
SMTP_new_occurrences_provinces:
Number of Swedish provinces where a species was newly recorded in ARTP due to the SMTP project.
Red_list_Sweden:
Conservation status of each species according to the Swedish Red List.
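As a rough illustration of how perc_total_records relates to the two grid-cell columns, the sketch below computes a percentage from one hypothetical row. It is written in Python for illustration only (the analyses in this dataset were carried out in R). It assumes that the value is SMTP_new_occurrences_grid_cells divided by ARTP_occurrences_grid_cells, times 100; that formula is inferred from the column descriptions, and the function name and numbers are made up.

```python
def perc_contribution(smtp_new_cells, artp_total_cells):
    """Percentage of a species' ARTP grid-cell occurrences contributed by SMTP.

    Assumed formula: 100 * new SMTP cells / total ARTP cells.
    """
    if artp_total_cells == 0:
        return 0.0  # no ARTP occurrences at all: nothing to contribute to
    return 100.0 * smtp_new_cells / artp_total_cells


# Hypothetical row using the column names described above.
row = {"SMTP_new_occurrences_grid_cells": 3, "ARTP_occurrences_grid_cells": 12}
perc = perc_contribution(
    row["SMTP_new_occurrences_grid_cells"],
    row["ARTP_occurrences_grid_cells"],
)
print(perc)  # 25.0
```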
README: completeness_recognition_SMTP_INAT_fig_5.csv
This data was used to produce Figure 5, which presents completeness and recognition scores for 11 common spider families based on INAT observations, as well as the effect of including SMTP data on completeness. Data visualizations were carried out using the R programming language (version 4.4.0). The file “R_scripts.txt” contains the code used to generate the figure, with additional explanations provided for each step.
Column Descriptions:
Family:
Name of the Swedish spider family.
INAT_observations:
Total number of spider observations per family recorded in INAT, regardless of identification level.
INAT_to_species_level:
Number of spider observations per family recorded in INAT that were identified to species level.
Recognition_INAT:
Proportion of INAT observations that were identified to the species level per family (i.e., species-level identifications divided by total INAT observations).
INAT_number_of_species:
Number of spider species recorded in INAT per family.
Completeness_INAT:
Proportion of all known Swedish spider species that were recorded in INAT per family.
Calculated as: INAT_number_of_species / #_of_species_Sweden.
#_of_species_SMTP:
Number of spider species recorded in the SMTP per family.
Completeness_SMTP:
Proportion of all known Swedish spider species recorded in SMTP per family.
Calculated as: #_of_species_SMTP / #_of_species_Sweden.
Added_species_by_SMTP:
Number of spider species that were found in SMTP but not in INAT per family.
#_of_species_INAT_SMTP:
Total number of spider species found across both INAT and SMTP combined per family (i.e., union of species lists).
Completeness_SMTP_INAT:
Proportion of all Swedish spider species captured when combining INAT and SMTP records per family.
Calculated as: #_of_species_INAT_SMTP / #_of_species_Sweden.
#_of_species_ARTP:
Number of spider species recorded in ARTP per family.
Completeness_ARTP:
Proportion of all known Swedish spider species recorded in ARTP per family.
Calculated as: #_of_species_ARTP / #_of_species_Sweden.
#_of_species_Sweden:
Total number of spider species known to occur in Sweden per family (used as a reference for completeness metrics).
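The recognition and completeness formulas above can be checked with a short sketch. It is written in Python for illustration only (the analyses in this dataset were carried out in R). The values for the single family below are hypothetical, and the helper name `proportion` is an assumption. The combined species count is derived as INAT_number_of_species plus Added_species_by_SMTP, which follows from the definition of Added_species_by_SMTP as species found in SMTP but not in INAT.

```python
def proportion(part, whole):
    """Return part / whole, or NaN when the reference count is zero."""
    return part / whole if whole else float("nan")


# Hypothetical values for one family, keyed by the column names described above.
family = {
    "INAT_observations": 200,
    "INAT_to_species_level": 150,
    "INAT_number_of_species": 10,
    "#_of_species_SMTP": 16,
    "Added_species_by_SMTP": 8,
    "#_of_species_Sweden": 40,
}

n_sweden = family["#_of_species_Sweden"]

recognition_inat = proportion(
    family["INAT_to_species_level"], family["INAT_observations"]
)
completeness_inat = proportion(family["INAT_number_of_species"], n_sweden)
completeness_smtp = proportion(family["#_of_species_SMTP"], n_sweden)

# Union of the INAT and SMTP species lists: INAT species plus SMTP-only species.
n_inat_smtp = family["INAT_number_of_species"] + family["Added_species_by_SMTP"]
completeness_smtp_inat = proportion(n_inat_smtp, n_sweden)

print(recognition_inat, completeness_inat, completeness_smtp, completeness_smtp_inat)
```

Note that the combined completeness is not the sum of the two individual values, because species recorded by both sources are counted only once in the union.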
README: correlation_analysis_dens_obs.csv
This data was used for the correlation analysis between population density (citizens/km²) of municipalities and spider observations in Sweden. Data analyses were carried out using the R programming language (version 4.4.0). The file “R_scripts.txt” contains the code used to generate the analysis, with additional explanations provided for each step.
Spider observations were retrieved via Analysportalen in May 2024, prior to its replacement by Fynddata. All available spider records from Sweden at the time of access were included. Population density for municipalities was retrieved from Statistics Sweden.
Column Descriptions:
Municipality:
Name of the Swedish municipality in which data were collected.
Obs:
Total number of spider observations recorded within the municipality.
Density:
Human population density of the municipality, typically expressed as the number of people per square kilometer.
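For readers who want to explore the relationship between the Obs and Density columns outside R, the following is a minimal sketch. It is written in Python for illustration only and is not the authors' script. It assumes a Pearson correlation coefficient; this README does not state which coefficient or transformation was used, so R_scripts.txt remains the reference. The municipality names and values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical rows using the column names described above.
rows = [
    {"Municipality": "A", "Obs": 12, "Density": 3.5},
    {"Municipality": "B", "Obs": 40, "Density": 25.0},
    {"Municipality": "C", "Obs": 35, "Density": 80.0},
    {"Municipality": "D", "Obs": 150, "Density": 400.0},
]

r = pearson_r([row["Density"] for row in rows], [row["Obs"] for row in rows])
print(round(r, 3))
```

With the real file, the rows could be read with `csv.DictReader` from the standard library and the Obs and Density values converted to numbers before calling the function.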
README: correlation_dens_obs_fig_6.csv
This data was used to produce Figure 6, which illustrates the correlation between county-level population density (citizens/km²) and the number of spider observations in Sweden. The figure was generated using QGIS (version 3.34.14-Prizren).
Spider observations were retrieved via Analysportalen in May 2024, prior to its replacement by Fynddata. All available spider records from Sweden at the time of access were included. Population density for counties was retrieved from Statistics Sweden.
Column Descriptions:
Municipality:
Name of the Swedish municipality in which data were collected.
Obs:
Total number of spider observations recorded within the municipality.
Density:
Human population density of the municipality, typically expressed as the number of people per square kilometer.
README: R_scripts.txt
This file contains all code used to generate Figures 2–5, as well as the correlation analysis between human population density and the number of spider observations across the municipalities of Sweden, using the R programming language (version 4.4.0). Each step of the script is annotated with explanations describing the purpose and function of the code.
