Dryad

CanFlyet: Habitat zone and diet trait dataset for Diptera species of Canada and Greenland

Cite this dataset

Majoros, Samantha (2023). CanFlyet: Habitat zone and diet trait dataset for Diptera species of Canada and Greenland [Dataset]. Dryad. https://doi.org/10.5061/dryad.fqz612jwx

Abstract

True flies (Diptera) are an ecologically important group that plays a role in agriculture, public health, and ecosystem functioning. As researchers continue to investigate this order, it is beneficial to link the growing body of occurrence data to biological traits. However, large-scale ecological trait data are not readily available for fly species. While some databases and datasets include fly data, many ecologically relevant traits for taxa of interest are not included. In this dataset we provide ecological traits (habitat and diet) for fly species of Canada and Greenland that have occurrence records on the Barcode of Life Data Systems (BOLD). Trait data were compiled from literature searches conducted from April 2021 to January 2023 and assigned at the lowest taxonomic level possible. The dataset contains traits for 983 species across 380 genera, 25 subfamilies, and 61 families. This dataset allows traits to be assigned to occurrence data for Diptera species and can be used for further research into the ecology, evolution, and conservation of this order.

Methods

Fly species were selected for this dataset by first downloading Diptera records for Canada and Greenland from BOLD directly into R through BOLD’s application programming interface (API) on June 24th, 2021. The records were filtered according to the requirements outlined in Majoros et al. (2023), and the species that remained were retained for analysis and inclusion in this dataset. Additional species from Greenland were added based on occurrence records from GBIF (GBIF.org, June 24th, 2021, GBIF Occurrence Download, https://doi.org/10.15468/dl.mk52hp). The biological traits for each species were determined through literature searches conducted from April 2021 to January 2023. Searches were run in Google Scholar and in the Omni Academic search tool available through the University of Guelph, using the following search terms: trait AND “Taxonomic name”, habitat AND “Taxonomic name”, diet AND “Taxonomic name”, and “Feeding mode” AND “Taxonomic name”. Traits were assigned at the lowest taxonomic level possible; however, not all traits could be assigned at the species level. For these species, traits were assigned using data from the next lowest level available, whether genus, subfamily, or family. The full analysis for which these data were used is outlined in Majoros et al. (2023), and the code is available at https://github.com/S-Majoros/Population_Genetic_Structure. The data were converted to Darwin Core format in R using the package traitdataform version 0.6.8 (Schneider et al., 2019).

  • Majoros, S.E., Adamowicz, S.J. & Cottenie, K. (2023). Novel pipeline for large-scale comparative population genetics. bioRxiv. doi: 10.1101/2023.01.23.524574
  • Schneider, F.D., Jochum, M., Le Provost, G., Ostrowski, A., Penone, C., Fichtmüller, D., Güntsch, A., Gossner, M.M., König-Ries, B., Manning, P. & Simons, N.K. (2019). Towards an Ecological Trait-data Standard. Methods in Ecology and Evolution, 10(12), 2006-2019. doi: 10.1111/2041-210X.13288
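
The rule described above, assigning each trait at the lowest taxonomic level for which information exists, can be illustrated with a minimal Python sketch. This is not the authors' code (their analysis was done in R and is available in the repository linked above). The function name `assign_trait`, the lookup structure, the taxon names, and the trait values are all hypothetical placeholders and are not drawn from the dataset.

```python
# Ranks ordered from most to least specific, matching the order given in Methods.
RANKS = ("species", "genus", "subfamily", "family")

# Hypothetical trait lookup keyed by (rank, taxon name).
# Names and values are placeholders, not entries from CanFlyet.
TRAITS = {
    ("species", "Examplefly alpha"): {"habitat": "placeholder_zone_1"},
    ("genus", "Examplefly"): {"habitat": "placeholder_zone_2"},
    ("family", "Exampleflyidae"): {"habitat": "placeholder_zone_3"},
}


def assign_trait(taxonomy, trait, traits=TRAITS):
    """Return (value, rank) from the lowest rank holding the trait, else (None, None)."""
    for rank in RANKS:
        name = taxonomy.get(rank)
        if name is None:
            continue  # this rank is not recorded for the taxon
        record = traits.get((rank, name))
        if record is not None and trait in record:
            return record[trait], rank
    return None, None


# Species-level information exists, so it is used directly.
alpha = {"species": "Examplefly alpha", "genus": "Examplefly",
         "family": "Exampleflyidae"}
# No species-level entry, so the genus-level value is used instead.
beta = {"species": "Examplefly beta", "genus": "Examplefly",
        "family": "Exampleflyidae"}
# Only the family has an entry, so the search falls through to family.
gamma = {"species": "Otherfly gamma", "genus": "Otherfly",
         "subfamily": None, "family": "Exampleflyidae"}

for taxon in (alpha, beta, gamma):
    value, rank = assign_trait(taxon, "habitat")
    print(taxon["species"], value, rank)
```

Returning the rank alongside the value keeps a record of how specific each assignment is, which is useful when traits inherited from a genus or family need to be treated more cautiously than species-level ones.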

Usage notes

The dataset is provided in three formats. Only the xlsx file requires Microsoft Excel to open. All files contain the same dataset and information. 

Funding

Natural Sciences and Engineering Research Council

Genome Canada

Ontario Genomics

Ministry of Economic Development, Job Creation and Trade

Canada First Research Excellence Fund