# Title of Dataset: Complex urban environments provide Apis mellifera with a richer plant forage than suburban and more rural landscapes.

---

## Description of the Data and file structure

### Fox_et_al_analysis.R

This is an R script covering the analysis detailed in the paper. It reads in the other data files included in the repository and includes commands to install all dependencies (written and tested on Ubuntu Linux), perform all analyses, and generate plots.

### ITS2p_dada2_genus_assignments.csv

The ITS2_plant data were processed using the DADA2 workflow and assigned taxonomy against the “Pollen/Plant ITS2 reference set for the RDP/UTAX classifier (2015)” database (Sickel et al., 2015). The counts in this "genus" file are for those ASVs that were resolved to genus level only (not species). Counts for ASVs resolved beyond genus level are not included in this file.

### ITS2p_dada2_species_assignments.csv

The ITS2_plant data were processed using the DADA2 workflow and assigned taxonomy against the “Pollen/Plant ITS2 reference set for the RDP/UTAX classifier (2015)” database (Sickel et al., 2015). The counts in this "species" file are for those ASVs that were resolved to species level.

### rbcL_dada2_genus_assignments.csv

The rbcL data were processed using the DADA2 workflow and assigned taxonomy against the “rbcL reference library” (https://doi.org/10.6084/m9.figshare.c.3466311.v1) database (Bell & Brosi et al., 2016). The counts in this "genus" file are for those ASVs that were resolved to genus level only (not species). Counts for ASVs resolved beyond genus level are not included in this file.

### rbcL_dada2_species_assignments.csv

The rbcL data were processed using the DADA2 workflow and assigned taxonomy against the “rbcL reference library” (https://doi.org/10.6084/m9.figshare.c.3466311.v1) database (Bell & Brosi et al., 2016).
The counts in this "species" file are for those ASVs that were resolved to species level.

### plant_metadata.csv

A file of metadata for each of the plant species detected in the honey. It includes the Status, Habit, and Habitat of each species, information on its flowering period, the possibility of its presence in the UK, and the sources for each of these data.

## Description of abbreviations used in the data files

In the genus and species assignment files, each apiary studied has a two-letter code in the column header. The two-letter codes relate to the names of the sites, which in turn relate to the single-letter codes used in the manuscript. The two-letter codes are as follows:

- PW = Printworks = B
- MC = Manchester Cathedral = C
- CM = Chorlton Meadow = H
- WC = Warrington = F
- AG = Art Gallery = A
- MM = Manchester Museum = D
- CC = Chorlton = E
- PH = Oldham = M
- ML = Northenden = G
- HH = Pendlebury = I
- JH = Cowpe = L
- MB = Bury = J
- SK = Stockport = N
- SP = Sale = K

The site names are the same as those used in the Land.csv data file. Land.csv is the output from the LECOS 'Landscape Vector Overlay' function in QGIS. It contains data on the 19 types of land cover found in the buffers surrounding the apiaries. Each land cover type is coded with a two-letter abbreviation, as follows:

- BW - Broadleaved woodland
- CW - Coniferous woodland
- AR - Arable & horticulture
- IG - Improved grassland
- NG - Neutral grassland
- CG - Calcareous grassland
- AG - Acid grassland
- FS - Fen, marsh, and swamp
- HR - Heather
- HG - Heather grassland
- BO - Bog
- IR - Inland rock
- SW - Saltwater
- FW - Freshwater
- SS - Supra-littoral sediment
- LS - Littoral sediment
- SM - Saltmarsh
- UB - Urban
- SB - Suburban

## Describe relationship between data files, missing data codes, other abbreviations used

The R script reads all the other data files and covers the main analysis. The genus/species assignment files detail the plant species detected in the honey of each of 14 hives.
The plant_metadata file contains metadata for each of the plant species detected in the honey. The Land.csv file contains the proportions of each of the 19 types of land cover surrounding the 14 apiaries. The raw sequence data are not published here but are available in the NCBI Sequence Read Archive under accessions SRR16143791 to SRR16143818. These raw sequence files are not required to recreate the analysis contained in the R script.

## Sharing/access Information

Links to other publicly accessible locations of the data:

- NCBI SRA link to the raw sequence data: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA767686
- Alternate link to download data files and R script: https://github.com/graemefox/Mancester_honey

Was data derived from another source? If yes, list source(s):

- The land use data were derived from the "Land Cover Map 2015 (25m raster, GB)" dataset (Rowland et al. 2017): https://catalogue.ceh.ac.uk/documents/bb15e200-9349-403c-bda9-b430093807c7
- The plant metadata were compiled from many different sources. Details of each are specified within the metadata file itself.
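
As an illustration of the two-letter apiary codes described under "Description of abbreviations used in the data files", the sketch below expresses them as a lookup table and relabels a header row with the site name and manuscript letter. It is written in Python and is not part of the R analysis script. The function name `relabel_header` and the example header row are hypothetical, and the sketch assumes each apiary column header is exactly the two-letter code, which may not match the real layout of the CSV files.

```python
# Two-letter apiary codes from this README: code -> (site name, manuscript letter).
SITE_CODES = {
    "PW": ("Printworks", "B"),
    "MC": ("Manchester Cathedral", "C"),
    "CM": ("Chorlton Meadow", "H"),
    "WC": ("Warrington", "F"),
    "AG": ("Art Gallery", "A"),
    "MM": ("Manchester Museum", "D"),
    "CC": ("Chorlton", "E"),
    "PH": ("Oldham", "M"),
    "ML": ("Northenden", "G"),
    "HH": ("Pendlebury", "I"),
    "JH": ("Cowpe", "L"),
    "MB": ("Bury", "J"),
    "SK": ("Stockport", "N"),
    "SP": ("Sale", "K"),
}


def relabel_header(header):
    """Replace known two-letter codes with 'Site name (letter)'.

    Columns that are not apiary codes are returned unchanged.
    """
    labelled = []
    for column in header:
        if column in SITE_CODES:
            name, letter = SITE_CODES[column]
            labelled.append(f"{name} ({letter})")
        else:
            labelled.append(column)
    return labelled


# Hypothetical header row (an assumption for illustration only);
# the real column layout of the assignment files may differ.
example_header = ["taxon", "PW", "MC", "AG"]
print(relabel_header(example_header))
```

Note that the code AG is also used for "Acid grassland" among the land cover abbreviations, so this lookup should only be applied to the apiary columns of the genus and species assignment files, not to the land cover columns of Land.csv.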