Dryad

Phenotypic and genetic diversity data recorded in island and mainland populations worldwide

Cite this dataset

Csergő, Anna Mária et al. (2023). Phenotypic and genetic diversity data recorded in island and mainland populations worldwide [Dataset]. Dryad. https://doi.org/10.5061/dryad.h18931zqg

Abstract

We used this dataset to assess the strength of isolation due to geographic and macroclimatic distance across island and mainland systems, comparing published measurements of phenotypic traits and neutral genetic diversity for populations of plants and animals worldwide. The dataset includes 112 studies of 108 species (72 animals and 36 plants) in 868 island populations and 760 mainland populations, with population-level taxonomic and biogeographic information, totalling 7438 records.

README

This readme file was generated on 2023-08-30 by ANNA MÁRIA CSERGŐ

GENERAL INFORMATION

Title of Dataset: Phenotypic and genetic diversity data recorded in
island and mainland populations worldwide

SHARING/ACCESS INFORMATION

Licenses/restrictions placed on the data: CC0

Links to publications that cite or use the data:
DOI 10.22541/au.167542026.68326456/v1

Links to other publicly accessible locations of the data: n/a

Links/relationships to ancillary data sets: n/a

Was data derived from another source? Yes. Reference details of the
papers from which we extracted the data were added to the dataset.

Recommended citation for this dataset: Csergő AM et al. 2023, Data from:
Spatial phenotypic variability is higher between
island populations than between mainland populations worldwide, Dryad, Dataset, DOI:

DATA & FILE OVERVIEW

File List: 1. "FinalFileDryad_Csergo et al.2023_Ecography.csv":

Additional related data collected that was not included in the current
data package: Further data was collected on demographic performance and
on traits such as immunity, behaviour and diet, which depend more
strongly on ecosystem context. Further data is also available from
terrestrial landmasses in the middle of a lake or river, categorised as
"lake island" or "river island", and for marine coastline organisms.

Are there multiple versions of the dataset? n/a

Standards and calibration information, if appropriate: n/a

Environmental/experimental conditions: n/a

Describe any quality-assurance procedures performed on the data: Papers
that passed the first filter were redistributed randomly among the
coauthors, who performed the second filter. Upon completion of data
extraction for the assigned papers, each coauthor verified the standards
of their own dataset and referred back to the original paper where
necessary. The data spreadsheet was cleaned in R to correct typos and
other mistakes, check missing data, and bring the data to common
standards. This step began as a collective effort and was subsequently
continued by the first author.

People involved with sample collection, processing, analysis and/or
submission: Anna M Csergő, Kevin Healy, Maude E A Baudraz, David J
Kelly, Darren P O'Connell, Fionn Ó Marcaigh, Annabel L. Smith, Jesus
Villellas, Cian White, Qiang Yang, Yvonne M Buckley

DATA-SPECIFIC INFORMATION
Number of variables (columns): 44 columns corresponding to the data
extracted from the indicated papers
column1:StudyID: Unique study identifier
column2:Species: Latin name of the taxon
column3:GeographicUnit: Name of the island and the generic name of the
study region on mainland; the mainland unit may often not be given; it
could be e.g. two different continents or large geographic division
within one continent or a country (Note that the geographic unit on the
mainland can be subjective)
column4:PopulationName: The name of the population (the main study unit)
column5:PopulationID: Unique identifier of the population in a study
(starts with P1, continues with P2, P3)
column6:Sub.populationName: Name of the plot or samples (if multiple
plots or samples were recorded within a population)
column7:Sub.populationID: Unique identifier of the Sub-Population
(starts with SP1, SP2, SP3 etc.)
column8:GeogrPositionReal: Whether the geographic unit is an island or a
mainland in reality, as indicated by the authors of the study.
column9:GeogrPositionInterpreted: Whether the geographic unit is an
island or a mainland as interpreted by the data extractor; the real and
interpreted positions may differ (e.g., a big island can be interpreted
as "mainland" relative to a small island).
column10:Lat: Latitude (decimal degrees) of the population (for islands
if population coordinates are unavailable, then the island coordinates).
column11:Long: Longitude (decimal degrees) of the population (for
islands if population coordinates are unavailable, then the island
coordinates).
column12:Method.to.determine.Location: Whether the coordinates were
provided by the study or derived from maps using Google Maps
column13:Location.comments: Additional comment about the location
(precision, accuracy)
column14:ConfoundingFactor: Any clearly identified confounding factor
(e.g. disturbance on mainland but not on islands) that could influence
the results
column15:categoryResponseVariable: Type of population-level data:
Genetic, Other (Phenotypic)
column16:ResponseVariable: for the 'Genetic' CategoryResponseVariable:
Allelic Richness, Gene Diversity, Genotype Diversity, Haplotype
Diversity, Heterozygosity, Inbreeding Coefficient, Linkage
Disequilibrium, Nucleotide Diversity, Number Alleles, Number Alleles Per
Locus, Number of Haplotypes, Ploidy, Polymorphism, Uniqueness, Genome
Size, Other. For the 'Other (Phenotypic)' CategoryResponseVariable: Mean Age of
Individuals, Behaviour, Body Weight, Immunity, Metabolism Products,
Morphology, Physiology, Population Size, Population Structure, Size (of
body or organs), Vital Rate. Note: mn = population mean
column17:specificResponseVariable: Specific response variables within
each ResponseVariable
column18:ResponseVariableValue: Quantitative data of interest
column19:UnitsResponseVariable: g, mm, %, etc.
column20:SampleSize: How many individuals were measured to arrive at the
population mean
column21:MethodtoExtractData: Whether the values were provided in the
paper or upon request from the authors, or PlotDigitizer was used to
extract data from figures
column22:Lat_fin: Final latitude (decimal degrees) of the population,
after minor adjustments were made to some original coordinates upon
visual inspection on Google Maps (e.g., coordinate was off the island of
study).
column23:Long_fin: Final longitude (decimal degrees) of the population,
after minor adjustments were made to some original coordinates upon
visual inspection on Google Maps (e.g., coordinate was off the island of
study).
column24:bio1: mean annual temperature (°C) in the grid at 10 min resolution
available in CliMond V1.2 (Kriticos et al. 2012)
column25:bio4: temperature seasonality (C of V) in the grid at 10 min resolution
available in CliMond V1.2 (Kriticos et al. 2012)
column26:bio12: annual precipitation (mm) in the grid at 10 min resolution
available in CliMond V1.2 (Kriticos et al. 2012)
column27:bio15: precipitation seasonality (C of V) in the grid at 10 min
resolution available in CliMond V1.2 (Kriticos et al. 2012)
column28:bio1norm: normalised mean annual temperature (°C) in the grid at 10
min resolution available in CliMond V1.2 (Kriticos et al. 2012)
column29:bio4norm: normalised temperature seasonality (C of V) in the grid at 10
min resolution available in CliMond V1.2 (Kriticos et al. 2012)
column30:bio12norm: normalised annual precipitation (mm) in the grid at 10
min resolution available in CliMond V1.2 (Kriticos et al. 2012)
column31:bio14norm: normalised precipitation seasonality (C of V) in the grid at
10 min resolution available in CliMond V1.2 (Kriticos et al. 2012)
column32:PCA1_norm: first axis of the PCA based on normalised climate
values
column33:PCA2_norm: second axis of the PCA based on normalised climate
values
column34:PCA3_norm: third axis of the PCA based on normalised climate
values
column35:PCA4_norm: fourth axis of the PCA based on normalised climate
values
column36:changed_species: species names in the phylogenetic tree
column37:taxa: taxonomic group: "Mammalia", "Squamata", "Angiosperms", "Aves", "Amphibia", "Lepidoptera", "Hymenoptera", "Hemiptera", "Actinopterygii", "Liverwort", "Pteridophyta", "Gastropoda", "Bryophyta", "Pinophyta", "Coleoptera"
column38:kingdom: plant or animal
column39:TaxonBiogeogrOrigin: whether the populations studied are in the
native or non-native range or both
column40:IslandSystemDetail: selected saltwater category
column41:Island.origin: origin of the oceanic island (volcanic,
continental shelf, unknown, mixed, one freshwater species within marine islands)
column42:StudyDesign: selected "Natural system" category
column43:Study: author of the paper and year of publication
column44:ReferenceDetail: author(s), paper full title, journal name,
issue, page numbers.
Number of cases/rows: 7438 rows, each corresponding to a record
extracted for an individual population.
Missing data codes: NA
Specialized formats or other abbreviations used: not applicable
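
A minimal sketch of reading the file with the conventions listed above (44 columns, "NA" as the missing data code), using only the Python standard library. The helper names `load_records` and `count_by` are hypothetical, and the in-memory sample is made up for illustration and holds only three of the 44 columns; the values stored in GeogrPositionInterpreted are assumed here to be "island" and "mainland".

```python
import csv
import io
from collections import Counter

EXPECTED_COLUMNS = 44   # column count stated in this README
MISSING_CODE = "NA"     # missing data code stated in this README


def load_records(handle):
    """Read CSV text from an open handle; return (header, rows) with NA mapped to None."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    rows = []
    for raw in reader:
        rows.append({k: (None if v == MISSING_CODE else v) for k, v in raw.items()})
    return header, rows


def count_by(rows, column):
    """Count records per value of one column, skipping missing values."""
    return Counter(r[column] for r in rows if r.get(column) is not None)


# Made-up sample with only 3 of the 44 columns (illustration only).
sample = io.StringIO(
    "StudyID,GeogrPositionInterpreted,ResponseVariableValue\n"
    "S1,island,12.5\n"
    "S1,mainland,NA\n"
    "S2,island,0.43\n"
)
header, rows = load_records(sample)
counts = count_by(rows, "GeogrPositionInterpreted")

# For the real file this check is expected to be True; for the sample it is False.
print(len(header) == EXPECTED_COLUMNS, dict(counts))
```

For the real file, the handle could be opened with `open("FinalFileDryad_Csergo et al.2023_Ecography.csv", encoding="utf-8", newline="")`, assuming the UTF-8 encoding listed in the usage notes applies.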

Methods

Description of methods used for collection/generation of data: 

We searched the ISI Web of Science in March 2017 for comparative studies that included data on phenotypic traits and/or neutral genetic diversity of populations on true islands and on mainland sites in any taxonomic group. Search terms were 'island' and ('mainland' or 'continental') and 'population*' and ('demograph*' or 'fitness' or 'survival' or 'growth' or 'reproduc*' or 'density' or 'abundance' or 'size' or 'genetic diversity' or 'genetic structure' or 'population genetics') and ('plant*' or 'tree*' or 'shrub*' or 'animal*' or 'bird*' or 'amphibian*' or 'mammal*' or 'reptile*' or 'lizard*' or 'snake*' or 'fish'), subsequently refined to the Web of Science categories 'Ecology' or 'Evolutionary Biology' or 'Zoology' or 'Genetics Heredity' or 'Biodiversity Conservation' or 'Marine Freshwater Biology' or 'Plant Sciences' or 'Geography Physical' or 'Ornithology' or 'Biochemistry Molecular Biology' or 'Multidisciplinary Sciences' or 'Environmental Sciences' or 'Fisheries' or 'Oceanography' or 'Biology' or 'Forestry' or 'Reproductive Biology' or 'Behavioral Sciences'. The search covered the full text, including abstract and title, although for older papers only abstracts and titles were searchable, depending on the journal. The search returned 1237 papers, which were distributed among coauthors for further scrutiny.
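
The structure of the search string may be easier to check when it is assembled from its term groups. The sketch below rebuilds the boolean expression from the terms listed above; the helper `any_of` and the variable names are hypothetical, and the output is a plain text string, not a query in any particular database syntax.

```python
def any_of(terms):
    """Quote each term, join with 'or', and wrap the group in parentheses."""
    return "(" + " or ".join(f"'{t}'" for t in terms) + ")"


# Term groups copied from the search description above.
place = ["mainland", "continental"]
response = [
    "demograph*", "fitness", "survival", "growth", "reproduc*", "density",
    "abundance", "size", "genetic diversity", "genetic structure",
    "population genetics",
]
taxa = [
    "plant*", "tree*", "shrub*", "animal*", "bird*", "amphibian*",
    "mammal*", "reptile*", "lizard*", "snake*", "fish",
]

# The groups are combined with 'and'; single terms stay unparenthesised.
query = " and ".join(
    ["'island'", any_of(place), "'population*'", any_of(response), any_of(taxa)]
)
print(query)
```

Keeping each group in its own list makes a missing quote or parenthesis in the assembled string unlikely, since the delimiters are added in one place.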

First paper filter

To be retained, papers had to meet the following criteria:

Overall study design criteria: Include at least two separate islands and two mainland populations; Eliminate studies comparing populations on several islands where there were no clear mainland vs. island comparisons; Present primary research data (e.g., meta-analyses were discarded); Include a field study (e.g., experimental studies and ex situ populations were discarded); Can include data from sub-populations pooled within an island or within a mainland population (but not between islands or between mainland sites);

Island criteria: Island populations situated on separate islands (papers where all information on island populations originated from a single island were discarded); Can include multiple populations recorded on the same island, if there is more than one island in the study; While we accepted the authors' judgement about island vs. mainland status, in 19 papers we made our own judgement based on the relative size of the island or position relative to the mainland (e.g. Honshu Island of Japan, sized 227 960 km² was interpreted as mainland relative to islands less than 91 km²); Include islands surrounded by sea water but not islands in a lake or big river; Include islands regardless of origin (continental shelf, volcanic);

Taxonomic criteria: Include any taxonomic group; The paper must compare populations within a single species; Do not include marine species (including coastline organisms);

Databases used to check species delimitation: Handbook of Birds of the World (www.hbw.com/); International Plant Names Index (https://www.ipni.org/); Plants of the World Online (https://powo.science.kew.org/); Handbook of the Mammals of the World; Global Biodiversity Information Facility (https://www.gbif.org/);

Biogeographic criteria: Include all continents, as well as studies on multiple continents; Do not include papers regarding migratory species; Only include old / historical invasions to islands (>50 yrs); do not include recent invasions;

Response criteria: Do not include studies which report community-level responses such as species richness; Include genetic diversity measures and/or individual and population-level phenotypic trait responses;

The first paper filter resulted in 235 papers which were randomly reassigned for a second round of filtering.

Second paper filter

In the second filter, we excluded papers that did not provide population geographic coordinates and population-level quantitative data, unless the data were provided upon contacting the authors or could be obtained from figures using DataThief (Tummers 2006). We visually inspected maps plotted separately for each study and made minor adjustments to the GPS coordinates when they placed the focal population off the island or mainland. For this study, we included only responses measured at the individual level; we therefore removed papers referring to demographic performance and to traits such as immunity, behaviour and diet that are heavily reliant on ecosystem context. We extracted population-level means for two broad categories of response: i) broad phenotypic measures, which included traits (size, weight and morphology of the entire body or body parts), metabolism products, physiology, vital rates (growth, survival, reproduction) and mean age of sampled mature individuals; and ii) genetic diversity, which included heterozygosity, allelic richness, number of alleles per locus etc. The final dataset includes 112 studies and 108 species.

Methods for processing the data:

We made minor adjustments to the GPS location of some populations after visually checking on Google Maps that each data point overlaid the indicated island body or mainland. For each population we extracted four climate variables reflecting mean and variation in temperature and precipitation, available in CliMond V1.2 (Kriticos et al. 2012) at 10-minute resolution: mean annual temperature (Bio1), annual precipitation (Bio12), temperature seasonality (CV) (Bio4) and precipitation seasonality (CV) (Bio15), using the "prcomp" function in the stats package in R. For populations where climate variables were not available on the global climate maps, mostly because small islands are not captured in CliMond, we extracted data from the geographically closest grid cell with available climate values, which in all cases lay within 3.5 km of the focal grid cell. We normalised the four climate variables using the "normalizer" package in R (Vilela 2020) and performed a Principal Component Analysis (PCA) using the psych package in R (Revelle 2018). We saved the loadings of the axes for further analyses.
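
The fallback to the geographically closest grid cell with available climate values can be sketched as follows. This is an illustration under stated assumptions and not the code used to build the dataset: distances are great-circle (haversine) distances from the population coordinates to grid cell centres, the function names `haversine_km` and `nearest_cell_with_value` are hypothetical, and the grid cells are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_cell_with_value(lat, lon, cells):
    """Return (distance_km, cell) for the closest cell whose value is not None.

    `cells` is a list of (cell_lat, cell_lon, value) tuples for cell centres.
    """
    candidates = [
        (haversine_km(lat, lon, c_lat, c_lon), (c_lat, c_lon, value))
        for c_lat, c_lon, value in cells
        if value is not None
    ]
    if not candidates:
        raise ValueError("no grid cell has a climate value")
    return min(candidates, key=lambda item: item[0])


# Made-up grid cells: the focal cell has no value, two neighbours do.
cells = [
    (10.00, 20.00, None),   # focal cell, climate value missing
    (10.00, 20.02, 24.1),   # nearby cell with a value
    (10.10, 20.00, 23.7),   # more distant cell with a value
]
distance_km, cell = nearest_cell_with_value(10.00, 20.00, cells)
print(round(distance_km, 2), cell)
```

The returned distance can then be compared against a threshold such as the 3.5 km reported above to confirm that the substituted cell is close to the focal location.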

References:

  • Kriticos, D.J., Webber, B.L., Leriche, A., Ota, N., Macadam, I., Bathols, J., et al. (2012). CliMond: global high-resolution historical and future scenario climate surfaces for bioclimatic modelling. Methods Ecol. Evol., 3, 53-64.
  • Revelle, W. (2018). psych: Procedures for Personality and Psychological Research. Northwestern University, Evanston, Illinois, USA. R package version 1.8.12. https://CRAN.R-project.org/package=psych
  • Tummers, B. (2006). DataThief III. https://datathief.org/
  • Vilela, B. (2020). normalizer: Making data normal again. R package version 0.1.0.

Usage notes

Data can be opened in multiple versions of Excel and R. Data encoding: UTF-8, UTF-16BE, UTF-16LE 

Funding

European Commission, Award: GEODEM-658651

Irish Research Council, Award: GOIPG/2014/13046, Postgraduate Scholarships

Irish Research Council, Award: GOIPG/2017/1618, Postgraduate Scholarships

Irish Research Council, Award: 2017/2018 IRCLA/2017/60, Laureate Awards