Species' geographic ranges vary enormously, and even closest relatives may differ in range size by several orders of magnitude. With data from hundreds of species spanning 20 genera in 15 families, we show that plant species that autonomously reproduce via self-pollination consistently have larger geographic ranges than their close relatives that generally require two parents for reproduction. Further analyses strongly implicate autonomous self-fertilisation in causing this relationship, as it is not driven by traits such as polyploidy or annual life history whose evolution is sometimes correlated with selfing. Furthermore, we find that selfers occur at higher maximum latitudes and that disparity in range size between selfers and outcrossers increases with time since their evolutionary divergence. Together, these results show that autonomous reproduction—a critical biological trait that eliminates mate limitation and thus potentially increases the probability of establishment—increases range size.
Geographic data organized by genus
For each genus in the study there are two types of files: 1) "orignal.gbif.genus" contains all known occurrence records for the species in our study, as downloaded from the Global Biodiversity Information Facility (http://www.gbif.org); and 2) "ft.clean.gbif.genus" contains the filtered and cleaned records used in downstream analyses (note that the scripts that generate these cleaned files are in the folder "get.clean.geo").
data geographic.zip
Mating system data organized by genus
For each genus in the study there is a unique file "genus.csv" that contains the following data columns: species names (species), mating system category (mate.sys), method used to assign mating system (method), source (ref), notes (notes), and availability of loci on GenBank (rps, its, trn).
data mating system.zip
Phylogenetic data organized by genus
For each genus, there are several files: original GenBank download of ITS sequences (original.genbank.genus.fasta), aligned ITS sequences (genus.its.aligned.fa), BEAST xml file (genus.its.aligned.xml), BEAST output files (genus.its.aligned.log, genus.its.aligned.ops, genus.its.aligned.trees), subset of 9000 trees from the posterior (genus.its.final.tree), and consensus tree used only for data visualization (genus.its.consensus.tree).
data phylogenetic.zip
Correlated trait data table
This data table contains the following columns: species name (species), number of gametic chromosomes (X1N), qualitative ploidy assignment (ploidy.qual), notes on ploidy (ploidy.notes), source for ploidy (ploidy.source), qualitative life history assignment (life.history.qual), notes on life history (ann.per.notes), life history source (ann.per.source)
data.correlated.traits.csv
R scripts for filtering geographic data
For each genus there is a unique R script used to filter and clean the original geographic data downloaded from the Global Biodiversity Information Facility (http://www.gbif.org). These scripts filter GBIF data for quality by excluding records with coordinate accuracy < 100 km, coordinates failing to match the locality description, and taxonomic misidentifications (verified by the authors and taxonomic specialists of each clade). Species’ epithets were checked against the most recently published taxonomies, and synonyms and spelling errors are corrected here. Finally, coordinates outside the native species range, identified using published monographs and online databases that report native and invaded ranges (e.g. GRIN database, http://www.ars-grin.gov/), are excluded.
get.clean.geo.zip
R script for grabbing sister pairs across a sample of trees
This R script grabs all sister pairs from across a sample of trees, computes the average branch length (dist), computes the posterior probability (pp = proportion of sampled trees with a given sister pair), and grabs the mating system of the sister pair.
get.sisters.R
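The sister-pair summary described above can be illustrated with a minimal Python sketch. It is not the authors' R script: it assumes plain Newick strings for strictly bifurcating trees with branch lengths and no BEAST annotations, it takes "dist" for one tree to be the mean of the two tip branch lengths, and the names `sister_pairs` and `summarise` are hypothetical. The mating system lookup is omitted.

```python
import re
from collections import defaultdict

# Matches a "cherry": two tips joined directly, e.g. (sp_a:0.12,sp_b:0.12).
# Assumption: plain Newick with branch lengths and no annotations.
CHERRY = re.compile(r"\(([^(),:;]+):([0-9.eE+-]+),([^(),:;]+):([0-9.eE+-]+)\)")

def sister_pairs(newick):
    """Return {frozenset({tip1, tip2}): mean tip branch length} for one tree."""
    pairs = {}
    for a, len_a, b, len_b in CHERRY.findall(newick):
        pairs[frozenset((a, b))] = (float(len_a) + float(len_b)) / 2
    return pairs

def summarise(trees):
    """Aggregate sister pairs over a sample of trees.

    dist = branch length averaged over the trees containing the pair
    pp   = proportion of sampled trees in which the pair is sister
    """
    lengths = defaultdict(list)
    for newick in trees:
        for pair, dist in sister_pairs(newick).items():
            lengths[pair].append(dist)
    n = len(trees)
    return {
        pair: {"dist": sum(d) / len(d), "pp": len(d) / n}
        for pair, d in lengths.items()
    }

# Toy sample of four trees (illustrative only, not data from the study).
trees = [
    "((a:1.0,b:1.0):2.0,(c:1.5,d:1.5):1.5);",
    "((a:2.0,b:2.0):1.0,(c:0.5,d:0.5):2.5);",
    "(((a:1.0,c:1.0):1.0,b:2.0):1.0,d:3.0);",
    "((a:1.0,b:1.0):2.0,(c:1.5,d:1.5):1.5);",
]
summary = summarise(trees)
for pair, stats in summary.items():
    print(sorted(pair), stats)
```

In this toy sample the pair a/b is sister in three of the four trees, so its pp is 0.75.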
R script for calculating sister pair range size and co-occurrence
This R script takes a list of sister pairs (generated using script "get.sisters"), computes their geographic range sizes across varying spatial scales, and computes various metrics of co-occurrence used in American Journal of Botany paper "No association between plant mating system and geographic range overlap".
get.sisters.rangesize.cooccur.R
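As a rough illustration of computing range size and co-occurrence at several spatial resolutions, the Python sketch below counts occupied latitude/longitude grid cells. It rests on assumptions that may differ from the R script: range size is approximated by the number of occupied cells rather than their area, co-occurrence is taken to be the number of shared cells divided by the cell count of the smaller-ranged species, and the function names are hypothetical.

```python
import math

def occupied_cells(points, res):
    """Set of grid cells (side `res` degrees) containing at least one record."""
    return {(math.floor(lat / res), math.floor(lon / res)) for lat, lon in points}

def range_and_overlap(points_a, points_b, res):
    """Return (cells occupied by A, cells occupied by B, overlap index).

    Assumed overlap index: shared cells / cells of the smaller-ranged species.
    """
    cells_a = occupied_cells(points_a, res)
    cells_b = occupied_cells(points_b, res)
    smaller = min(len(cells_a), len(cells_b))
    if smaller == 0:
        raise ValueError("each species needs at least one occurrence record")
    shared = len(cells_a & cells_b)
    return len(cells_a), len(cells_b), shared / smaller

# Toy (lat, lon) records; illustrative only, not data from the study.
sp_a = [(34.27, -118.33), (34.72, -118.81), (35.43, -117.62)]
sp_b = [(34.31, -118.38), (36.11, -116.47)]

for res in (1, 0.5, 0.1, 0.05):
    print(res, range_and_overlap(sp_a, sp_b, res))
```

Finer grids generally give lower overlap, because nearby records of the two species stop falling in the same cell.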
R script for range size stats and figures
This R script performs statistical analyses and generates figures used in Ecology Letters paper "Geographic range size is predicted by plant mating system".
get.stats.rangesize.R
R script for co-occurrence stats and figures
This R script performs statistical analyses and generates figures used in American Journal of Botany paper "No association between plant mating system and geographic range overlap".
get.stats.cooccurrence.R
Data summary table of sister pairs
This data table contains summary information of sister pairs used in final analyses and for generating figures (see R scripts "get.stats.cooccurrence" and "get.stats.rangesize"), and includes the following columns:
genus (genus);
sister species A (sp.A);
sister species B (sp.B);
sister species A and B separated by underscore (ba);
branch length (dist);
posterior probability (pp);
weighting factor used for data visualization (fig.weight);
mating system classification of species A and B (sp.A.mate, sp.B.mate, mate.diff);
range size of species A and B across 4 spatial resolutions (A.range.res1, B.range.res1, A.range.res0.5, B.range.res0.5, A.range.res0.1, B.range.res0.1, A.range.res0.05, B.range.res0.05);
co-occurrence index across 4 spatial resolutions (overlap.min.res1, overlap.min.res0.5, overlap.min.res0.1, overlap.min.res0.05);
latitudinal characteristics of species A and B (lat.mean.spA, lat.med.spA, lat.min.spA, lat.max.spA, lat.mean.spB, lat.med.spB, lat.min.spB, lat.max.spB);
ploidy category for species A and B (ploidy.spA, ploidy.spB);
life history category for species A and B (life.history.spA, life.history.spB)
sister.summary.table.csv
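One possible way to work with the columns listed above is sketched below in Python: for sister pairs that differ in mating system, it computes the log10 ratio of selfer to outcrosser range size at the 1-degree resolution. The column names come from the list above, but the mating-system codes "S" and "O", the example rows, and the function name `log_range_ratios` are assumptions made for illustration; check the actual codes in the file before adapting this.

```python
import csv
import io
import math

# Illustrative rows only; these are NOT values from the study, and the
# mating-system codes "S" (selfer) and "O" (outcrosser) are assumed.
EXAMPLE = """genus,sp.A,sp.B,pp,sp.A.mate,sp.B.mate,A.range.res1,B.range.res1
GenusX,alpha,beta,0.90,S,O,40,10
GenusX,gamma,delta,0.40,O,O,5,8
GenusY,eps,zeta,0.75,O,S,3,30
"""

def log_range_ratios(handle, selfer="S", outcrosser="O"):
    """Map (sp.A, sp.B) -> log10(selfer range / outcrosser range).

    Only pairs with one selfer and one outcrosser are kept.
    """
    ratios = {}
    for row in csv.DictReader(handle):
        a_mate, b_mate = row["sp.A.mate"], row["sp.B.mate"]
        if {a_mate, b_mate} != {selfer, outcrosser}:
            continue  # both species share a mating system, or use other codes
        a_range = float(row["A.range.res1"])
        b_range = float(row["B.range.res1"])
        s_range, o_range = (a_range, b_range) if a_mate == selfer else (b_range, a_range)
        ratios[(row["sp.A"], row["sp.B"])] = math.log10(s_range / o_range)
    return ratios

ratios = log_range_ratios(io.StringIO(EXAMPLE))
for pair, value in ratios.items():
    print(pair, round(value, 3))
```

To run this on the real table, pass an open file handle for sister.summary.table.csv instead of the in-memory example. A positive value means the selfer has the larger range.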