Species are the starting point for most studies of ecology and evolution, but the proper circumscription of species can be extremely difficult in morphologically variable lineages, and there are still few convincing examples of molecularly informed species delimitation in plants. We focus here on the Viburnum nudum complex, a highly variable clade that is widely distributed in eastern North America. Taxonomic treatments have mostly divided this complex into northern (V. nudum var. cassinoides) and southern (V. nudum var. nudum) entities, but additional names have been proposed. We used multiple lines of evidence, including RADseq, morphological, and geographic data, to test how many independently evolving lineages exist within the V. nudum complex. Genetic clustering and phylogenetic methods revealed three distinct groups: one lineage that is highly divergent, and two others that are recently diverged and morphologically similar. A combination of evidence that includes reciprocal monophyly, lack of introgression, and discrete rather than continuous patterns of variation supports the recognition of all three lineages as separate species. These results identify a surprising case of cryptic diversity in which two broadly sympatric species have consistently been lumped in taxonomic treatments. The clarity of our findings is directly related to the dense sampling and high-quality genetic data in this study. We argue that there is a critical need for carefully sampled and integrative species delimitation studies to clarify species boundaries even in well-known plant lineages. Studies following the model that we have developed here are likely to identify many more cryptic lineages and will fundamentally improve our understanding of plant speciation and patterns of species richness.
Sequence_data_min10
All ipyrad outfiles for the entire V. nudum complex + outgroups with all loci shared across at least 10 individuals
nudum-c88-d6-min10_outfiles.zip
Sequence_data_min20
All ipyrad outfiles for the entire V. nudum complex + outgroups with all loci shared across at least 20 individuals
nudum-c88-d6-min20_outfiles.zip
Sequence_data_files_min40
All ipyrad outfiles for the entire V. nudum complex + outgroups with all loci shared across at least 40 individuals
nudum-c88-d6-min40_outfiles.zip
morphological_trait_data
Table with all measured morphological traits.
Figure S1. Heatmap of Shared Data Across All Pairs of Individuals
Figure S1. Heatmap of shared data across all pairs of individuals shows no systematic biases in missing data. Darker blues indicate more shared loci, while lighter blues indicate fewer. Individuals are arranged phylogenetically, and the total number of loci per individual is displayed along the top of the chart as a bar graph.
S1_heatmap_min40.pdf
Figure S2. Results of Hierarchical STRUCTURE Analysis
Figure S2. Hierarchical STRUCTURE analysis identified clusters within each of the three major lineages. Colors indicate posterior probability of assignment of each individual to a particular cluster based on the combination of 10 replicate runs. The number of clusters (K) for each analysis is displayed at the top of each bar graph.
S2_structure.pdf
Figure S3. Phylogenetic Results
Figure S3. Similar phylogenetic trees were inferred across datasets and analysis methods. a) IQ-TREE min10 dataset, b) IQ-TREE min20 dataset, c) IQ-TREE min50 dataset, d) RAxML min10 dataset, e) RAxML min20 dataset, f) RAxML min40 dataset, g) RAxML min50 dataset, h) tetrad min10 dataset, i) tetrad min20 dataset, j) tetrad min40 dataset, k) tetrad min50 dataset.
S3_phylogenetic_results.pdf
Figure S4. Phylogenetic Network Inferred with SplitsTree
Figure S4. Phylogenetic network inferred with SplitsTree4 (Huson and Bryant 2006) showing relationships for all individuals in the V. nudum complex based on the min40 dataset.
S4_splitstree.pdf
Figure S5. D-Statistic Tests of Introgression
Figure S5. D-statistic tests find no evidence of introgression between the red and blue clades. A-F) Each phylogeny displays the taxa for one test, with grey arrows indicating potential introgression. The graph to the right of each tree shows a histogram of Z-scores for 100 replicate tests with different individuals sampled for each non-focal taxon. Higher Z-scores (lower p-values) are located on the right of each graph. The grey dotted line is at p = 0.01; the black dotted line is at p = 0.01 with a Bonferroni correction for 100 tests.
S5_D_statistic.pdf
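The two dotted lines in Figure S5 correspond to Z-score cutoffs, and the arithmetic behind them can be sketched as follows. This is an illustration only, not the authors' code: the function name `z_threshold` is hypothetical, and because the caption does not say whether the tests were one- or two-sided, the two-sided default here is an assumption.

```python
from statistics import NormalDist


def z_threshold(alpha, n_tests=1, two_sided=True):
    """Critical Z-score for significance level `alpha`.

    A Bonferroni correction divides alpha by the number of tests.
    Whether the original analysis was one- or two-sided is not stated
    in the caption, so `two_sided=True` is an assumption.
    """
    corrected_alpha = alpha / n_tests
    tail = corrected_alpha / 2 if two_sided else corrected_alpha
    return NormalDist().inv_cdf(1 - tail)


# Cutoff at p = 0.01 (grey line) and at p = 0.01 corrected for 100 tests (black line).
uncorrected = z_threshold(0.01)
corrected = z_threshold(0.01, n_tests=100)
print(f"uncorrected Z cutoff: {uncorrected:.2f}")
print(f"Bonferroni-corrected Z cutoff: {corrected:.2f}")
```

With 100 replicate tests, the corrected per-test level is 0.01 / 100 = 0.0001, so the corrected cutoff is always further to the right than the uncorrected one.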
Figure S6. Pairwise Geographic and Genetic Distances Between Individuals Within Each Species
Figure S6. Pairwise geographic and genetic distances between individuals within each species: a) V. cassinoides, b) V. nitidum, c) V. nudum. A line representing the correlation between genetic and geographic distances is plotted for V. cassinoides.
S6_IBD_species_separate.pdf
Figure S7. County Range Maps
Figure S7. United States county maps colored to reflect the occurrence patterns of each of the three species: a) V. cassinoides, b) V. nitidum, c) V. nudum. Species occurrences were determined by visual inspection of herbarium specimens and georeferenced to county. d) All three species are shown together, with V. cassinoides in blue, V. nitidum in red, and V. nudum in green. Counties with more than one species are black.
S7_county_range_maps.pdf
Figure S8. Species Distribution Models with Six Predictor Variables
Figure S8. Species distribution models built in Maxent with six predictor variables downloaded from Worldclim (http://www.worldclim.org) at a 2.5 arc-minute resolution (mean annual temperature, annual precipitation, precipitation seasonality, mean temperature of the wettest quarter, mean diurnal temperature, and precipitation of the wettest quarter) were used to infer the current (a, c, e) and Last Glacial Maximum (LGM; b, d, f) distributions of each species. Occurrence data for each species are plotted as points on the current distribution map. Relative probability of occurrence at each grid cell is indicated by shading.
S8_SDM_6var.pdf
Table S1. RAD Coverage and Sampling
Table S1. Number of raw reads, filtered reads, and loci in each of the four datasets for each individual.
TableS1.pdf
Table S2. List of Specimens Used for Range Maps
Table S2. List of specimens that were examined and included in the range maps.
tableS2.pdf
Sequence_data_files_min50
All ipyrad outfiles for the entire V. nudum complex + outgroups with all loci shared across at least 50 individuals
nudum-c88-d6-min50_outfiles.zip
Script to calculate genetic distances
Python script that can be used to calculate pairwise genetic distances between individuals. The input is an ipyrad .loci file and the output is a table that lists the number of loci shared between two individuals, the number of sites that match between them, and the total number of sites they share.
Calculate genetic distances.ipynb
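For readers who want to see what such a calculation involves, here is a minimal sketch of the kind of tally described above. It is not the authors' notebook. It assumes a .loci layout in which each locus is a block of `name sequence` lines closed by a line starting with `//`, treats `N` and `-` as missing data, and compares IUPAC ambiguity codes literally; the function names `parse_loci`, `pairwise_counts`, and `distance` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

MISSING = {"N", "-"}  # assumption: these characters mark missing data


def parse_loci(lines):
    """Yield one dict per locus mapping sample name -> sequence.

    Assumes each locus block ends with a line that starts with '//'.
    """
    locus = {}
    for line in lines:
        if line.startswith("//"):
            if locus:
                yield locus
            locus = {}
        elif line.strip():
            name, seq = line.split()[:2]
            locus[name.lstrip(">")] = seq.upper()
    if locus:  # tolerate a final block without a separator line
        yield locus


def pairwise_counts(loci):
    """Return {(a, b): [shared loci, matching sites, shared sites]}."""
    counts = defaultdict(lambda: [0, 0, 0])
    for locus in loci:
        for a, b in combinations(sorted(locus), 2):
            entry = counts[(a, b)]
            entry[0] += 1
            for x, y in zip(locus[a], locus[b]):
                if x in MISSING or y in MISSING:
                    continue  # site is not shared by both individuals
                entry[2] += 1
                if x == y:
                    entry[1] += 1
    return counts


def distance(entry):
    """Proportion of shared sites that differ between two individuals."""
    _, matches, sites = entry
    return 1 - matches / sites if sites else float("nan")


# Tiny made-up example with two loci (not real data).
example = """\
ind1     ACGTNACGT
ind2     ACGTAACGA
ind3     ACG-AACGT
//               *|0|
ind1     GGGTTT
ind2     GGGTTA
//            *|1|
""".splitlines()

counts = pairwise_counts(parse_loci(example))
for pair, entry in sorted(counts.items()):
    print(pair, entry, round(distance(entry), 3))
```

To run this on a real file, pass an open file handle to `parse_loci` in place of the example lines. Dividing the number of mismatching sites by the number of shared sites gives an uncorrected pairwise distance; other distance measures are possible and may be what the original script reports.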