Data from: Rarity and incomplete sampling in DNA-based species delimitation

Ahrens D, Fujisawa T, Krammer H, Eberle J, Fabrizi S, Vogler AP

Date Published: January 19, 2016

DOI: http://dx.doi.org/10.5061/dryad.h6q11

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data. CC0; Open Data.

Title MS_Ahrensetal_SupplementFigure1
Downloaded 26 times
Description Supplementary Fig. 1. Map of collecting sites (numbers refer to Supplementary Table 1).
Download MS_Ahrensetal_SupplementFigure1.pdf (182.5 Kb)
Title MS_Ahrensetal_SupplementFigure2
Downloaded 25 times
Description Supplementary Fig. 2. Ultrametric tree of the southern African Sericini species showing tip labels for each haplotype, branch support values (aLRT) as well as the principal clades analysed separately.
Download MS_Ahrensetal_SupplementFigure2.pdf (633.0 Kb)
Title MS_Ahrensetal_SupplementFigure3
Downloaded 27 times
Description Supplementary Fig. 3. The fit of the GMYC model to cox1 data of the subclades (A, C, E, G, M, Q, R) and the complete Sericini data set (All). Top panels: LTT plot with GMYC single threshold time. Middle panels: likelihood surface and best solution of the GMYC model. Bottom panels: likelihood-time relationship.
Download MS_Ahrensetal_SupplementFigure3.pdf (197.9 Kb)
Title MS_Ahrensetal_SupplementFigure4b
Downloaded 28 times
Description Supplementary Fig. 4. Match ratio of the cumulative GMYC subclade analysis on empirical data (Sericini) with respect to the number of sampled species, with alternative accumulation orders of subclades (from bottom to top and the inverse: sets 1 and 2) and the respective pLRT values.
Download MS_Ahrensetal_SupplementFigure4b.pdf (261.8 Kb)
Title MS_Ahrensetal_SupplementFigure5
Downloaded 22 times
Description Supplementary Fig. 5. Comparison of the performance of the subclades' distance-based cluster analyses (A, C, E, G, M, Q, R) with that of the complete Sericini data set (All). X-axis: threshold divergence (%), Y-axis: number of species. Blue graph – estimated species number; pink graph – number of species matching the a priori species assignments.
Download MS_Ahrensetal_SupplementFigure5.pdf (228.9 Kb)
Title MS_Ahrensetal_SupplementFigure6
Downloaded 19 times
Description Supplementary Fig. 6. Example of the simulated trees with increased species samples distributed according to a log-normal distribution with mean 5.
Download MS_Ahrensetal_SupplementFigure6.pdf (13.18 Kb)
Title MS_Ahrensetal_SupplementFigure7
Downloaded 27 times
Description Supplementary Fig. 7. Mean match ratio for the different numbers of sampled species under the random, clustered and clade-wise GMYC sampling simulations, for simulation schemes with constant sample size and Ne (sd = 0; cross), with variable sample size but constant Ne (simple line), with variable Ne but constant sample size (square), and with variable Ne and sample size (triangle), assuming a median proportion of singleton species of 13% (red), 40% (green) and 52% (orange) (sd = 1, 1.5, or 2, respectively).
Download MS_Ahrensetal_SupplementFigure7.pdf (69.33 Kb)
Title MS_Ahrensetal_SupplementFigure8
Downloaded 19 times
Description Supplementary Fig. 8. Lumping and oversplitting behavior of the GMYC model in simulations in relation to the sampling bias: ratio of GMYC vs true species compared for the different sampling schemes for constant Ne and constant sample size (sd=0).
Download MS_Ahrensetal_SupplementFigure8.pdf (143.8 Kb)
Title MS_Ahrensetal_SupplementFigure9
Downloaded 22 times
Description Supplementary Fig. 9. Relation of AIC confidence sets (GMYC) from simulations to the number of sampled species (above) and to the match ratio (GMYC entities vs. true species; below) in the framework of the different sampling schemes.
Download MS_Ahrensetal_SupplementFigure9.pdf (1.031 Mb)
Title MS_Ahrensetal_SupplementTable1
Downloaded 33 times
Description Supplementary Table 1. Sampling site data as given for the localities; collection site numbers refer to the plots in Supplementary Fig. 1.
Download MS_Ahrensetal_SupplementTable1.pdf (96.37 Kb)
Title MS_Ahrensetal_SupplementTable2
Downloaded 27 times
Description Supplementary Table 2. GenBank accession numbers for the data set, including voucher number (*numbers without “DA” refer to the BMNH voucher codes used at the NHM London), shortcut, and locality information.
Download MS_Ahrensetal_SupplementTable2.pdf (224.6 Kb)
Title MS_Ahrensetal_SupplementTable3
Downloaded 25 times
Description Supplementary Table 3. Model outputs of GMYC modeling with the empirical data: Likelihoods for null hypothesis (L0; i.e., no shift in branching rate) and GMYC (LGMYC) models, their likelihood ratio (LR) and its significance (pLRT, evaluated using a chi-square test with 3 degrees of freedom to compare GMYC and null hypothesis models), and the threshold genetic distance (T).
Download MS_Ahrensetal_SupplementTable3.pdf (72.27 Kb)
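The pLRT column described for Supplementary Table 3 can be reproduced from the two likelihood columns. The sketch below is an illustration, not the authors' code: it assumes that L0 and LGMYC are log-likelihoods and that LR is the usual test statistic 2 × (LGMYC − L0); the function names and the input values in the usage note are invented for the example. The chi-square upper tail for 3 degrees of freedom is written out in closed form so that only the Python standard library is needed.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    # Closed form for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


def gmyc_lrt(log_l0: float, log_l_gmyc: float) -> tuple:
    """Return (LR, pLRT) for a null vs. GMYC comparison.

    Assumption: both inputs are log-likelihoods and LR = 2 * (LGMYC - L0).
    """
    lr = 2.0 * (log_l_gmyc - log_l0)
    return lr, chi2_sf_df3(lr)
```

For example, `gmyc_lrt(-100.0, -95.0)` (hypothetical values) returns the statistic and its p-value as a pair; a small p-value indicates that the GMYC model fits better than the null model of no shift in branching rate.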
Title Nexus_Files_datasets
Downloaded 16 times
Description Sericini data: full and subclade data sets
Download Nexus_Files_datasets.zip (110.8 Kb)
Title Examples_simulated.trees
Downloaded 16 times
Description Examples of simulated trees for random, clustered, and clade-wise sampling
Download Examples_simulated.trees.zip (17.01 Mb)
Title Code
Downloaded 14 times
Description Code for simulating gene trees within a species tree with sampling.
Requirements: the R packages "ape" and "apTreeshape", and SIMCOAL (SIMCOAL must be in your working directory).
Scripts:
   simulation.R: example code for the simulations used in the manuscript
   gene.tree.simulations.R: functions used to run SIMCOAL simulations, sample species and simulate species trees
   simcoal.functions.R: functions for running SIMCOAL from R
   sample.lineage.R: functions for sampling species from trees
Usage: "gene.tree.simulations.R", "simcoal.functions.R" and "sample.lineage.R" contain the functions used to run simulations; "simulation.R" calls these functions and runs a simulation. Put the four R script files and SIMCOAL in one directory, then set your working directory to it (setwd("...")). Run simulation.R with source("./simulation.R"), or copy and paste the code.
Download Code.zip (6.868 Kb)
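Before sourcing simulation.R, it can help to confirm that the working directory holds everything the description asks for. The following is a minimal sketch of such a check, not part of the package: the helper name `missing_files` is hypothetical, and because the listing does not state the SIMCOAL executable's file name, it has to be supplied by the caller.

```python
from pathlib import Path

# The four R scripts named in the package description.
R_SCRIPTS = (
    "simulation.R",
    "gene.tree.simulations.R",
    "simcoal.functions.R",
    "sample.lineage.R",
)


def missing_files(directory, simcoal_name):
    """List required files that are absent from `directory`.

    `simcoal_name` is the file name of your SIMCOAL executable; it is an
    assumption supplied by the caller, since the listing does not give it.
    """
    base = Path(directory)
    required = R_SCRIPTS + (simcoal_name,)
    return [name for name in required if not (base / name).is_file()]
```

An empty list means the directory is ready. The simulation can then be started from R as described (setwd("...") followed by source("./simulation.R")), or from a shell in that directory with `Rscript simulation.R`.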

When using this data, please cite the original publication:

Ahrens D, Fujisawa T, Krammer H, Eberle J, Fabrizi S, Vogler AP (2016) Rarity and incomplete sampling in DNA-based species delimitation. Systematic Biology 65(3): 478-494. http://dx.doi.org/10.1093/sysbio/syw002

Additionally, please cite the Dryad data package:

Ahrens D, Fujisawa T, Krammer H, Eberle J, Fabrizi S, Vogler AP (2016) Data from: Rarity and incomplete sampling in DNA-based species delimitation. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.h6q11