Data from: The propagation of admixture-derived adaptive radiation potential

Cite this dataset

Kagawa, Kotaro; Seehausen, Ole (2020). Data from: The propagation of admixture-derived adaptive radiation potential [Dataset]. Dryad. https://doi.org/10.5061/dryad.nzs7h44pv

Abstract

Adaptive radiations frequently show remarkable repeatability where single lineages undergo multiple independent episodes of adaptive radiation in distant places and at widely separated timepoints. Increasing evidence suggests that genetic variation generated through hybridization between distantly related lineages can promote adaptive radiation. This mechanism, however, requires rare coincidence in space and time between the hybridization event and opening of ecological opportunity, because hybridization generates large genetic variation only in the site where it occurred and the elevated genetic variation will persist only for a short period. Hence, hybridization seems unlikely to explain recurrent adaptive radiation in the same lineage. Contrary to these expectations, our evolutionary computer simulations demonstrate that admixture variation can geographically spread and persist for long periods if certain conditions are present such that ecological and/or geographic mechanisms split the hybrid population into isolated sub-lineages. Subsequent secondary hybridization of some of these can reestablish genetic polymorphisms from the ancestral hybridization in places far from the birthplace of the hybrid-clade and long after the ancestral hybridization event. Consequently, simulations revealed conditions where exceptional genetic variation, once generated through a rare and unlikely hybridization event, can facilitate multiple adaptive radiations exploiting ecological opportunities available at distant points in time and space.

Methods

Source code of the individual-based computer simulation model and final output files from simulations.

Usage notes

Code and data for "The propagation of admixture-derived adaptive radiation potential" by Kotaro Kagawa & Ole Seehausen.

Contents:
(1) Java source code files for the individual-based simulation (directory: IBM/model/src)
(2) Input files for the simulation program (directories: IBM/IniPop; IBM/model/configs)
(3) R code for analysis and visualization of simulation results (directory: Rcodes)
(4) Final output files from simulations for each figure in the main text and the supplementary information (directory: SimulationResults)
(5) Examples of raw simulation output files illustrating how the R code works (directory: Example_RawSimulationOutput)
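
Before running anything, it may help to confirm that the unpacked archive has the directory layout listed above. The following is a minimal sketch, not part of the dataset: the class DatasetLayoutCheck and its method names are hypothetical, and it assumes the dataset root is passed as the first argument (or is the working directory).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the dataset): reports which of the
// directories named in the contents list are missing under a given root.
final class DatasetLayoutCheck {

    // Directory names copied from the contents list.
    static final List<String> EXPECTED_DIRS = List.of(
            "IBM/model/src",
            "IBM/IniPop",
            "IBM/model/configs",
            "Rcodes",
            "SimulationResults",
            "Example_RawSimulationOutput");

    // Returns the expected directories that do not exist under root.
    static List<String> missingDirs(Path root) {
        List<String> missing = new ArrayList<>();
        for (String dir : EXPECTED_DIRS) {
            if (!Files.isDirectory(root.resolve(dir))) {
                missing.add(dir);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the dataset root is the first argument, else the working directory.
        Path root = Path.of(args.length > 0 ? args[0] : ".");
        List<String> missing = missingDirs(root);
        if (missing.isEmpty()) {
            System.out.println("All expected directories are present.");
        } else {
            missing.forEach(d -> System.out.println("Missing: " + d));
        }
    }
}
```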

Instructions for running the Java program for the individual-based simulation
All Java source code files for the individual-based simulation are in the directory IBM/model/src. Before running the simulation, compile all Java source files (Functions.java, MersenneTwisterFast.java, Organism.java, Parameters.java, Population.java, Run_AfterHybrid.java, Run_BeforeHybrid.java, and Simulation.java) with the Java Development Kit (JDK) or OpenJDK. Note: compile Organism.java first; compiling Run_BeforeHybrid.java and Run_AfterHybrid.java afterwards then compiles all the source files.
To simulate the allopatric genomic evolution of the two parental lineages before hybridization, execute Run_BeforeHybrid. The directory IBM/IniPop contains output files from these simulations (30 replications). These files are required as input for simulations of hybridization between the parental lineages and the evolutionary dynamics that follow, which are run by executing Run_AfterHybrid.
Simulation conditions, including the simulation scenario (the spatially repeated AR scenario or the temporally repeated AR scenario) and other parameter values, are controlled by text files in the directory IBM/model/configs. The simulation program automatically reads parameter values from files in this directory, so the hierarchical structure of the directories should not be changed. The directory IBM/model contains configs for running the simulations behind each figure in the paper. For example, the directory IBM/configs_SpatiallyRepAR_2Corridors(Fig2,3,S3-7) contains input files for simulating the spatially repeated AR scenario with two isolated corridors without environmental heterogeneity. To run simulations under the same conditions as in Fig. 3, replace all files in the directory IBM/model/configs with the files in this directory. When Run_AfterHybrid is executed, the program runs simulations with all combinations of parameter values listed in IBM/configs/parameter_tables/parameter_table.txt.
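
To illustrate what "all combinations of parameter values" means, here is a minimal sketch of a Cartesian-product expansion. It is not taken from the simulation source code: the class ParameterGrid, the parameter names, and the values are all invented for illustration, and the actual format of parameter_table.txt is not described here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration (not the authors' code): expands a table of
// candidate values per parameter into every combination.
final class ParameterGrid {

    // Returns the Cartesian product of the value lists, one map per combination.
    static List<Map<String, String>> combinations(Map<String, List<String>> table) {
        List<Map<String, String>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>()); // start with one empty combination
        for (Map.Entry<String, List<String>> entry : table.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : result) {
                for (String value : entry.getValue()) {
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // Parameter names and values below are made up for this example.
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("migrationRate", List.of("0.001", "0.01"));
        table.put("mutationRate", List.of("1e-6", "1e-5", "1e-4"));
        for (Map<String, String> combo : combinations(table)) {
            System.out.println(combo); // 2 x 3 = 6 combinations in total
        }
    }
}
```

With two values for one parameter and three for another, the sketch yields six runs, which is why the number of simulations grows quickly as more values are listed.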
To run simulations with the parameter combinations shown in the figures, replace parameter_table.txt with the file for the corresponding parameter combinations; for example, the parameter combinations for Fig. 3 are listed in parameter_table_Fig3,S3,S6.
The simulation program automatically generates a folder named Results containing many text files, which are the raw output files of the simulation. These files serve as input for the R code that analyzes the simulation results. The directory Example_RawSimulationOutput contains raw simulation output files from 5 simulation replications under a subset of the simulation conditions that we explored. These are intermediate files to be processed by the R code; we did not upload the raw outputs of all our simulation runs because their total size far exceeds the size limit that the repository accepts for a single dataset.
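
Replacing the contents of the configs directory with a figure-specific set of files can be done by hand or scripted. The sketch below is one possible way to script it and is not part of the dataset: the class ConfigSwap is hypothetical, both directories are supplied by the user, and files are overwritten rather than the target being emptied first, so files that exist only in the target are left in place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of the dataset): copies a figure-specific
// config directory over the active configs directory.
final class ConfigSwap {

    // Copies every file under sourceDir into targetDir, keeping the relative
    // directory structure and overwriting files of the same name.
    // Returns the number of files copied.
    static int copyConfigs(Path sourceDir, Path targetDir) {
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            List<Path> entries = walk.collect(Collectors.toList());
            int copied = 0;
            for (Path entry : entries) {
                Path destination = targetDir.resolve(sourceDir.relativize(entry).toString());
                if (Files.isDirectory(entry)) {
                    Files.createDirectories(destination);
                } else {
                    Files.createDirectories(destination.getParent());
                    Files.copy(entry, destination, StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
            return copied;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: args[0] is the figure-specific config directory and
        // args[1] is the active configs directory read by the simulation.
        if (args.length != 2) {
            System.err.println("usage: java ConfigSwap <sourceDir> <targetDir>");
            return;
        }
        int copied = copyConfigs(Path.of(args[0]), Path.of(args[1]));
        System.out.println("Copied " + copied + " file(s).");
    }
}
```

Keeping the relative structure intact matters here because the simulation program expects the hierarchical structure of the configs directory to be unchanged.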

Instructions for running the R code
Species_Count.R analyzes the raw output files of the simulation in the directory Results, which is automatically generated by the simulation program. The analysis counts the number of phenotypically distinct and reproductively isolated incipient species and the number of genetically distinct clusters of individuals. During the analysis, the program calls functions in Species_Count_Function.R and GeneticCluster_Count_Function.R. It analyzes all simulation results in the directory Results and produces a single file named Results_Integrated.txt, which aggregates the results of all simulation runs.
Plot_ParameterEffects.R visualizes the results of many simulation runs with systematically varied parameter values (e.g. Figs. 3 and 5), as summarized in Results_Integrated.txt. The directory SimulationResults contains the final output files, Results_Integrated.txt, from all our simulation runs.
Plot_EvoDynamics.R visualizes the evolutionary dynamics of single simulation runs (e.g. Figs. 2 and 4). It requires populations_x_.txt (x = 0, 1, 2, ..., n - 1) in the directory Results as input files.
Histgram_Parental_Mutations.R produces histograms of the phenotypic effect values of derived alleles that have become fixed in the genome of each of the two parental lineages before they hybridize (Fig. S1b, d).
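
Because Plot_EvoDynamics.R needs an unbroken series of populations_x_.txt files, a quick check of the Results directory before plotting may save time. The following is a minimal sketch and not part of the dataset: the class PopulationFiles is hypothetical, and it assumes the files are named literally populations_0_.txt, populations_1_.txt, and so on.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper (not part of the dataset): looks for gaps in the
// numbered populations_x_.txt files of a results directory.
final class PopulationFiles {

    // Assumed naming scheme: populations_0_.txt, populations_1_.txt, ...
    private static final Pattern NAME = Pattern.compile("populations_(\\d+)_\\.txt");

    // Returns the indices between 0 and the highest index found that have no
    // matching file. An empty list means no gaps (or no matching files at all).
    static List<Integer> missingIndices(Path resultsDir) {
        TreeSet<Integer> found = new TreeSet<>();
        try (Stream<Path> files = Files.list(resultsDir)) {
            files.forEach(p -> {
                Matcher m = NAME.matcher(p.getFileName().toString());
                if (m.matches()) {
                    found.add(Integer.parseInt(m.group(1)));
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<Integer> missing = new ArrayList<>();
        if (!found.isEmpty()) {
            for (int i = 0; i < found.last(); i++) {
                if (!found.contains(i)) {
                    missing.add(i);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the results directory is the first argument, else "Results".
        Path dir = Path.of(args.length > 0 ? args[0] : "Results");
        System.out.println("Missing indices: " + missingIndices(dir));
    }
}
```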