Data for: Gene birth in a model of nongenic adaptation
Abstract
Over evolutionary timescales, genomic loci switch between functional and nonfunctional states through processes such as pseudogenization and de novo gene birth. Here we ask about the likelihood and rate of functionalization of nonfunctional loci. We simulate an evolutionary model to look at the contributions of mutations and structural variation using biologically reasonable distributions of mutational effects. We find that a wide range of mutational effects are conducive to functionalization, thus indicating the ubiquity of this process. During functionalization, loci transition from a mutation-dominated 'learning' phase to a selection-dominated adaptation phase. Interestingly, in the special case of de novo gene birth, whereby nonfunctional loci begin to express a functional product, we find that expression level changes lead to rare, extreme jumps in fitness, whereas sustained adaptation is driven by product functionality. Our work supports the idea that the potential for adaptation is spread widely across the genome, and our results offer mechanistic insights into the process of de novo gene birth.
README: Data for: Gene birth in a model of nongenic adaptation
https://doi.org/10.5061/dryad.fbg79cnxx
The files below contain the code used for data generation and analysis, and the processed data used in the associated manuscript titled 'Gene birth in a model of nongenic adaptation'.
All code is written in Python 3.10.
Code (.py files):
 'model2_March2022.py' can be used to generate raw data for high mutation rates
 'analyse_model2.py' can be used to organize raw data produced by 'model2_March2022.py' into dataframes (outputs 'df_genebirth.pickle') and to generate the plots in the associated manuscript from this data.
 'model3_May2023.py' can be used to generate raw data for low mutation rates
 'analyse_model3.py' can be used to organize raw data generated by 'model3_May2023.py' (outputs 'lomut_df_genebirth.pickle').
 'get_manuscript_Fig3_4.py' can be used to generate plots for data processed with 'analyse_model3.py'
 'pop_dyn_btw_mutations.py' can be used to estimate mutant fixation probabilities between model timesteps
parameter value lists (.pickle files):
 'model3_param_all.pickle': each row corresponds to parameter values for 'model3_May2023.py'
 'estimate_pfix_paramlist.pickle': each row corresponds to parameter values for 'pop_dyn_btw_mutations.py'
Data (.pickle files):
 'fitness_trajectories_0.001_0.01_0.75_0.3_0_1000_1000.pickle': raw data (Python dictionary) for Chlamy parameters run with high mutation rate
 'lomut_fitness_trajectories_0.001_0.01_0.75_0.3_0_1000_1000.pickle': raw data (Python dictionary) for Chlamy parameters run with low mutation rate
 'exp_adapt_powerlaw_.001_0.01_0.75_0.3_0_1000_1000.pickle': Python dictionary that contains expression level and adaptive value trajectories for all individuals in simulated populations with Chlamy parameters and high mutation rate.
 'df_concat.pickle': concatenated processed raw data which is an input for 'analyse_model2.py'
 'df_genebirth.pickle': data from analysis of processed data generated by 'analyse_model2.py'.
 'lomut_df_concat.pickle': concatenated processed raw data which is an input for 'analyse_model3.py'
 'lomut_df_genebirth.pickle': data from analysis of processed data generated by 'analyse_model3.py'.
Methods
The data were generated by simulations written in Python.
Usage notes
The data is in Python's pickle format.
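A minimal sketch of how one of the .pickle files might be opened with the standard library is given here. The helper names `load_pickle` and `describe` are illustrative and are not part of the deposited scripts, and the internal layout of each file (dictionary keys, dataframe columns) is not documented in this README, so the sketch only reports the top-level type. Files described above as dataframes will presumably need pandas installed to unpickle. As with any pickle file, load only files from a source you trust.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Load a single pickled object from `path`.

    Unpickling can execute arbitrary code, so only use this on trusted files.
    """
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def describe(obj, max_keys=5):
    """Return a short, human-readable summary of a loaded object."""
    if isinstance(obj, dict):
        # Show only the first few keys; the raw-data dictionaries may be large.
        keys = list(obj)[:max_keys]
        return f"dict with {len(obj)} keys, first keys: {keys}"
    return type(obj).__name__


# Example usage (assumes the file sits in the current working directory):
# trajectories = load_pickle(
#     "fitness_trajectories_0.001_0.01_0.75_0.3_0_1000_1000.pickle"
# )
# print(describe(trajectories))
```

Inspecting the top-level type first is a cautious way to find out whether a given file holds a dictionary, a dataframe, or a list before passing it to the analysis scripts.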