Evolution of novel mimicry polymorphisms through Haldane’s sieve and rare recombination
Data files
Apr 29, 2025 version files (124.98 MB in total):
- README.md (11.58 KB)
- SequenceAlignmentsAndTime-calibratedTrees.zip (124.97 MB)
Abstract
Origins of phenotypic novelty represent a paradox. Maintenance of distinct, canalized morphs usually requires a complex array of polymorphisms, whose co-retention requires a genetic architecture resistant to recombination, involving inversions and master regulators. Here, we reveal how such a constraining architecture can still accommodate novel morphs in evolving polymorphisms using the classic polymorphic Batesian mimicry in Papilio polytes, whose supergene-like genetic architecture is maintained in a large inversion. We show that rapidly evolving alleles of the conserved gene, doublesex, within this inversion underlie the genetic basis of this polymorphism. Using a precisely dated phylogeny and breeding experiments, we show that novel adaptive mimetic morphs and underlying alleles evolved in a sequentially dominant manner, undergoing selective sweeps in the mimetic species as predicted under Haldane’s sieve. Furthermore, we discovered that mimetic forms share precise inversion breakpoints, allowing rare exon swaps between the universally dominant and a recessive allele to produce a novel, persistent intermediate phenotype, ultimately facilitating the acquisition of phenotypic novelty. Thus, genetic dominance, selective sweeps, rapid molecular divergence, and rare recombination promote novel forms in this iconic evolving polymorphism, resolving the paradox of phenotypic novelty arising even in highly constrained genetic architectures.
Dataset DOI: 10.5061/dryad.sqv9s4nf4
Corresponding author information
Name: Krushnamegh Kunte
ORCID: https://orcid.org/0000-0002-3860-6118
Affiliation: National Centre for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bengaluru 560065, India
email: krushnamegh@ncbs.res.in
Alternative contact information
Name: Riddhi Deshmukh
ORCID: https://orcid.org/0000-0002-7634-2029
Affiliation: Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
email: riddhi.deshmukh@unil.ch
Description of the data and file structure
The folder contains alignment files, partition files, and tree files for the two doublesex (dsx) phylogenies in Fig. S6, reconstructed with coding sequences (dsx_CDS_phylogeny folder) and non-coding sequences (dsx_non-coding_phylogeny folder), and for the time-calibrated phylogeny (Time-calibration folder) used to date the splits between species in Fig. S2. The two dsx folders contain the following subfolders (names differ slightly between the two folders; see the folder structure below):
1.raw (1.aligned_regions in the non-coding folder): alignment for each exon or intronic region
2.merged (2.merge in the coding folder): alignments for the concatenated sequences
3.partition_finder: files used to choose the best partition model for each dataset
4.mrbayes: files for each phylogeny run.
We removed mimetic heterozygotes from the phylogeny reconstructed from the non-coding sequences of dsx to help the analysis converge. We have provided detailed files for both runs (with and without mimetic heterozygotes). All the data used here were extracted from whole-genome sequencing data for each sequenced individual (raw data are available in the NCBI SRA database under accession numbers PRJNA1166847, PRJNA234541 and PRJNA396246). The time-calibrated tree in Fig. S2 was constructed using BEAST v2.7.4. We mined published sequences to extract four mitochondrial markers (cytochrome c oxidase subunit-I, tRNA leucine, cytochrome c oxidase subunit-II and 16S) and two nuclear markers (elongation factor-I alpha and wingless) for each species in the Menelaides group and outgroups. The alignment file is provided in NEXUS format, along with the .xml file that contains the fossil calibration and other run parameters required to reproduce the run in BEAST.
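For readers who want to see how per-region alignments relate to a concatenated one, the following is a minimal Python sketch that uses only the standard library. It is not the procedure used to build the merged files in this dataset. The function names `parse_fasta` and `concatenate_alignments` are hypothetical, and padding samples that are missing from a region with gap characters is an assumption made for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} mapping.

    The name is the first whitespace-delimited token after '>'.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def concatenate_alignments(alignments, gap="-"):
    """Join per-region alignments end to end.

    Samples absent from a region are padded with gap characters
    (an assumption for this sketch). Returns (merged, boundaries),
    where boundaries holds 1-based inclusive (start, end) column
    ranges, one per input alignment.
    """
    # Collect sample names in order of first appearance.
    names = []
    for aln in alignments:
        for n in aln:
            if n not in names:
                names.append(n)

    merged = {n: [] for n in names}
    boundaries = []
    start = 1
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment differ in length")
        width = lengths.pop()
        for n in names:
            merged[n].append(aln.get(n, gap * width))
        boundaries.append((start, start + width - 1))
        start += width
    return {n: "".join(parts) for n, parts in merged.items()}, boundaries
```

In this sketch, one alignment would be parsed per region file (for example, the text of each file in 1.raw) and the resulting list passed to `concatenate_alignments`. The returned column ranges are the kind of information a partition definition needs, although the exact syntax depends on the tool that reads it.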
Usage Notes
- Sequence and alignment files are provided as .fa or .fas files, which are standard FASTA files. The alignment used for time calibration is in .nex (NEXUS) format, which can be opened with most alignment viewers, such as MEGA and Geneious.
- The partition file in each partition_finder folder is called "best_scheme.txt" (in the analysis subfolder) and contains the partition models in standard NEXUS format, which can be read by most phylogenetic tools, including RAxML and MrBayes.
- MrBayes runs and the intermediate files for each MCMC run are in standard NEXUS format as well.
- The .xml file for the time-calibration run in BEAST is provided and is sufficient to replicate the run.
- .tre files can be visualized in tools such as FigTree (open source).
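As a quick way to get an overview of the file types described above before unpacking the archive, the following minimal Python sketch counts archive members by file extension using the standard-library `zipfile` module. The function name `count_extensions` is hypothetical, and the sketch assumes the archive has been downloaded locally.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_extensions(zip_source):
    """Count archive members by final file extension, skipping directories.

    zip_source may be a path or a binary file-like object.
    Members without an extension are counted under "(none)".
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower()
            counts[suffix or "(none)"] += 1
    return counts
```

For example, `count_extensions("SequenceAlignmentsAndTime-calibratedTrees.zip")` would return a counter keyed by extensions such as ".fas", ".nex" and ".tre". Only the final extension is used, so a MrBayes output such as merged.nex.con.tre is counted under ".tre".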
Folder structure:
SequenceAlignmentsAndTime-calibratedTrees
|
+---dsx_CDS_phylogeny
|   +---1.raw
|   |       exon1.fas
|   |       exon2.fas
|   |       exon3.fas
|   |       exon4.fas
|   |       exon5.fas
|   +---2.merge
|   |       merged.fas
|   |       merged2.nexus
|   |       merged2.nexus.ckp
|   |       merged2.nexus.ckp~
|   |       merged2.nexus.mcmc
|   |       merged2.nexus.run1.p
|   |       merged2.nexus.run1.t
|   |       merged2.nexus.run2.p
|   |       merged2.nexus.run2.t
|   |       merged_new.fas
|   +---3.partition_finder
|   |   |   Arlequin_log.txt
|   |   |   log.txt
|   |   |   merged.arp
|   |   |   merged.fas
|   |   |   merged.phy
|   |   |   names.txt
|   |   |   partition_finder.cfg
|   |   |   partition_finder.sh
|   |   |   randseed.txt
|   |   +---analysis
|   |   |   |   best_scheme.txt
|   |   |   +---cfg
|   |   |   |       oldcfg.bin
|   |   |   +---phylofiles
|   |   |   +---schemes
|   |   |   |       scheme_data.csv
|   |   |   |       start_scheme.txt
|   |   |   |       step_1.txt
|   |   |   |       step_10.txt
|   |   |   |       step_11.txt
|   |   |   |       step_12.txt
|   |   |   |       step_2.txt
|   |   |   |       step_3.txt
|   |   |   |       step_4.txt
|   |   |   |       step_5.txt
|   |   |   |       step_6.txt
|   |   |   |       step_7.txt
|   |   |   |       step_8.txt
|   |   |   |       step_9.txt
|   |   |   +---start_tree
|   |   |   |       filtered_source.phy
|   |   |   |       filtered_source.phy.reduced
|   |   |   |       filtered_source.phy_phyml_tree.txt
|   |   |   |       partitions.txt
|   |   |   |       partitions.txt.reduced
|   |   |   |       RAxML_binaryModelParameters.BLTREE
|   |   |   |       RAxML_binaryModelParameters.fastTREE
|   |   |   |       RAxML_fastTree.fastTREE
|   |   |   |       RAxML_info.BLTREE
|   |   |   |       RAxML_info.fastTREE
|   |   |   |       RAxML_log.BLTREE
|   |   |   |       RAxML_parsimonyTree.fastTREE
|   |   |   |       RAxML_result.BLTREE
|   |   |   |       source.phy
|   |   |   +---subsets
|   |   |           214766f5e405ef5c397769eb1fbf74bb.txt
|   |   |           bb4ac91cef178dff5b6c93c25fefd520.txt
|   |   |           data.db
|   |   |           ee576ff1b4b87ec893d37b7123155504.txt
|   |   +---merged.res
|   |           Arlequin_log.txt
|   |           merged.htm
|   |           merged.js
|   |           merged_main.htm
|   |           merged_tree.htm
|   +---4.mrbayes
|           coding_dsx_modified.tree.pdf
|           coding_dsx_unmodified.tree.pdf
|           config.nex
|           dsx_CDS_trees
|           merged.nex
|           merged.nex.ckp
|           merged.nex.ckp~
|           merged.nex.con.tre
|           merged.nex.lstat
|           merged.nex.mcmc
|           merged.nex.parts
|           merged.nex.pstat
|           merged.nex.run1.p
|           merged.nex.run1.t
|           merged.nex.run2.p
|           merged.nex.run2.t
|           merged.nex.trprobs
|           merged.nex.tstat
|           merged.nex.vstat
|           merged.phy
|           Nexus_pop_art.log
|           Nexus_pop_art.nex
|           test.tree
|           test_lab
|           test_modified.tree
|           Traits_pop_art.csv
|           Traits_pop_art_updated.csv
+---dsx_non-coding_phylogeny
|   +---without_mimetic_heterozygotes
|   |   +---4.mrbayes
|   |           config.nex
|   |           merged.fas
|   |           merged.nex
|   |           merged.nex.ckp
|   |           merged.nex.ckp~
|   |           merged.nex.con.tre
|   |           merged.nex.con.tre.pdf
|   |           merged.nex.lstat
|   |           merged.nex.mcmc
|   |           merged.nex.parts
|   |           merged.nex.pstat
|   |           merged.nex.run1.p
|   |           merged.nex.run1.t
|   |           merged.nex.run2.p
|   |           merged.nex.run2.t
|   |           merged.nex.trprobs
|   |           merged.nex.tstat
|   |           merged.nex.vstat
|   |           merged.phy
|   |           names.txt
|   +---with_mimetic_heterozygotes
|       +---1.aligned_regions
|       |       align.sh
|       |       region1_aligned.fas.best.fas
|       |       region2_aligned.fas.best.fas
|       |       region3_aligned.fas.best.fas
|       |       region4_aligned.fas.best.fas
|       |       region5_aligned.fas.best.fas
|       |       region6_aligned.fas.best.fas
|       |       region7_aligned.fas.best.fas
|       |       region8_aligned.fas.best.fas
|       |       region9_aligned.fas.best.fas
|       |       region10_aligned.fas.best.fas
|       |       region11_aligned.fas.best.fas
|       |       region12_aligned.fas.best.fas
|       |       region13_aligned.fas.best.fas
|       |       region14_aligned.fas.best.fas
|       |       region15_aligned.fas.best.fas
|       |       region16_aligned.fas.best.fas
|       |       region17_aligned.fas.best.fas
|       |       region18_aligned.fas.best.fas
|       |       region19_aligned.fas.best.fas
|       |       region20_aligned.fas.best.fas
|       |       region21_aligned.fas.best.fas
|       |       region22_aligned.fas.best.fas
|       |       region23_aligned.fas.best.fas
|       |       region24_aligned.fas.best.fas
|       |       region25_aligned.fas.best.fas
|       |       region26_aligned.fas.best.fas
|       |       region27_aligned.fas.best.fas
|       |       region28_aligned.fas.best.fas
|       |       region29_aligned.fas.best.fas
|       |       region30_aligned.fas.best.fas
|       |       region31_aligned.fas.best.fas
|       |       region32_aligned.fas.best.fas
|       |       region33_aligned.fas.best.fas
|       |       region34_aligned.fas.best.fas
|       |       region35_aligned.fas.best.fas
|       |       region36_aligned.fas.best.fas
|       |       region37_aligned.fas.best.fas
|       |       region38_aligned.fas.best.fas
|       |       region39_aligned.fas.best.fas
|       |       region40_aligned.fas.best.fas
|       |       region41_aligned.fas.best.fas
|       |       region42_aligned.fas.best.fas
|       |       region43_aligned.fas.best.fas
|       |       region44_aligned.fas.best.fas
|       |       region45_aligned.fas.best.fas
|       |       region46_aligned.fas.best.fas
|       |       region47_aligned.fas.best.fas
|       +---2.merged
|       |       merged.fas
|       +---3.partition_finder
|       |   |   log.txt
|       |   |   merged.fas
|       |   |   merged.phy
|       |   |   names.txt
|       |   |   partition_finder.cfg
|       |   |   partition_finder.sh
|       |   +---analysis
|       |       |   best_scheme.txt
|       |       +---cfg
|       |       |       oldcfg.bin
|       |       +---phylofiles
|       |       +---schemes
|       |       |       scheme_data.csv
|       |       |       start_scheme.txt
|       |       +---start_tree
|       |       |       filtered_source.phy
|       |       |       filtered_source.phy.reduced
|       |       |       filtered_source.phy_phyml_tree.txt
|       |       |       partitions.txt
|       |       |       partitions.txt.reduced
|       |       |       RAxML_binaryModelParameters.BLTREE
|       |       |       RAxML_binaryModelParameters.fastTREE
|       |       |       RAxML_fastTree.fastTREE
|       |       |       RAxML_info.BLTREE
|       |       |       RAxML_info.fastTREE
|       |       |       RAxML_log.BLTREE
|       |       |       RAxML_parsimonyTree.fastTREE
|       |       |       RAxML_result.BLTREE
|       |       |       source.phy
|       |       +---subsets
|       |               323665c1d553ef594406662910272ed8.txt
|       |               data.db
|       +---4.mrbayes
|               config.nex
|               merged.fas
|               merged.nex
|               merged.nex.ckp
|               merged.nex.ckp~
|               merged.nex.con.tre
|               merged.nex.lstat
|               merged.nex.mcmc
|               merged.nex.parts
|               merged.nex.pstat
|               merged.nex.run1.p
|               merged.nex.run1.t
|               merged.nex.run2.p
|               merged.nex.run2.t
|               merged.nex.trprobs
|               merged.nex.tstat
|               merged.nex.vstat
|               merged.phy
|               names.txt
+---Time-calibration
        MCC_HaldanesSieve_AthulyaGirishK_2023-06-16.contree
        MenelaidesConcatenatedAlignment_expanded.nex
        MenelaidesConcatenatedAlignment_expanded_operatorsModified2.xml
Access information
The raw whole-genome sequencing data are available in the NCBI SRA database (accession numbers PRJNA1166847, PRJNA234541 and PRJNA396246).
