Data from: Genomic analysis suggests that mitonuclear coevolution proceeds over rapid timescales in the Amazonian Pipra manakin complex
Data files
Jun 04, 2025 version files 174.01 KB
- 1.Whole_Genome_Sequencing_Processing.zip (8.05 KB)
- 2.Mitochondrial_Analyses.zip (17.75 KB)
- 3.Pairwise_Lineage_Analyses.zip (24.20 KB)
- 4.Codon_Evolution_Models.zip (16.40 KB)
- 5.Supplemental_Analyses.zip (21.06 KB)
- 6.Metadata_Files.zip (56.18 KB)
- README.md (30.37 KB)
Abstract
Mitonuclear coevolution is defined as reciprocal selection between the nuclear and mitochondrial genomes. It is necessary to maintain compatibility between nuclear- and mitochondrially-encoded products that interact during mitochondrial processes, including mitochondrial genome replication, transcription, and translation, and oxidative phosphorylation. Theory predicts that mitonuclear coevolution may play a crucial role in the early phases of speciation by generating strong genetic incompatibilities between recently diverged taxa that have evolved unique mitochondrial-mitonuclear haplotypes. However, the timescale over which mitonuclear coevolution proceeds remains unclear, making it difficult to definitively link this process with early speciation. Here, we test for expected genomic signals of mitonuclear coevolution across the Amazonian Pipra manakin complex, which includes recently and more deeply diverged avian lineages. Using dN/dS ratio analyses, we compared signals of positive selection in mitonuclear gene categories and functionally equivalent nuclear gene categories that do not participate in mitonuclear coevolution for each pair of Pipra lineages separately and for all the lineages simultaneously. For the ribosomal protein and aminoacyl tRNA synthetase (AARS) gene categories, we identified genomic patterns consistent with stronger positive selection in mitonuclear versus nuclear genes which is suggestive of mitonuclear coevolution having occurred across the Pipra complex. Significantly, we determined that expected genomic signals of mitonuclear coevolution could be identified between lineages that diverged as recently as 0.35-0.4 MYA. This time span is in keeping with the initial stages of avian speciation and suggests that mitonuclear coevolution may operate on a timescale that would allow it to play an important role during early speciation.
Dataset DOI: 10.5061/dryad.q83bk3jv4
Description of the data and file structure
This repository is associated with: Nikelski, E., & Weir, J. T. (2025). Genomic analysis suggests that mitonuclear coevolution proceeds over rapid timescales in the Amazonian Pipra manakin complex. Molecular Ecology. In press.
In this study, DNA was obtained from five genetically distinct Pipra manakin lineages (Pipra filicauda, Pipra aureola, Pipra aureola borbae, Pipra fasciicauda scarlatina and Pipra fasciicauda calamae) and sequenced using whole-genome sequencing. We then extracted protein-coding sequences for mitonuclear and functionally equivalent nuclear genes for each individual and assessed signals of positive selection in these gene categories using pairwise-lineage dN/dS comparisons and whole-system codon evolution models. We determined that signals of positive selection were stronger in mitonuclear gene categories, which suggests that mitonuclear coevolution has occurred uniquely within each Pipra lineage and across the Pipra manakin complex generally.
This repository contains the coding scripts and metadata files necessary to complete the analyses outlined in the above study. The files are split into six directories:
- Whole_Genome_Sequencing_Processing: This directory contains the coding files necessary to process whole-genome sequencing reads from the Pipra individuals utilized in this research project. The raw whole-genome sequencing reads are available on the NCBI SRA (BioProject: PRJNA1232061).
- Mitochondrial_Analyses: This directory contains the coding files necessary to conduct the mitochondrial genome analyses completed in this research project. In short, the coding scripts are designed to assemble a mitochondrial genome for each Pipra individual, calculate mitochondrial gene differentiation among all inter-lineage pairs of Pipra individuals and fit codon evolution models to whole-system alignments of mitochondrial genes.
- Pairwise_Lineage_Analyses: This directory contains the coding files necessary to conduct pairwise inter- and intra-lineage dN/dS ratio comparisons of mitonuclear and nuclear gene categories. In short, the coding scripts are designed to create concatenated pairwise alignments of mitonuclear and nuclear gene categories for each inter- and intra-lineage pair of Pipra individuals, calculate a dN/dS ratio for all of these alignments and then perform a bootstrapped t-test procedure to determine whether mitonuclear dN/dS ratios are significantly higher than nuclear dN/dS ratios in inter-lineage comparisons.
- Codon_Evolution_Models: This directory contains the coding files necessary to run codon evolution models on concatenated whole-system alignments of mitonuclear and nuclear gene categories. In short, the coding scripts are designed to create concatenated whole-system alignments of mitonuclear and nuclear gene categories that contain data from all Pipra individuals included in this study, fit codon evolution models to these alignments and process the outputs from these models.
- Supplemental_Analyses: This directory contains the coding files necessary to run supplemental codon evolution models described in the publication. In short, the coding scripts are designed to run codon evolution models on two subsets of the data: one where a single high genetic coverage individual represents each lineage and one where a chimeric sequence based on sequence data from all individuals within a lineage represents each lineage.
- Metadata_Files: This directory contains the metadata files necessary to complete the analyses contained within directories 1-5.
The software used in the coding scripts is: R, RStudio, fastp, Bowtie2, Samtools, BCFtools, VCFtools, Qualimap, AGAT, seqkit, faSplit, getOrganelle, MACSE2, MEGAX, MITOS2, R2DT, Paup4, Iqtree2, vcf2phylip and PAML4. All software is open source and free to access.
Files and variables
Directory: 6.Metadata_Files.zip
File: All_Samples_Species_Codes.txt
Description: This file contains a list of values ranging from "Species00" to "Species24" that were utilized during data processing in the associated study.
File: Aureola_Bam.txt
Description: This file contains a list of pathways to the bam files associated with five Aureola individuals included in the associated study.
File: aureola_pop.txt
Description: This file contains a list of the sample names associated with the five Aureola individuals included in the associated study.
File: Aureola_Sex.txt
Description: This file contains two variables. The first variable is a list of pathways to the bam files associated with five Aureola individuals included in the associated study as they appear in the Aureola_Bam.txt file. The second variable is the sexes of each Aureola individual where "M" indicates male and "F" indicates female.
File: Borbae_Bam.txt
Description: This file contains a list of pathways to the bam files associated with five Borbae individuals included in the associated study.
File: borbae_pop.txt
Description: This file contains a list of the sample names associated with the five Borbae individuals included in the associated study.
File: Borbae_Sex.txt
Description: This file contains two variables. The first variable is a list of pathways to the bam files associated with five Borbae individuals included in the associated study as they appear in the Borbae_Bam.txt file. The second variable is the sexes of each Borbae individual where "M" indicates male and "F" indicates female.
File: Calomae_Bam.txt
Description: This file contains a list of pathways to the bam files associated with five Calamae individuals included in the associated study.
File: calomae_pop.txt
Description: This file contains a list of the sample names associated with the five Calamae individuals included in the associated study.
File: Calomae_Sex.txt
Description: This file contains two variables. The first variable is a list of pathways to the bam files associated with five Calamae individuals included in the associated study as they appear in the Calomae_Bam.txt file. The second variable is the sexes of each Calamae individual where "M" indicates male and "F" indicates female.
File: Chiroxiphia_Lancolata_Ploidy.txt
Description: This file contains information relating to the ploidy of chromosomes associated with the Chiroxiphia lanceolata reference genome used in the associated study. This file is necessary for alignment of sequence reads to the reference genome.
File: Chiroxiphia_Lancolata_Translation_Table.txt
Description: This file relates the name of each scaffold within the Chiroxiphia lanceolata reference genome to the chromosome associated with that scaffold. This file contains two variables. The first variable is a list of scaffold names as they appear in the Chiroxiphia lanceolata reference genome and the second variable is a list of chromosomes written as "chr#".
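As a minimal sketch of how a two-column table of this kind might be used, the Python snippet below loads scaffold-to-chromosome pairs into a dictionary. The repository's own scripts are written in R, so this is illustrative only. The whitespace delimiter, the absence of a header row and the scaffold names in the sample text are assumptions; only the "chr#" naming comes from the description above.

```python
import io

# Hypothetical sample rows; real scaffold names come from the reference genome.
SAMPLE_TABLE = """\
scaffold_A chr1
scaffold_B chr2
scaffold_C chr3
"""

def load_translation_table(handle):
    """Return a dict mapping scaffold name -> chromosome label."""
    mapping = {}
    for line in handle:
        fields = line.split()
        if len(fields) != 2:  # skip blank or malformed lines
            continue
        scaffold, chromosome = fields
        mapping[scaffold] = chromosome
    return mapping

scaffold_to_chr = load_translation_table(io.StringIO(SAMPLE_TABLE))
print(scaffold_to_chr["scaffold_B"])
```

To read the real file, the `io.StringIO` object would be replaced with an open file handle, after checking the actual delimiter.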
File: Combined_pipra_tree_correctly_rooted_newick_rooted.txt
Description: This file contains a rooted phylogenetic tree of twenty-five Pipra individuals in newick format. This tree is necessary to run the codon evolution models described in the associated study.
File: ETC_Complex1_SU.txt
Description: This file contains a list of nuclear genes associated with the mitonuclear ETS complex I gene category in the associated study.
File: ETC_Complex2_SU.txt
Description: This file contains a list of nuclear genes associated with the nuclear ETS complex II gene category in the associated study.
File: ETC_Complex3_SU.txt
Description: This file contains a list of nuclear genes associated with the mitonuclear ETS complex III gene category in the associated study.
File: ETC_Complex4_SU.txt
Description: This file contains a list of nuclear genes associated with the mitonuclear ETS complex IV gene category in the associated study.
File: ETC_Complex5_SU.txt
Description: This file contains a list of nuclear genes associated with the mitonuclear ETS complex V gene category in the associated study.
File: Filicauda_Bam.txt
Description: This file contains a list of pathways to the bam files associated with five Filicauda individuals included in the associated study.
File: filicauda_pop.txt
Description: This file contains a list of the sample names associated with the five Filicauda individuals included in the associated study.
File: Filicauda_Sex.txt
Description: This file contains two variables. The first variable is a list of pathways to the bam files associated with five Filicauda individuals included in the associated study as they appear in the Filicauda_Bam.txt file. The second variable is the sexes of each Filicauda individual where "M" indicates male and "F" indicates female.
File: five_individual_chimeric_tree_rooted.txt
Description: This file contains a rooted phylogenetic tree of five individuals each representing a chimeric sequence of each Pipra lineage in newick format. This tree is necessary to run the supplemental codon evolution models described in the associated study.
File: five_individual_tree_rooted.txt
Description: This file contains a rooted phylogenetic tree of five high-coverage individuals each representing one Pipra lineage in newick format. This tree is necessary to run the supplemental codon evolution models described in the associated study.
File: Functional_Group.txt
Description: This file contains a list of abbreviations for the nine mitonuclear and functionally equivalent nuclear gene categories investigated for signals of positive selection in the associated study.
File: Functional_Kappa_Omega_Values_Rooted_Tree_Full.txt
Description: This file contains the metadata necessary to run iterations of codon evolution models with various starting parameter values on whole-system concatenated alignments of mitonuclear and nuclear gene categories. This file is broken into three variables. The first variable consists of a list of abbreviations for the nine mitonuclear and nuclear gene categories that models were fit to. The second variable consists of a list of starting values for the Kappa parameter in a codon evolution model. The third variable consists of a list of starting values for the Omega parameter in a codon evolution model. Using this file, each codon evolution model was assigned a gene category, a Kappa starting value and an Omega starting value.
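A three-column file of this kind can drive repeated model fits from different starting points. The Python sketch below groups starting-value pairs by gene category. It is a hedged illustration, not the repository's R code: the whitespace delimiter, the lack of a header and the category abbreviations and numbers in the sample text are all assumptions.

```python
import io
from collections import defaultdict

# Hypothetical rows: gene category, Kappa start, Omega start.
SAMPLE_ROWS = """\
CAT_A 1 0.1
CAT_A 2 0.5
CAT_B 1 0.1
CAT_B 2 0.5
"""

def read_starting_values(handle):
    """Group (kappa, omega) starting pairs by gene category."""
    runs = defaultdict(list)
    for line in handle:
        fields = line.split()
        if len(fields) != 3:  # skip blank or malformed lines
            continue
        category, kappa, omega = fields
        runs[category].append((float(kappa), float(omega)))
    return dict(runs)

starting_values = read_starting_values(io.StringIO(SAMPLE_ROWS))
for category, pairs in starting_values.items():
    for kappa, omega in pairs:
        # Each pair would seed one model fit for this category.
        print(category, kappa, omega)
```

Fitting the same model from several starting points and keeping the fit with the highest log-likelihood reduces the chance of reporting a local optimum.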
File: Illumina_PCR_FREE_Tagmentation.fasta.txt
Description: This file contains the fasta sequences of the adaptors added to whole-genome sequencing reads during library preparation that need to be trimmed as part of sequence read processing.
File: Lepidothrix_coronata_Bam.txt
Description: This file contains the pathway to a bam file for a Lepidothrix coronata individual utilized as an outgroup when constructing lineage-specific Pipra phylogenies in the associated study.
File: Lepidothrix_coronata_Sex.txt
Description: This file contains two variables. The first variable is the pathway to a bam file for a Lepidothrix coronata individual utilized as an outgroup when constructing lineage-specific Pipra phylogenies in the associated study as it appears in the Lepidothrix_coronata_Bam.txt file. The second variable is the sex of the Lepidothrix coronata individual where "M" indicates male and "F" indicates female.
File: LTM_Mitonuclear_Gene_Codes.txt
Description: This file contains the metadata necessary to extract mitonuclear and nuclear protein-coding sequences from whole-genome Pipra fasta files based on annotations associated with the Chiroxiphia lanceolata reference genome. This file consists of three variables. The first variable is a list of mitonuclear and nuclear gene names as they appear in the Chiroxiphia lanceolata reference genome annotation file. The second variable is a list of mitonuclear and nuclear gene names as they are commonly referred to in the literature. The third variable is a list of the mRNA accession codes associated with the Chiroxiphia lanceolata reference genome for each mitonuclear and nuclear gene.
File: Mito_Kappa_Omega_Values_Full.txt
Description: This file contains the metadata necessary to run iterations of codon evolution models with various starting parameter values on whole-system concatenated alignments of protein-coding mitochondrial genes. This file is broken into three variables. The first variable consists of a list of the thirteen mitochondrial genes that models were fit to. The second variable consists of a list of starting values for the Kappa parameter in a codon evolution model. The third variable consists of a list of starting values for the Omega parameter in a codon evolution model. Using this file, each codon evolution model was assigned a mitochondrial gene, a Kappa starting value and an Omega starting value.
File: Mitochondrial_Gene_codes.csv
Description: This file contains a list of the thirty-seven mitochondrial genes found in the mitochondrial genome of bilaterian animals in .csv format.
File: Mitochondrial_gene_codes.txt
Description: This file contains a list of the thirty-seven mitochondrial genes found in the mitochondrial genome of bilaterian animals in .txt format.
File: Mitochondrial_Protein_Coding_Gene_Codes.txt
Description: This file contains a list of the thirteen protein-coding mitochondrial genes found in the mitochondrial genome of bilaterian animals in .txt format.
File: Mitogenome_Metadata.csv
Description: This file contains the metadata associated with the mitogenomes assembled from whole-genome sequencing data for each Pipra individual. It consists of three variables named "Species", "Sample" and "Mitogenome". The variable "Species" contains a list of letters indicating the Pipra lineage (A = Aureola, B = Borbae, C = Calamae, F = Filicauda, S = Scarlatina). The variable "Sample" contains a list of the sample names associated with the twenty-five Pipra individuals included in the associated study. The variable "Mitogenome" contains the names of the mitogenomes assembled for each Pipra individual. Some individuals possessed multiple mitogenomes that differed by a few base pairs.
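The sketch below shows one way to decode the lineage letters and flag individuals with more than one assembled mitogenome. The column names and letter codes come from the description above. The sample and mitogenome names, and the assumption that each mitogenome occupies its own row, are hypothetical.

```python
import csv
import io
from collections import Counter

# Letter codes as listed in the file description.
LINEAGE_CODES = {
    "A": "Aureola", "B": "Borbae", "C": "Calamae",
    "F": "Filicauda", "S": "Scarlatina",
}

# Hypothetical rows; real sample and mitogenome names will differ.
SAMPLE_CSV = """\
Species,Sample,Mitogenome
A,sample_01,mito_01a
A,sample_01,mito_01b
F,sample_02,mito_02a
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    # Add a readable lineage name alongside the one-letter code.
    row["Lineage"] = LINEAGE_CODES[row["Species"]]

# Count mitogenomes per sample and keep samples with more than one.
mitogenomes_per_sample = Counter(row["Sample"] for row in rows)
multiple = [s for s, n in mitogenomes_per_sample.items() if n > 1]
print(multiple)
```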
File: MN_AARS2.txt
Description: This file contains a list of nuclear genes associated with the mitonuclear AARS gene category in the associated study.
File: MN_MitoRib.txt
Description: This file contains a list of nuclear genes associated with the mitonuclear ribosomal protein gene category in the associated study.
File: NUC_AARS1.txt
Description: This file contains a list of nuclear genes associated with the nuclear AARS gene category in the associated study.
File: Nuc_Rib.txt
Description: This file contains a list of nuclear genes associated with the nuclear ribosomal protein gene category in the associated study.
File: Pipra_Comparisons_Breakdown.csv
Description: This file summarizes all the inter-lineage Pipra comparisons completed in the associated study. It consists of two variables named "Species_1" and "Species_2". The variable "Species_1" contains a list of Pipra lineages representing the first member of an inter-lineage pair. The variable "Species_2" contains a list of Pipra lineages representing the second member of an inter-lineage pair.
File: Pipra_Mitogenome_List.txt
Description: This file contains the metadata associated with the finalized mitogenomes assembled for each Pipra individual included in the associated study. It consists of three variables named "Species", "Sample" and "Mitogenome". The variable "Species" contains a list of letters indicating the Pipra lineage (A = Aureola, B = Borbae, C = Calamae, F = Filicauda, S = Scarlatina). The variable "Sample" contains a list of the sample names associated with the twenty-five Pipra individuals included in the associated study. The variable "Mitogenome" contains the names of the finalized mitogenomes assembled for each Pipra individual.
File: Pipra_Paml_Comparisons_headers.txt
Description: This file contains the metadata necessary to calculate pairwise inter- and intra-lineage dN/dS ratios for all mitonuclear and nuclear gene categories for all combinations of the twenty-five Pipra individuals included in the associated study. It consists of ten variables named "Comparison", "Comparison_number", "Sample_1", "Individual_1", "Species_1", "Code_1", "Sample_2", "Individual_2", "Species_2" and "Code_2". The variable "Comparison" contains a list of the different inter- and intra-lineage comparisons completed in the associated study written as "Lineage1_Lineage2_Comparison". The variable "Comparison_number" contains a list of values that indicate the number of each comparison within its inter- or intra-lineage comparison category. A total of twenty-five comparisons were completed for each inter-lineage comparison category and ten for each intra-lineage comparison category. The variables "Sample_1", "Individual_1", "Species_1" and "Code_1" describe the Pipra individual that represents the first member of each inter- or intra-lineage comparison: "Sample_1" is the name associated with the individual's sequence data, "Individual_1" is the individual's sample name, "Species_1" is its lineage name and "Code_1" is the code associated with its sequence data. The variables "Sample_2", "Individual_2", "Species_2" and "Code_2" contain the same information for the second member of each comparison.
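The comparison counts follow directly from the study design of five lineages with five individuals each. The short Python sketch below enumerates the pairs to confirm the arithmetic; the individual labels are hypothetical placeholders, not the real sample names.

```python
from itertools import combinations, product

LINEAGES = ["Aureola", "Borbae", "Calamae", "Filicauda", "Scarlatina"]

# Hypothetical individual labels: five per lineage.
individuals = {lin: [f"{lin}_{i}" for i in range(1, 6)] for lin in LINEAGES}

# Inter-lineage: every individual of one lineage against every individual
# of another lineage, for each unordered pair of lineages.
inter = {
    (lin1, lin2): list(product(individuals[lin1], individuals[lin2]))
    for lin1, lin2 in combinations(LINEAGES, 2)
}

# Intra-lineage: every unordered pair of individuals within a lineage.
intra = {lin: list(combinations(individuals[lin], 2)) for lin in LINEAGES}

n_inter = sum(len(pairs) for pairs in inter.values())
n_intra = sum(len(pairs) for pairs in intra.values())
print(len(inter), n_inter)  # 10 lineage pairs, 250 comparisons
print(len(intra), n_intra)  # 5 lineages, 50 comparisons
```

The 250 inter-lineage and 50 intra-lineage pairs together account for all 300 unordered pairs of twenty-five individuals.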
File: Pipra_Paml_Comparisons.txt
Description: This file is identical to the file "Pipra_Paml_Comparisons_headers.txt", but it lacks a header.
File: Population_Names.txt
Description: This file contains a list of the five Pipra lineages included in the associated study.
File: SAMPLES_LIST.txt
Description: This file contains a list of sample names associated with the twenty-five Pipra individuals included in the associated study.
File: Scarlatina_Bam.txt
Description: This file contains a list of pathways to the bam files associated with five Scarlatina individuals included in the associated study.
File: scarlatina_pop.txt
Description: This file contains a list of the sample names associated with the five Scarlatina individuals included in the associated study.
File: Scarlatina_Sex.txt
Description: This file contains two variables. The first variable is a list of pathways to the bam files associated with five Scarlatina individuals included in the associated study as they appear in the Scarlatina_Bam.txt file. The second variable is the sexes of each Scarlatina individual where "M" indicates male and "F" indicates female.
Code/software
Directory: 1.Whole_Genome_Sequencing_Processing
File: 1.Pipra_Sequence_Processing.R
Description: This file contains the code necessary to process whole-genome sequencing reads from the 25 Pipra individuals included in the associated study. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The coding language used will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include read trimming, read alignment, genotype calling, genotype filtering and mitonuclear and nuclear gene sequence isolation. The following metadata files are necessary to complete the steps contained in this code:
- Chiroxiphia_Lancolata_Translation_Table.txt
- Chiroxiphia_Lancolata_Ploidy.txt
- Illumina_PCR_FREE_Tagmentation.fasta.txt
- SAMPLES_LIST.txt
- Aureola_Bam.txt
- Aureola_Sex.txt
- Borbae_Bam.txt
- Borbae_Sex.txt
- Calomae_Bam.txt
- Calomae_Sex.txt
- Filicauda_Bam.txt
- Filicauda_Sex.txt
- Scarlatina_Bam.txt
- Scarlatina_Sex.txt
- Population_Names.txt
- aureola_pop.txt
- borbae_pop.txt
- calomae_pop.txt
- filicauda_pop.txt
- scarlatina_pop.txt
- LTM_Mitonuclear_Gene_Codes.txt
Directory: 2.Mitochondrial_Analyses
File: 1.Pipra_Mitochondrial_Genome_Assembly_Pairwise_Analyses.R
Description: This file contains the code necessary to isolate mitochondrial genomes from the 25 Pipra individuals included in the associated study and complete pairwise mitochondrial gene differentiation analyses. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include isolation of mitochondrial genomes from whole-genome sequencing data, annotation of mitochondrial genomes, isolation and alignment of mitochondrial gene sequences and analysis of mitochondrial gene differentiation. The following metadata files are necessary to complete the steps contained in this code:
- SAMPLES_LIST.txt
- Mitochondrial_Gene_codes.txt
- Mitogenome_Metadata.csv
- Pipra_Mitogenome_List.csv
- Mitochondrial_Gene_codes.csv
File: 2.Pipra_Mitochondrial_Phylogeny_Codon_Evolution_Models.R
Description: This file contains the code necessary to run codon evolution models on alignments of 13 protein-coding mitochondrial genes that contain genetic information from the 25 Pipra individuals included in the associated study. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include running codon evolution models on protein-coding mitochondrial gene sequences and isolating parameter values from M0, M1a, M2a, M7 and M8 models with the highest log-likelihood values. The following metadata files are necessary to complete the steps contained in this code:
- Mitochondrial_Protein_Coding_Gene_Codes.txt
- Combined_pipra_tree_correctly_rooted_newick_rooted.txt
- Mito_Kappa_Omega_Values_Full.txt
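The step of isolating the highest log-likelihood fit for each model can be sketched as follows. This is a hedged Python illustration rather than the repository's R code, and the log-likelihood values are invented. The likelihood-ratio comparison of M1a against M2a with two degrees of freedom is a common convention for these site models and is an assumption here, not something this README states.

```python
import math

# Hypothetical fit results: (model, kappa_start, omega_start, lnL).
runs = [
    ("M1a", 1.0, 0.1, -1050.2),
    ("M1a", 2.0, 0.5, -1049.8),
    ("M2a", 1.0, 0.1, -1046.0),
    ("M2a", 2.0, 0.5, -1045.3),
]

def best_run_per_model(runs):
    """Keep, for each model, the run with the highest log-likelihood."""
    best = {}
    for run in runs:
        model, lnl = run[0], run[3]
        if model not in best or lnl > best[model][3]:
            best[model] = run
    return best

def lrt_df2(lnl_null, lnl_alt):
    """Likelihood-ratio statistic and chi-square p-value for 2 d.f.

    For two degrees of freedom the chi-square survival function
    is exactly exp(-x / 2), so no statistics library is needed.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    p_value = math.exp(-stat / 2.0) if stat > 0 else 1.0
    return stat, p_value

best = best_run_per_model(runs)
stat, p_value = lrt_df2(best["M1a"][3], best["M2a"][3])
print(best["M1a"], best["M2a"], round(stat, 3), round(p_value, 4))
```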
Directory: 3.Pairwise_Lineage_Analyses
File: 1.Pipra_Pairwise_DNDS_Analyses.R
Description: This file contains the code necessary to calculate a pairwise dN/dS ratio for all mitonuclear and nuclear gene categories for all inter- and intra-lineage pairs of Pipra individuals. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include creating concatenated pairwise alignments for all gene categories for all inter- and intra- lineage pairs of 25 Pipra individuals, calculating pairwise dN/dS ratios for all of these combinations, processing raw pairwise dN/dS ratio data and consolidating all of this information into figures for publication. The following metadata files are necessary to complete the steps contained in this code:
- Pipra_Paml_Comparisons_headers.txt
- LTM_Mitonuclear_Gene_Codes.txt
- MN_AARS2.txt
- MN_MitoRib.txt
- Nuc_AARS1.txt
- Nuc_Rib.txt
- ETC_Complex1_SU.txt
- ETC_Complex2_SU.txt
- ETC_Complex3_SU.txt
- ETC_Complex4_SU.txt
- ETC_Complex5_SU.txt
- Functional_Group.txt
- Pipra_Paml_Comparisons.txt
File: 2.Pipra_Pairwise_DNDS_Analyses_Bootstrapped_Ttest.R
Description: This file contains the code necessary to run bootstrapped t-tests comparing the mean pairwise dN/dS ratios of functionally equivalent mitonuclear and nuclear gene categories for all inter-lineage comparisons of Pipra. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include running a series of bootstrapped t-tests for each gene category for each inter-lineage comparison, consolidating the results from these bootstrapped t-tests and determining significance. The following metadata files are necessary to complete the steps contained in this code:
- Pipra_Paml_Comparisons_headers.txt
- Pipra_Comparisons_Breakdown.csv
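For readers unfamiliar with bootstrapped t-tests, the Python sketch below shows one common variant: a one-sided test of whether one group's mean exceeds another's, with the null distribution built by resampling mean-centred data. The exact procedure used in the R script may differ, and the dN/dS values, the number of resamples and the function names are all hypothetical.

```python
import random
import statistics

def welch_t(x, y):
    """Welch's t statistic for mean(x) - mean(y)."""
    se = (statistics.variance(x) / len(x) + statistics.variance(y) / len(y)) ** 0.5
    if se == 0:
        return 0.0  # degenerate resample with no spread
    return (statistics.mean(x) - statistics.mean(y)) / se

def bootstrap_t_test(x, y, n_boot=2000, seed=1):
    """One-sided bootstrap p-value for H1: mean(x) > mean(y)."""
    rng = random.Random(seed)
    observed = welch_t(x, y)
    pooled = statistics.mean(x + y)
    # Shift both groups to share the pooled mean so resampling reflects the null.
    x0 = [v - statistics.mean(x) + pooled for v in x]
    y0 = [v - statistics.mean(y) + pooled for v in y]
    exceed = 0
    for _ in range(n_boot):
        xb = rng.choices(x0, k=len(x0))  # resample with replacement
        yb = rng.choices(y0, k=len(y0))
        if welch_t(xb, yb) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (exceed + 1) / (n_boot + 1)

# Hypothetical pairwise dN/dS ratios for one inter-lineage comparison.
mitonuclear = [0.21, 0.25, 0.19, 0.27, 0.23, 0.24]
nuclear = [0.10, 0.12, 0.09, 0.14, 0.11, 0.13]
p_value = bootstrap_t_test(mitonuclear, nuclear)
print(p_value)
```

Fixing the random seed makes the resampling reproducible, which helps when results from many gene categories and lineage pairs are consolidated afterwards.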
Directory: 4.Codon_Evolution_Models
File: 1.Pipra_Phylogeny_Codon_Evolution_Models.R
Description: This file contains the code necessary to create whole-system concatenated alignments for mitonuclear and nuclear gene categories containing genetic information from the 25 Pipra individuals included in the associated study and to run codon evolution models on this data. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include creating a phylogeny of all 25 Pipra samples, creating whole-system alignments of mitonuclear and nuclear genes, concatenating alignments to make whole-system concatenated gene category alignments, running codon evolution models on concatenated alignments and isolating parameter values from M0, M1a, M2a, M7 and M8 models with the highest log-likelihood values. The following metadata files are necessary to complete the steps contained in this code:
- Lepidothrix_coronata_Bam.txt
- Lepidothrix_coronata_Sex.txt
- Chiroxiphia_Lancolata_Ploidy.txt
- Combined_pipra_tree_correctly_rooted_newick_rooted.txt
- LTM_Mitonuclear_Gene_Codes.txt
- All_Samples_Species_Codes.txt
- MN_AARS2.txt
- MN_MitoRib.txt
- Nuc_AARS1.txt
- Nuc_Rib.txt
- ETC_Complex1_SU.txt
- ETC_Complex2_SU.txt
- ETC_Complex3_SU.txt
- ETC_Complex4_SU.txt
- ETC_Complex5_SU.txt
- SAMPLES_LIST.txt
- Functional_Kappa_Omega_Values_Rooted_Tree_Full.txt
Directory: 5.Supplemental_Analyses
File: 1.Pipra_Phylogeny_Codon_Evolution_Models_5_High_Coverage_Individuals.R
Description: This file contains the code necessary to create concatenated alignments for mitonuclear and nuclear gene categories that include data from the five Pipra samples (one from each lineage) with the highest genetic coverage and to run codon evolution models on this data. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include creating a five individual phylogeny, creating concatenated alignments for mitonuclear and nuclear gene categories based on sequence data from high-coverage samples, running codon evolution models on this subset of the data and isolating parameter values from M0, M1a, M2a, M7 and M8 models with the highest log-likelihood values. The following metadata files are necessary to complete the steps contained in this code:
- five_individual_tree_rooted.txt
- Functional_Kappa_Omega_Values_Rooted_Tree_Full.txt
File: 2.Pipra_Phylogeny_Codon_Evolution_Models_5_Chimeric_Individuals.R
Description: This file contains the code necessary to create concatenated alignments for mitonuclear and nuclear gene categories that include data from chimeric sequences created for each Pipra lineage (one sequence per lineage) and to run codon evolution models on this data. The majority of the scripts contained in this file are written in R, but some command-line linux commands may be included where using this language is more convenient. The language will be indicated throughout the file which can be opened with any text editing software. The general steps completed in this file include creating chimeric concatenated gene sequences for each Pipra lineage for each gene category, running codon evolution models based on this subset of the data and isolating parameter values from M0, M1a, M2a, M7 and M8 models with the highest log-likelihood values. The following metadata files are necessary to complete the steps contained in this code:
- five_individual_chimeric_tree_rooted.txt
- Functional_Kappa_Omega_Values_Rooted_Tree_Full.txt
The methods employed in this study can be found in the linked 2025 Molecular Ecology publication with additional information specified in the provided scripts and data files. To summarize, DNA was extracted from twenty-five Pipra manakin tissue samples and sequenced using a whole-genome sequencing methodology. The resulting genetic data was processed, and mitonuclear and nuclear protein-coding gene sequences were then obtained for each individual. Pairwise dN/dS comparisons and maximum likelihood codon evolution models were employed to assess positive selection in mitonuclear and functionally equivalent nuclear gene categories, where stronger positive selection in mitonuclear gene categories is a putative signal of mitonuclear coevolution having occurred in the system. The scripts necessary to complete these analyses are included in this repository along with all required metadata files.
