Gene duplication captures morph-specific promoter usage in the evolution of aphid wing dimorphisms
Data files
Feb 12, 2025 version files 94.87 MB
- Data_S1_-_pisum_rnaseq_DEgenes_byinstar_andsex.csv (81.37 KB)
- Data_S2_-_pisum_rnaseq_normalized_counts.tsv (8.37 MB)
- Data_S3_-_follistatin_gene_tree_alignment.fasta (58.88 KB)
- Data_S4_-_v3_XChrWithApi-wlinsert-scaffold.fa (109.31 KB)
- Data_S5_-_HS2_v3annotation_edited_wFs3_fixed.gtf (64.20 MB)
- Data_S6_-_foxglove_flye_fs_contigs.fa (21.90 MB)
- Data_S7_-_church_fsContigs_chur.fa (124.94 KB)
- README.md (3.47 KB)
- Table_S1_-_genomic_intervals_of_follistatins_used.xlsx (11.06 KB)
- Table_S2_-_accessions_of_genomic_files.xlsx (10.53 KB)
- Table_S3_-_rnaseq_accessions.csv (705 B)
Abstract
Understanding how morphology evolves requires identifying the types of mutations that contribute to changes in development. We integrated comparative genomics and transcriptomics to reconstruct the evolution and regulation of follistatin paralogs in relation to the evolution of aphid winged and wingless morphs. We discover that different pea aphid follistatin duplicates play an essential molecular role in both the male and female wing dimorphisms, linking the genetic and environmental control of morph determination in each sex, respectively. We also find that an ancestral follistatin gene likely had multiple promoters and that the follistatin duplicates that evolved wingless-specific expression retained only the ancestral wingless-specific promoter. Our work provides a roadmap for how alternative promoter usage and subsequent gene duplication can enable the evolution of animal form.
https://doi.org/10.5061/dryad.cnp5hqcf3
Description of the data and file structure
These data include RNA-Seq output files, such as the differentially expressed genes by instar and sex and the normalized read counts. They also include the final output files for the genome annotation, the corrected scaffold, and the follistatin gene tree alignment.
Files and variables
File: Data_S1_-_pisum_rnaseq_DEgenes_byinstar_andsex.csv
Description: Sorted list of differentially expressed genes from the DESeq2 analysis.
Variables
- sex: denotes whether the samples are female or male
- instar comparison: the instar stages being compared. The numbers correspond to the stages, e.g., "l21" is the stage 2 to stage 1 comparison, with the winged and wingless morphs being contrasted
- genes: the annotation ID from the reference annotation file
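The comparison labels described above can be decoded programmatically. The following is a minimal sketch, assuming every label follows the same pattern as the "l21" example (one letter followed by two single-digit stage numbers). The function name `parse_instar_comparison` is hypothetical and not part of the dataset, and labels in other formats would need a different pattern.

```python
import re


def parse_instar_comparison(label):
    """Split a label such as "l21" into its two stage numbers.

    Assumption: one leading letter followed by exactly two digits,
    as in the "l21" example from the variable description.
    """
    match = re.fullmatch(r"[A-Za-z](\d)(\d)", label.strip())
    if match is None:
        raise ValueError(f"Unrecognized instar comparison label: {label!r}")
    # First digit is the first stage named in the comparison, second digit the other.
    return int(match.group(1)), int(match.group(2))


first_stage, second_stage = parse_instar_comparison("l21")
print(f"stage {first_stage} to stage {second_stage} comparison")
```

Raising an error on unexpected labels, rather than guessing, keeps any mismatch with the real column values visible.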
File: Data_S2_-_pisum_rnaseq_normalized_counts.tsv
Description: Output normalized read counts for pea aphid RNA-Seq data from the DESeq2 pipeline.
File: Data_S3_-_follistatin_gene_tree_alignment.fasta
Description: Alignment file for the follistatin paralogs gene tree coding region.
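As a quick sanity check on an alignment file of this kind, one might confirm that every aligned sequence has the same length. The sketch below uses only the Python standard library and a tiny in-memory example; the sequence names `seqA` and `seqB` and the helper names are hypothetical and do not come from the dataset. To use it on the real file, the `io.StringIO` object would be replaced by an open file handle.

```python
import io


def read_fasta(handle):
    """Read FASTA records from a text handle into a {name: sequence} dict."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].strip()
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def alignment_length(records):
    """Return the shared sequence length, or raise if the rows differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"Sequences have unequal lengths: {sorted(lengths)}")
    return lengths.pop()


# Hypothetical two-sequence alignment standing in for the real file.
example = io.StringIO(">seqA\nATG-CC\nGTA\n>seqB\nATGACC\nG-A\n")
records = read_fasta(example)
print(len(records), "sequences, alignment length", alignment_length(records))
```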
File: Data_S4_-_v3_XChrWithApi-wlinsert-scaffold.fa
Description: Correction of the genome annotation for the api region on the X chromosome.
File: Data_S5_-_HS2_v3annotation_edited_wFs3_fixed.gtf
Description: Modified genome annotation of the v3 pea aphid reference genome, including the scaffold from Data S4.
File: Data_S6_-_foxglove_flye_fs_contigs.fa
Description: Contig assemblies of Aulacorthum solani for the follistatin paralogs.
File: Data_S7_-_church_fsContigs_chur.fa
Description: Contig assemblies of Acyrthosiphon churchillense for the follistatin paralogs.
File: Table_S1_-_genomic_intervals_of_follistatins_used.xlsx
Description: A reference to the regions of available genomes and the intervals where follistatin paralogs were identified.
Variables
- Species corresponds to aphid species used
- Region denotes paralog
- Genome source names the reference/source sequence
- Scaffold specifies the name of the scaffold in the annotation
- Start/End gives the interval on that scaffold
- Notes include peripheral comments
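The Scaffold, Start, and End columns can be used to pull a follistatin interval out of a scaffold sequence. The sketch below assumes the coordinates are 1-based and inclusive, which this description does not state, so the convention should be checked against the table before relying on the result. The function name `extract_interval` and the short example sequence are hypothetical.

```python
def extract_interval(sequence, start, end):
    """Return the subsequence from start to end.

    Assumption: start and end are 1-based and inclusive; adjust the
    slice if the table turns out to use a different convention.
    """
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError(f"Interval {start}-{end} is outside a sequence of length {len(sequence)}")
    # Convert the 1-based inclusive interval to a 0-based half-open slice.
    return sequence[start - 1:end]


# Hypothetical scaffold sequence for illustration only.
scaffold = "ACGTACGT"
print(extract_interval(scaffold, 2, 5))
```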
File: Table_S2_-_accessions_of_genomic_files.xlsx
Description: Information on the source genomic files used in the comparative analyses.
Variables
- Species gives the Latin aphid species name
- Line gives the strain name when available
- Type gives the sequencing source
- HybPiper specifies whether the sequence was used as "queries" or "target"
- Source gives the source of the genomic files with the NCBI SRA accession when available
File: Table_S3_-_rnaseq_accessions.csv
Description: Specifies the sources of the RNA-Seq data used in the alternative splicing analysis.
Variables
- Species describes the aphid species
- Line is the strain when specified
- Source denotes where the data came from
- Dev Stages gives the nymphal stages covered by the data
- Sex specifies which samples were available
- Wing Morphs correspond to winged and wingless data
- replicates specify study design
- total samples gives the number of files available
- Notes adds additional comments
Access information
NA
