Phylogenomic analyses in Phrymaceae reveal extensive gene tree discordance in relationships among major clades
Data files
Apr 28, 2022 version files 1.21 GB
Abstract
• Premise of the study: Phylogenomic datasets using genomes and transcriptomes provide rich opportunities beyond resolving bifurcating phylogenetic relationships. Monkeyflower (Phrymaceae) is a model system for evolutionary ecology. However, it lacks a well-supported phylogeny for a stable taxonomy and for macroevolutionary comparisons.
• Methods: We sampled 24 genomes and transcriptomes in Phrymaceae and closely related families, including eight newly sequenced transcriptomes. We reconstructed the phylogeny using IQ-TREE and ASTRAL, evaluated gene tree discordance using PhyParts, Quartet Sampling, and a cloudogram, and carried out phylogenetic network analyses using PhyloNet and HyDe. We searched for whole genome duplication (WGD) events using chromosome numbers, synonymous distance, and gene duplication events.
• Key results: Most gene trees support the monophyly of Phrymaceae and each of its tribes. Most gene trees also support the tribe Mimuleae being sister to Phrymeae + Diplaceae + Leucocarpeae, with extensive gene tree discordance among the latter three. Despite the discordance, polyphyly of Mimulus s.l. is strongly supported, and no particular reticulation event among the Phrymaceae tribes is well supported. Reticulation likely occurred among Erythranthe bicolor and close relatives. No ancient WGD event was detected in Phrymaceae. Instead, small-scale duplications are among potential drivers of macroevolutionary diversification of Phrymaceae.
• Conclusions: We show that analysis of reticulate evolution is sensitive to taxon sampling and methods used. We also demonstrate that genome-scale data do not always fully “resolve” phylogenetic relationships. They present rich opportunities to investigate reticulate evolution, and gene and genome evolution involved in lineage diversification and adaptation.
Usage notes
DATA PACKAGE FROM MORALES-BRIONES ET AL. (American Journal of Botany)
Phylogenomic analyses using genomes and transcriptomes do not “resolve” relationships among major clades in Phrymaceae
This package contains the data and software outputs (FASTA files, alignments, trees, etc.).
1_transcriptomes_and_genomes:
final_filtered_transcriptomes: CDS and PEP Fasta files of filtered transcriptomes
genomes: CDS Fasta files of the genomes used
original_transcriptome_assemblies: Fasta files of original Trinity transcriptome assemblies
2_final_homologs: Tree files of final homologs in newick format
3_MO_orthologs: Tree files of final MO orthologs in newick format
4_MO_fasta_files_24tx_aln: Raw and cleaned alignments of the MO orthologs in fasta format.
5_concatenated_matrices: Concatenated alignment from 4_MO_fasta_files_24tx_aln in fasta, nexus and phylip formats.
6_phylogenetic_analyses
ASTRAL: Input and output files from ASTRAL
IQtree_concatenated: Input and output files from IQ-Tree for the concatenated alignment.
IQtree_invidivual_gene_trees: Input and output files from IQ-Tree for the individual MO ortholog alignments.
IQtree_invidivual_gene_trees_rooted: Rooted trees in newick format from IQtree_invidivual_gene_trees
Phyparts: Input and output files from PhyParts
QS: Input and output files from Quartet Sampling
cpDNA: Analyses of the chloroplast data set.
IQtree: Input and output files from IQ-Tree for the contiguous cpDNA alignment.
QS: Input and output files from Quartet Sampling
aln: Contiguous cpDNA alignment
assemblies: cpDNA assemblies using Fast-Plast
full_assemblies: Complete assemblies from 7 species.
partial_assemblies: Partial assemblies from 7 species.
hybridization_test: Hybridization analyses
hyde: Input and output files from HyDe analyses.
phylonet: Input and output files from PhyloNet analyses.
mcmc_gt: Input and output files from MCMC_GT analyses.
ml: Input and output files from InferNetwork_ML analyses.
map_dup: Input and output files from the orthogroup mapping.
treepl: Input and output files for the individual time-calibrated ortholog trees used in the cloudogram
wgd: Input and output files from analyses of the distribution of synonymous distances among gene pairs (Ks plots)
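
EXAMPLE: READING A FASTA ALIGNMENT
Many of the folders above hold sequences or alignments in fasta format. The following is a minimal sketch, not part of the original package, of how such a file might be read in Python with the standard library only. The function names read_fasta and gap_fraction are hypothetical, and treating "-" and "?" as missing data is an assumption that may need adjusting (for example, adding "N" for nucleotide data). The toy records are invented for illustration.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse fasta-formatted lines into a {sequence_name: sequence} dict.

    The name is the first whitespace-delimited token after '>'.
    Sequences wrapped over several lines are joined.
    """
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def gap_fraction(seq: str, missing: str = "-?") -> float:
    """Fraction of positions that are gaps or missing (assumed symbols)."""
    if not seq:
        return 0.0
    return sum(ch in missing for ch in seq) / len(seq)


# Toy alignment (invented data, not taken from the package).
toy = """>taxonA
ATG-CC
>taxonB
ATGACC
""".splitlines()

alignment = read_fasta(toy)
for taxon, seq in alignment.items():
    print(taxon, len(seq), round(gap_fraction(seq), 3))
```

To read a real file, pass an open file handle instead of the toy lines, e.g. read_fasta(open(path)), where path points to one of the fasta files in the package.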
If you have any questions about the data, please do not hesitate to contact Diego F. Morales-Briones at dfmoralesb@gmail.com