Code and sequence data pertaining to: A phylogenomic perspective on interspecific competition
Data files
Dec 13, 2023 version files 31.22 MB
- Gene_Trees.tar.gz
- README.md
- Single_Copy_Orthologue_Sequences.tar.gz
Abstract
Evolutionary processes may have substantial impacts on community assembly, but evidence for phylogenetic relatedness as a determinant of interspecific interaction strength remains mixed. In this perspective, we consider a possible role for discordance between gene trees and species trees in the interpretation of phylogenetic signal in studies of community ecology. Modern genomic data show that the evolutionary histories of many taxa are better described by a patchwork of histories that vary along the genome rather than a single species tree. If a subset of genomic loci harbor trait-related genetic variation, then the phylogeny at these loci may be more informative of interspecific trait differences than the genome background. We develop a simple method to detect loci harboring phylogenetic signal and demonstrate its application through a proof of principle analysis of Penicillium genomes and pairwise interaction strength. Our results show that phylogenetic signal that may be masked genome-wide could be detectable using phylogenomic techniques and may provide a window into the genetic basis for interspecific interactions.
README: Code and sequence data pertaining to: A phylogenomic perspective on interspecific competition
https://doi.org/10.5061/dryad.sj3tx96bd
Directory Structure
The top-level directory, ILSSimsDryad, is deposited through Zenodo. The two data directories are also deposited on Dryad.
The folder ILSSimsDryad/src contains the source code for our simulations.
The folder ILSSimsDryad/obsData contains the observed data (sequences, gene trees, and competition coefficients) for our study.
The folder ILSSimsDryad/conceptPlot contains the scripts for recreating our figures.
See below for more detailed descriptions of each folder. The instructions below assume that the user is already inside the top-level directory, ILSSimsDryad.
CODE
This software implements the analyses in Louw et al. 2023, "A phylogenomic perspective on interspecific competition", forthcoming in Ecology Letters. The bioRxiv preprint of the corresponding manuscript is available at https://www.biorxiv.org/content/10.1101/2023.05.11.540388v3
The code requires DendroPy to run most of the analyses: https://dendropy.org/
The source code is available in the src folder. You can install it by running "python setup.py install --user" from the command line.
Once you've installed the source, the code in the scripts folder implements the analyses shown in the figures. The visualizations were made using ggplot, and the code for these is available in the conceptPlot folder. To rerun all the analyses, uncomment the lines in scripts/run_disc_sims.sh and run the script.
DATA
Gene tree files for the eight Penicillium species are stored in obsData/Gene_Trees.
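The Dryad deposit provides the gene trees as the archive Gene_Trees.tar.gz. The following is a minimal sketch, using only the Python standard library, of how one might list the files inside such an archive before unpacking it. The function name list_archive_files is hypothetical and is not part of the deposited code.

```python
import tarfile


def list_archive_files(archive_path):
    """Return the sorted names of regular files inside a .tar.gz archive.

    Hypothetical helper for inspecting an archive such as
    Gene_Trees.tar.gz; it reads the archive index and extracts nothing.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())
```

For example, list_archive_files("Gene_Trees.tar.gz") would return the paths of the tree files stored in the archive, assuming the archive sits in the current working directory.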
Sequence data corresponding to these trees is available in obsData/Single_Copy_Orthologue_Sequences. The files in the subfolder obsData/Single_Copy_Orthologue_Sequences/Single_Copy_Spec_Labels remove the sequence ID and contain only species identifiers, but are otherwise the same as the fasta files in obsData/Single_Copy_Orthologue_Sequences/.
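The relationship between the two sets of FASTA files can be illustrated with a short sketch. It assumes, purely for illustration, that each header has the form ">species|sequence_id"; the actual header layout in the deposited files may differ, and the function name and example record are hypothetical.

```python
def species_only_headers(fasta_lines, sep="|"):
    """Yield FASTA lines with each header reduced to its species identifier.

    ASSUMPTION: headers look like '>species<sep>sequence_id'. The real
    files may use a different layout, so adjust `sep` as needed.
    """
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the text before the first separator.
            species = line[1:].split(sep, 1)[0].strip()
            yield ">" + species
        else:
            # Sequence lines pass through unchanged.
            yield line


# Made-up record, not taken from the data set.
example = [">P_hypothetical|seq001\n", "ATGC\n"]
relabeled = list(species_only_headers(example))
```

Sequence lines are left untouched, which matches the statement that the two sets of files are otherwise the same.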
Output from FungiFun v.2 (tests for pathway enrichment) is available in the folder obsData/enrichment in CSV format.
RCC values (competitive index) are available in obsData/RCCdata.txt. The three distinct numbers per species pair represent experimental replicates. We averaged the three replicates per pair for analyses in our study.
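The averaging step can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the replicates have already been parsed into (species_a, species_b, rcc) tuples, treats species pairs as ordered, and uses made-up values. The layout of RCCdata.txt itself is not assumed here.

```python
from collections import defaultdict
from statistics import mean


def average_replicates(rows):
    """Average RCC replicates for each ordered species pair.

    `rows` is an iterable of (species_a, species_b, rcc) tuples.
    ASSUMPTION: pairs are ordered, so (a, b) and (b, a) are kept separate.
    """
    grouped = defaultdict(list)
    for species_a, species_b, rcc in rows:
        grouped[(species_a, species_b)].append(float(rcc))
    return {pair: mean(values) for pair, values in grouped.items()}


# Made-up replicate values for illustration only.
rows = [
    ("sp1", "sp2", 0.2),
    ("sp1", "sp2", 0.4),
    ("sp1", "sp2", 0.6),
    ("sp2", "sp1", 1.0),
    ("sp2", "sp1", 1.0),
    ("sp2", "sp1", 1.0),
]
averaged = average_replicates(rows)
```

Each key of averaged is a species pair and each value is the mean of that pair's three replicates.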
Orthologs have ID numbers in the file names that correspond to the Orthogroup/Gene numbers in Tables 1 and S1 in our manuscript. The Gene ID numbers in Table S4 correspond to the IDs in the Penicillium chrysogenum reference genome (taxonomy ID 500485), available at uniprot.org/uniprotkb?query=(taxonomy_id:500485).
PLOTS
Figure 1 is a conceptual figure; we do not provide code to recreate it.
To remake Figure 2, run the code in conceptPlot/DiscordanceNullPower.R from within conceptPlot.
To remake Figures 3 and 4, run the code in conceptPlot/NullandEmpRDist.R from within conceptPlot.
To remake Figure 5, run the code in conceptPlot/Fig5Updated.R from within conceptPlot.
Scripts and python code
Command lines corresponding to each of the underlying analyses for these figures are included in scripts/run_disc_sims.sh. To run part of the analysis, uncomment the corresponding line and execute the script as ./run_disc_sims.sh while inside the scripts directory.
The source code for our simulations is stored in src/coalAssoc/simAndInfer.py.
Methods
Fungal sequences and interaction strengths were obtained from fungal isolates as described in Louw et al.