Comparative transcriptomics and phylostratigraphy of Argentine ant odorant receptors
Data files
May 24, 2024 version files 473.48 MB
- Collated_Proteome_FASTA.fasta (451.86 MB)
- OR_Phylostratigraphy_Final.xlsx (33.62 KB)
- Phylostratigraphy_Final.xlsx (21.57 MB)
- Phylostratigraphy_Node_Distance.csv (738 B)
- README.md (3.63 KB)
- Table_S1.xlsx (10.20 KB)
Abstract
Nestmate recognition in ants is regulated through the detection of cuticular hydrocarbons by odorant receptors (ORs) in the antennae. These ORs are crucial for maintaining the colony cohesion that allows invasive ant species to dominate colonized environments. In the invasive Argentine ant, Linepithema humile, ORs regulating nestmate recognition are thought to be present in a clade of nine-exon odorant receptors, but the identity of the specific genes remains unknown. We sought to narrow down the list of candidate genes using transcriptomics and phylostratigraphy. Comparative transcriptomic analyses were conducted on the antennae, head, thorax, and legs of Argentine ant workers. We identified a set of twenty-one nine-exon odorant receptors enriched in the antennae compared to the other tissues, allowing for downstream verification of whether they can detect Argentine ant cuticular hydrocarbons. Further investigation of these ORs could improve our understanding of the mechanisms underlying nestmate recognition and colony cohesion in ants.
README: Comparative transcriptomics and phylostratigraphy of Argentine ant odorant receptors
https://doi.org/10.5061/dryad.83bk3jb1d
Contains phylostratigraphy analysis results for PLoS ONE submission. Data were generated by comparing the L. humile proteome to thirty proteomes from L. humile and other ant and insect species via BLASTp analysis (the proteomes are listed in "Table_S1.xlsx"). Using these data, we determined that odorant receptor proteins are more likely to have recent evolutionary origins than proteins in the L. humile proteome in general. However, of the 338 taxonomically restricted proteins we found, none were odorant receptors.
Description of the data and file structure
Phylostratigraphy_Final.xlsx: Multipage Excel spreadsheet showing the BLAST results and phylostratigraphic distances of each protein in the L. humile proteome. Distance is measured in "nodes", which is the number of nodes separating L. humile and the subject species in a phylogenetic tree.
Each page contains the following columns:
qseqid - The query protein used in the BLAST search
sseqid - The subject protein found to show sequence similarity to the query protein
% identity - Percent of amino acids that are identical between the aligned query and subject sequences.
evalue - Measure of alignment significance. The number of hits of similar quality expected to be found by chance
bitscore - Measure of alignment quality. The size of the sequence database that would be required to find the current match by chance (2^bitscore)
Subject Species - Species name associated with the sseqid
Node Distance - Number of nodes separating L. humile and Subject Species in phylogenetic space.
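As an illustration of how these columns might be used together, the sketch below assigns each query protein the largest Node Distance among its accepted hits, which is the usual phylostratigraphic convention of dating a protein by its most distant detectable homolog. This is a minimal sketch, not necessarily the procedure used for the manuscript: the function name `assign_phylostrata` and the e-value cutoff of 1e-5 are assumptions, and rows are assumed to be dictionaries keyed by the column names listed above.

```python
def assign_phylostrata(hits, max_evalue=1e-5):
    """Map each qseqid to the largest Node Distance among its accepted hits.

    `hits` is an iterable of dict-like rows with the keys "qseqid",
    "evalue" and "Node Distance". The cutoff `max_evalue` is an
    illustrative assumption, not a value taken from the README.
    """
    strata = {}
    for hit in hits:
        # Skip hits that are too likely to have arisen by chance.
        if float(hit["evalue"]) > max_evalue:
            continue
        query = hit["qseqid"]
        distance = int(float(hit["Node Distance"]))
        # Keep the most distant subject species seen so far for this query.
        strata[query] = max(strata.get(query, distance), distance)
    return strata
```

Rows in this shape could be produced, for example, by loading every sheet with `pandas.read_excel(path, sheet_name=None)` and calling `to_dict("records")` on each resulting DataFrame. Query proteins with no accepted hits are simply absent from the returned dictionary.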
OR_Phylostratigraphy_Final.xlsx: Multipage Excel spreadsheet similar to Phylostratigraphy_Final.xlsx, but with results for the OR protein sequences from Smith et al. 2011.
Each page contains the following columns:
qseqid - The query protein used in the BLAST search
sseqid - The subject protein found to show sequence similarity to the query protein
% identity - Percent of amino acids that are identical between the aligned query and subject sequences.
evalue - Measure of alignment significance. The number of hits of similar quality expected to be found by chance
bitscore - Measure of alignment quality. The size of the sequence database that would be required to find the current match by chance (2^bitscore)
Subject Species - Species name associated with the sseqid
Node Distance - Number of nodes separating L. humile and Subject Species in phylogenetic space.
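The evalue and bitscore columns are related through the standard BLAST statistics, where the expected number of chance hits is E = m * n * 2^(-bitscore) for a query of length m and a database of length n. The sketch below shows this arithmetic. The function names are hypothetical, and because BLAST applies effective-length corrections to m and n, these functions should not be expected to reproduce the evalue column exactly.

```python
def search_space_for_one_chance_hit(bitscore):
    """Search-space size at which one chance hit of this score is expected."""
    return 2.0 ** bitscore


def expected_chance_hits(bitscore, query_len, db_len):
    """Approximate E-value: E = m * n * 2^(-bitscore).

    `query_len` (m) and `db_len` (n) are raw residue counts here;
    BLAST itself uses corrected "effective" lengths.
    """
    return query_len * db_len * 2.0 ** (-bitscore)
```

Each additional bit halves the expected number of chance hits for a fixed search space, which is why bitscore values can be compared across searches while evalue depends on database size.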
Phylostratigraphy_Node_Distance.csv: CSV spreadsheet listing all species used in the phylostratigraphy analysis. The spreadsheet contains the following columns:
Subject Species - Species name associated with the sseqid
Node Distance - Number of nodes separating L. humile and Subject Species in phylogenetic space.
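A species-to-distance lookup can be built from this CSV with the Python standard library. The sketch below assumes the header row uses exactly the column names given above ("Subject Species" and "Node Distance") and that distances are whole numbers; the function name `load_node_distances` is hypothetical.

```python
import csv


def load_node_distances(lines):
    """Return {species name: node distance} from CSV text lines.

    `lines` can be an open text file or any iterable of strings.
    Assumes the header names match the README's column names.
    """
    reader = csv.DictReader(lines)
    return {
        row["Subject Species"].strip(): int(row["Node Distance"])
        for row in reader
    }


# Example usage (the file must be present in the working directory):
# with open("Phylostratigraphy_Node_Distance.csv", newline="",
#           encoding="utf-8-sig") as handle:
#     distances = load_node_distances(handle)
```

Opening the file with `encoding="utf-8-sig"` guards against a byte-order mark that spreadsheet software sometimes writes at the start of a CSV, which would otherwise corrupt the first header name.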
Collated_Proteome_FASTA.fasta: FASTA file used in the BLASTp analysis. Contains the protein sequences of the thirty proteomes used in the analysis.
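Because the collated FASTA file is large (about 452 MB), it may be more practical to stream it record by record than to load it whole. The following is a minimal sketch of a streaming FASTA reader using only the standard library; the function name `iter_fasta` is hypothetical, and the layout of the header lines in this particular file is not described in the README, so headers are returned unparsed.

```python
def iter_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted text lines.

    `lines` can be an open text file or any iterable of strings.
    The header is returned without the leading ">" and is not parsed.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


# Example usage (counts the protein records in the file):
# with open("Collated_Proteome_FASTA.fasta") as handle:
#     n_records = sum(1 for _ in iter_fasta(handle))
```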
Table_S1.xlsx: Excel spreadsheet listing the species used in the phylostratigraphy analysis. The spreadsheet contains the following columns:
Species Name - Name of the species
Accession - NCBI Assembly Accession # for the proteome data for each species.
Sharing/Access information
Transcriptomic data generated for this manuscript were deposited in the NCBI SRA under BioProject PRJNA1006049 (https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA1006049).