Background: When populations evolve under disparate environmental conditions, they experience different selective pressures that shape patterns of sequence evolution and gene expression. These may be manifested in genetic and phenotypic differences such as a diverse immunogenetic repertoire in species from tropical latitudes that have greater and/or different parasite burdens than more temperate species. To test this idea, we compared the transcriptomes of one tropical species (Heteromys desmarestianus) and two species from temperate latitudes (Dipodomys spectabilis and Chaetodipus baileyi) from the Heteromyidae. We did so in a search for positive selection on sequences and/or differential expression, while controlling for phylogenetic history in our choice of species.
Results: We identified 127,812 contigs and annotated 34,878 of these, identifying immune genes associated with interleukins, cytokines, and the production of mast cells. We identified 632 genes that were upregulated in H. desmarestianus (8.7% of genes tested) and 492 (6.7%) that were downregulated. Gene ontology terms including “immune response” were associated with 31 (4.9%) of the 632 upregulated genes. We found preliminary evidence for positive selection on three genes (Palmitoyltransferase ZDHHC5 Ubiquitin-conjugating enzyme E2 N, Krueppel-like factor 10, and Spindle and kinetochore-associated protein 1) along the H. desmarestianus lineage.
Conclusions: Overall our findings pinpoint genes in species from disparate environments that are on different evolutionary trajectories in terms of expression levels and/or nucleotide sequence. Our data indicate there are significant differences in the expression of genes among the spleen transcriptomes of these species and that a number of these differentially expressed genes do not show the same pattern of differential expression in another tissue type. This points to the possibility of expression differences between these species specific to the spleen transcriptome.
Dipodomys spectabilis spleen transcriptome
Combined assembly of 454 reads and fragmented Illumina assembly (see methods of associated paper) in gsAssembler version 2.6.
D_spec_spleen_filtered.fasta
Chaetodipus baileyi spleen transcriptome
Transcriptome assembly from Illumina reads. The reads were first assembled in Trinity, and that assembly was then fragmented and reassembled in gsAssembler (see the methods of the manuscript for more details and rationale).
C_bail_spleen_filtered.fasta
Heteromys desmarestianus spleen transcriptome
Combined assembly of 454 reads and fragmented Illumina assembly (see methods of associated paper) in gsAssembler version 2.6.
H_desm_spleen_filtered.fasta
D_spec_spleen_top_blast_data
These are the preliminary BLAST descriptions for contigs from the combined assembly of Illumina reads and 454 reads pooled from 4 individual Dipodomys spectabilis spleen libraries. Descriptions are from a BLASTx search of the Swiss-Prot database at an e-value threshold of e < 1 x 10^-6. The combined assembly was conducted in gsAssembler version 2.6, and contigs shorter than 100 bp have been removed. This accounts for the gaps in contig names (e.g., if contig 2 from the gsAssembler assembly was shorter than 100 bp and was removed prior to BLAST annotation, then the first two contigs are contig 1 and contig 3).
Heteromys_desmarestianus_spleen_top_blast_data
These are the preliminary BLAST descriptions for contigs from the combined assembly of Illumina reads and 454 reads pooled from 4 individual Heteromys desmarestianus spleen libraries. Descriptions are from a BLASTx search of the Swiss-Prot database at an e-value threshold of e < 1 x 10^-6. The combined assembly was conducted in gsAssembler version 2.6, and contigs shorter than 100 bp have been removed. This accounts for the gaps in contig names (e.g., if contig 2 from the gsAssembler assembly was shorter than 100 bp and was removed prior to BLAST annotation, then the first two contigs are contig 1 and contig 3).
H_desm_spleen_top_blast_data.txt
Chaetodipus_baileyi_spleen_top_blast_data
These are the preliminary BLAST descriptions for contigs from the combined assembly of Illumina reads pooled from 4 individual Chaetodipus baileyi spleen libraries. Descriptions are from a BLASTx search of the Swiss-Prot database at an e-value threshold of e < 1 x 10^-6. The combined assembly was conducted in gsAssembler version 2.6, and contigs shorter than 100 bp have been removed. This accounts for the gaps in contig names (e.g., contig 2 from the gsAssembler assembly was 25 bp long and was removed prior to BLAST annotation, so the first two contigs are contig 1 and contig 3).
C_bail_spleen_top_blast_data.txt
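The length filter described for the three BLAST tables can be illustrated with a short Python sketch. This is not the procedure the authors ran; it is a minimal illustration that assumes plain FASTA input and hypothetical contig names of the form contig00001. It shows why contig numbering has gaps: short contigs are dropped, and the surviving contigs keep their original names.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the contig name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(records, min_len: int = 100):
    """Keep contigs of at least min_len bp; names are left untouched."""
    return [(name, seq) for name, seq in records if len(seq) >= min_len]


# Toy input with hypothetical names: contig00002 is 25 bp long and is dropped.
example = [
    ">contig00001", "A" * 120,
    ">contig00002", "C" * 25,
    ">contig00003", "G" * 150,
]
kept = filter_by_length(parse_fasta(example))
kept_names = [name for name, _ in kept]
print(kept_names)
```

Because the filter never renames records, the first two contigs in the filtered set are contig 1 and contig 3, matching the gaps described above.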
COI_Guide_TREE_Marked_H_desm
This is the tree that was used in codeml as the guide tree for the branch and branch-site tests (see text of methods). The tree is unrooted, as is required for these tests. The tropical lineage, Heteromys desmarestianus, has been marked with #1 to designate it as the foreground branch, which is assigned a different rate than the rest of the tree. D_spec = Dipodomys spectabilis, C_bail = Chaetodipus baileyi, H_desm = Heteromys desmarestianus, M_musc = Mus musculus.
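As an illustration of the #1 labelling, the Python sketch below marks one tip of a Newick tree as the foreground branch. The topology in the example is a placeholder chosen only to show the syntax and is not taken from the deposited tree file; the helper name mark_foreground is hypothetical, and the sketch assumes a tree without branch lengths.

```python
import re


def mark_foreground(newick: str, taxon: str, mark: str = "#1") -> str:
    """Append a branch label to one tip name in a Newick string."""
    # Match the taxon only as a whole tip label: preceded by '(' or ','
    # and followed by ',' or ')'. Trees with branch lengths are not handled.
    pattern = re.compile(r"(?<=[(,])(\s*)" + re.escape(taxon) + r"(?=\s*[,)])")
    marked, n = pattern.subn(lambda m: f"{m.group(1)}{taxon} {mark}", newick)
    if n != 1:
        raise ValueError(f"expected exactly one tip named {taxon!r}, found {n}")
    return marked


# Placeholder unrooted topology (trifurcation at the base), for syntax only.
tree = "((D_spec,C_bail),H_desm,M_musc);"
marked = mark_foreground(tree, "H_desm")
print(marked)
```

The label follows the tip name directly, so only the H_desm branch is treated as foreground and all other branches remain background.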
perl_script_for_alignments
This is the Perl script that was used to produce the multiple sequence alignments for the paper associated with this data package. The script has been uploaded as it was used for the analysis and would therefore need to be adapted slightly for custom analyses. It is designed to take four FASTA files and then conduct a multiple sequence alignment (MSA) on each set of 4 sequences, one from each of the four files, in order; this could be adjusted to accommodate additional or fewer taxa. The script is commented, and questions about it should be directed to Nicholas Marra. To replicate the analysis conducted for this paper, the inputs to the script should be four files from a reciprocal best BLAST analysis that have been trimmed to the start and stop codons of a full CDS region. Alignments from the script were further analyzed in codeml of PAML 4.7 to assess selection, and alignments that were significant were then visually inspected. See the paper for details; note that in the paper the fourth taxon was Mus musculus, which was necessary for obtaining CDS information.
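The pairing step described above (one sequence from each of four files, taken in order) can be sketched in Python as follows. This is not a port of the deposited Perl script: the function names, the taxon|gene naming scheme, and the toy sequences are assumptions made for illustration, and the alignment itself is left out because the aligner the script calls is not stated here. The sketch also checks that each input looks like a full CDS, as the description requires.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (sequence name, sequence)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into a list of (name, sequence) records."""
    records: List[Record] = []
    for block in text.split(">")[1:]:
        header, *seq_lines = block.splitlines()
        records.append((header.split()[0], "".join(s.strip() for s in seq_lines)))
    return records


def is_full_cds(seq: str) -> bool:
    """True if seq starts with ATG, ends with a stop codon, and is in frame."""
    seq = seq.upper()
    return len(seq) % 3 == 0 and seq.startswith("ATG") and seq[-3:] in STOP_CODONS


def ortholog_sets(files: Dict[str, str]) -> List[List[Record]]:
    """Group the i-th record of every file into one set per gene."""
    per_taxon = {taxon: read_fasta(text) for taxon, text in files.items()}
    if len({len(recs) for recs in per_taxon.values()}) != 1:
        raise ValueError("all input files must hold the same number of sequences")
    sets = []
    for group in zip(*per_taxon.values()):
        # Prefix each name with its taxon so the sets stay readable.
        sets.append([(f"{taxon}|{name}", seq)
                     for taxon, (name, seq) in zip(per_taxon, group)])
    return sets


# Toy inputs: two made-up genes per taxon, listed in the same order.
files = {
    "D_spec": ">g1\nATGAAATAA\n>g2\nATGCCCTGA\n",
    "C_bail": ">g1\nATGAAGTAA\n>g2\nATGCCATGA\n",
    "H_desm": ">g1\nATGAAATAG\n>g2\nATGCCCTGA\n",
    "M_musc": ">g1\nATGAAATAA\n>g2\nATGCCGTGA\n",
}
sets = ortholog_sets(files)
for gene_set in sets:
    assert all(is_full_cds(seq) for _, seq in gene_set)
    print([name for name, _ in gene_set])
```

Each resulting set holds one sequence per taxon and could then be passed to an aligner of choice. Supporting more or fewer taxa only requires changing the number of entries in the input dictionary.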
Spleen_R_commands_dryad
This file contains the R commands that were used to run DESeq for the differential expression analysis described in this paper. The program DESeq was introduced in the following paper: Simon Anders and Wolfgang Huber. Differential expression analysis for sequence count data. Genome Biology, 11:R106, 2010. The commands, and the information about what they do, largely derive from the DESeq manual and documentation written by Simon Anders, which are available through the Bioconductor page for DESeq.