Rising ocean temperatures associated with global climate change induce breakdown of the symbiosis between coelenterates and photosynthetic microalgae of the genus Symbiodinium. Association with more thermotolerant partners could contribute to resilience, but the genetic mechanisms controlling specificity of hosts for particular Symbiodinium types are poorly known. Here we characterize wild populations of a sea anemone laboratory model system for anthozoan symbiosis, from contrasting environments in Caribbean Panama. Patterns of anemone abundance and symbiont diversity were consistent with specialization of holobionts for particular habitats, with Exaiptasia pallida/S. minutum (ITS2 type B1) abundant on vertical substrate in thermally stable, shaded environments but E. brasiliensis/Symbiodinium sp. (ITS2 clade A) more common in shallow areas subject to high temperature and irradiance. Population genomic sequencing revealed a novel E. pallida population from the Bocas del Toro Archipelago that only harbors S. minutum. Loci most strongly associated with divergence of the Bocas-specific population were enriched in genes with putative roles in cnidarian symbiosis, including activators of the complement pathway of the innate immune system, thrombospondin-type-1 repeat domain proteins, and coordinators of endocytic recycling. Our findings underscore the importance of unmasking cryptic diversity in natural populations and the role of long-term evolutionary history in mediating interactions with Symbiodinium.
Sample_collections
This file contains GPS coordinates (LAT/LON), species assignment codes (E. brasiliensis [Ebra] or E. pallida [Epal]), the dominant clade of Symbiodinium hosted by each anemone (B1, B2 or A), and other meta-information for each sample that was genotyped. E. pallida samples are identified as representatives of the global or Bocas-specific populations (or admixed if admixture proportions >35%). If the sample was identified as a clone of one or more other samples in the dataset, this is indicated in the CLONE column by a corresponding MLG identifier (e.g. "MLG5").
Abundance
This folder contains the files necessary to recreate the analyses of anemone abundance. Bocas_transects_individual_roots.csv contains abundance data for the three sites within the Bocas del Toro archipelago, where anemones were counted on mangrove roots along 30-m transects. Each row gives the number of anemones ("Abundance_root") on a single mangrove root in a particular 30-m transect ("Transect"). The collection site (Isla Colon/Cayo Roldan/Cayo Agua) is given in the "Population" column, and the GPS coordinates taken at the start of the transect are given in the "Lat" and "Lon" columns. The distance into the transect in meters is given in the "Distance_m" column.
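The per-root layout described above can be collapsed to one summary per transect. The following is a minimal sketch, not the analysis used in the manuscript; it assumes the file is comma-separated with a header row carrying the column names listed above, and the example rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def summarize_transects(csv_file):
    """Return {(population, transect): (n_roots, total_anemones, mean_per_root)}."""
    counts = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (row["Population"], row["Transect"])
        counts[key].append(float(row["Abundance_root"]))
    return {k: (len(v), sum(v), sum(v) / len(v)) for k, v in counts.items()}


# Made-up rows (NOT real data) that mimic the documented columns.
example = io.StringIO(
    "Population,Transect,Distance_m,Abundance_root\n"
    "Isla Colon,T1,2.5,4\n"
    "Isla Colon,T1,7.0,0\n"
    "Cayo Agua,T2,1.0,3\n"
)
summary = summarize_transects(example)
for key, (n_roots, total, mean) in summary.items():
    print(key, n_roots, total, mean)
```

To run it on the real file, pass an open file handle instead of the in-memory example, e.g. open("Bocas_transects_individual_roots.csv", newline="").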
distances_bw_transects.csv: This file contains the shortest distance in meters (i.e., as the crow flies) between the start of one transect and the start of another, for transects that were conducted in continuous surveys.
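For readers who want to recompute straight-line distances from the transect start coordinates, one common approach is the haversine formula. This is a sketch under stated assumptions: the README does not say how the distances in this file were calculated, and the function name and the mean Earth radius used here are illustrative choices.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumption for this sketch)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One degree of latitude along a meridian, as a sanity check.
print(round(haversine_m(0.0, 0.0, 1.0, 0.0)))
```

Over the short distances between neighboring transects, the choice of Earth radius has a negligible effect compared with GPS error.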
SymbiodiniumDensity
Protein.csv contains absorbance values from Pierce BCA assay determination of total protein (2 replicates per sample). Columns "Abs1" and "Abs2" are the raw absorbance values. "Vol_mL" is the total volume of the supernatant. Total protein is given in both micrograms ("TotalProtein_ug") and mg ("mg"). The date of the assay is given in the "Date" column, and different batches are indicated by the capital letters appended to the date. Algal_Counts.csv contains cell count/mitotic index data from hemocytometer counts. The R script algae.R uses these files to calculate algal cell density within each anemone. Note that we collected more anemones than we genotyped via 2bRAD, so these files contain data for more samples than are present in the Sample_collections.txt file.
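As an illustration of how hemocytometer counts and total protein can be combined into a protein-normalized cell density, here is a minimal sketch. It is not a translation of algae.R. It assumes counts are taken per large hemocytometer square (0.1 µL each, hence the 1e4 factor), and the function names and the dilution_factor parameter are hypothetical.

```python
def cells_per_ml(counts, dilution_factor=1.0):
    """Cells per mL from replicate counts of large hemocytometer squares.

    Assumes each counted square holds 0.1 uL, so mean count x 1e4 = cells/mL.
    """
    mean_count = sum(counts) / len(counts)
    return mean_count * 1e4 * dilution_factor


def cells_per_mg_protein(counts, vol_ml, total_protein_ug, dilution_factor=1.0):
    """Total cells in the supernatant divided by total protein in mg."""
    total_cells = cells_per_ml(counts, dilution_factor) * vol_ml
    total_protein_mg = total_protein_ug / 1000.0  # micrograms -> milligrams
    return total_cells / total_protein_mg


# Made-up values (NOT real data): four replicate counts, 2 mL supernatant, 500 ug protein.
print(cells_per_mg_protein([10, 12, 8, 10], vol_ml=2.0, total_protein_ug=500.0))
```

Here vol_ml and total_protein_ug correspond to the "Vol_mL" and "TotalProtein_ug" columns of Protein.csv; the layout of the count columns in Algal_Counts.csv is not described here, so the counts are passed in as a plain list.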
PopulationStructure
apal.5x.nc2.mac4.vcf is a VCF file of genotypes for all E. pallida individuals after clone removal, but before filtering for loci with a minimum minor allele frequency of 5%.
pcadapt_analysis_final.R contains code to perform principal component analysis (PCA) with pcadapt. This analysis is based on a SNP matrix (SNPmat0.05.txt) of 2577 SNPs that have a minimum minor allele frequency of 5%. The names of the loci are in locusNames_0.05.txt and information about the population designation for each sample is in pop_file.csv.
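The 5% allele frequency filter can be sketched as follows. This is an illustration only, not the code used to build SNPmat0.05.txt. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with 9 marking missing calls and one SNP per row, which is an assumption about the layout rather than a description of the actual file.

```python
def minor_allele_frequency(genotypes, missing=9):
    """Minor allele frequency for one SNP; None if every call is missing."""
    called = [g for g in genotypes if g != missing]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))  # diploid: two alleles per individual
    return min(alt_freq, 1 - alt_freq)


def filter_by_maf(matrix, threshold=0.05, missing=9):
    """Return the row indices of SNPs whose minor allele frequency is >= threshold."""
    kept = []
    for i, genotypes in enumerate(matrix):
        maf = minor_allele_frequency(genotypes, missing)
        if maf is not None and maf >= threshold:
            kept.append(i)
    return kept


# Made-up genotypes (NOT real data): rows are SNPs, columns are individuals.
example = [
    [0, 0, 0, 0, 1, 9],  # one alternate allele among five called individuals
    [0, 0, 0, 0, 0, 0],  # monomorphic
    [2, 2, 1, 2, 2, 2],  # reference allele is the minor allele
    [9, 9, 9, 9, 9, 9],  # all calls missing
]
kept = filter_by_maf(example)
print(kept)
```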
Gene set enrichment analyses in ErmineJ used the Gene Annotation file bvpSNPS_erminej.txt and were based on the PC1 loadings in column 3 of PCloadings_final.txt. The file bvpSNPS_erminej.txt contains the AIPGENE models for all SNPs in apal.5x.nc2.mac4.vcf that were present in genic regions (5'UTR, CDS, intron or 3'UTR) as well as GO annotations from the Aiptasia genome v1.0.
out.weir.fst contains Weir and Cockerham FST values output by vcftools for E. pallida from the global lineage vs. the Bocas-specific lineage. These values are based on the 2577 biallelic SNPs with a minimum minor allele frequency of 5%.
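A per-site FST file of this kind can be read and ranked with a few lines of Python. The sketch below assumes the usual vcftools per-site layout, i.e. a tab-separated file with the header CHROM, POS, WEIR_AND_COCKERHAM_FST and "nan"-style entries for sites where FST is undefined. The scaffold names and values in the example are made up.

```python
import csv
import io
import math


def read_weir_fst(handle):
    """Return [(chrom, pos, fst)] for sites with a defined FST value."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        fst = float(rec["WEIR_AND_COCKERHAM_FST"])  # float() also parses "nan"/"-nan"
        if math.isnan(fst):
            continue
        rows.append((rec["CHROM"], int(rec["POS"]), fst))
    return rows


def top_loci(rows, n=10):
    """Return the n sites with the highest FST."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


# Made-up records (NOT real data) in the assumed layout.
example = io.StringIO(
    "CHROM\tPOS\tWEIR_AND_COCKERHAM_FST\n"
    "scaffold1\t100\t0.12\n"
    "scaffold1\t250\t-nan\n"
    "scaffold2\t40\t0.85\n"
    "scaffold3\t77\t-0.01\n"
)
rows = read_weir_fst(example)
best = top_loci(rows, n=2)
print(best)
```

Slightly negative per-site estimates are expected from this estimator and are kept here. A simple average of per-site values is not the same as the weighted genome-wide estimate that vcftools reports in its log.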
Sequencing
Raw reads generated for 2bRAD-Seq libraries of 155 samples. Paired-end reads are available under NCBI BioProject accession PRJNA394157. Only R1 (forward) reads were used for analysis in the manuscript.