Supergene genotypes and associated phenotypes in Formica cinerea samples
Data files
Sep 25, 2023 version files 84.48 KB
- Overview_of_Formica_cinerea_samples.xlsx
- README.md
Nov 17, 2023 version files 85.58 KB
- Overview_of_Formica_cinerea_samples.xlsx
- README.md
Abstract
Antagonistic selection has long been considered a major driver of the formation and expansion of sex chromosomes. For example, sexually antagonistic variation on an autosome can select for suppressed recombination between that autosome and the sex chromosome, leading to a neo-sex chromosome. Autosomal supergenes, chromosomal regions containing tightly linked variants affecting the same complex trait, share similarities with sex chromosomes, raising the possibility that sex chromosome evolution models can explain the evolution of genome structure and recombination in other contexts. We tested this premise in a Formica ant species wherein we identified four supergene haplotypes on chromosome 3 underlying colony social organization and sex ratio. We discovered a novel rearranged supergene variant (9r) on chromosome 9 underlying queen miniaturization. The 9r is in strong linkage disequilibrium with one chromosome 3 haplotype (P2) found in multi-queen (polygyne) colonies. We suggest that queen miniaturization is strongly disfavored in the single queen (monogyne) background, and thus socially antagonistic. As such, divergent selection experienced by ants living in alternative social ‘environments’ (monogyne and polygyne) may have contributed to the emergence of a genetic polymorphism on chromosome 9 and associated queen-size dimorphism. Consequently, an ancestral polygyne-associated haplotype may have expanded to include the polymorphism on chromosome 9, resulting in a larger region of suppressed recombination spanning two chromosomes. This process is analogous to the formation of neo-sex chromosomes and consistent with models of expanding regions of suppressed recombination. We propose that miniaturized queens, 16-20% smaller than queens without 9r, could be incipient intraspecific social parasites.
README: Overview of Formica cinerea samples
https://doi.org/10.5061/dryad.02v6wwq8s
The table "Overview of Formica cinerea samples" shows information for each of the 1415 individuals analyzed in the paper titled 'Social antagonism facilitates supergene expansion in ants'. In the paper, we describe a supergene on chromosome 3 with 4 alternative haplotypes (MA, MD, P1, P2) underlying the number of colony queens and sex ratio. A second supergene on chromosome 9 with two alternative haplotypes (9a, 9r) controls queen and male body size.
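As a quick check on the haplotype counts above, the number of distinct unordered diploid genotypes can be enumerated directly. This is a minimal sketch; the `/` separator and the function name `diploid_genotypes` are assumptions made for illustration and may not match how genotypes are written in the spreadsheet.

```python
from itertools import combinations_with_replacement

CHR3_HAPLOTYPES = ["MA", "MD", "P1", "P2"]
CHR9_HAPLOTYPES = ["9a", "9r"]


def diploid_genotypes(haplotypes):
    """Return every unordered pair of haplotypes, homozygotes included."""
    # The "/" separator is a display choice for this sketch only.
    return ["/".join(pair) for pair in combinations_with_replacement(haplotypes, 2)]


chr3_genotypes = diploid_genotypes(CHR3_HAPLOTYPES)
chr9_genotypes = diploid_genotypes(CHR9_HAPLOTYPES)
print(len(chr3_genotypes), len(chr9_genotypes))
```

Four haplotypes give 10 diploid genotypes and two haplotypes give 3; haploid males carry a single haplotype and are not covered by this enumeration.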
Description of the data and file structure
For each sample we report the specimen ID (SAMPLE ID column), the caste it belongs to (CASTE column), the colony of origin (COLONY column), the year it was collected (YEAR column), the batch in which it was sequenced (BATCH column), the supergene genotype on chromosome 3 (GENOTYPES CHR3 column), the supergene genotype on chromosome 9 (GENOTYPES CHR9 column), the head width measurement in millimeters (HW(mm) column), the social form (SOCIAL FORM column), and the colony sex ratio (SEX RATIO column). Abbreviations and other details needed to interpret the table are given below:
CASTE: the analyzed samples belong to different castes: worker (W); male (M); gyne, or virgin queen (G); newly mated queen (NQ); and mature queen (MQ).
COLONY: workers, males, gynes, and mature queens were collected from established colonies, while newly mated queens were collected as they sought a suitable place to start their colony. Newly mated queens therefore have no associated colony (shown as NA).
YEAR: samples were collected in 2014 and in 2018-2021.
BATCH: In order to have an adequate sample size for all supergene genotypes in all castes, we added data incrementally across years. Differences in extraction protocols and variation among sequencing lanes caused a batch effect. The batch origin is provided for each sample. See also Table S2 and Figure S4 A-B in the main paper.
GENOTYPES CHR3: the supergene on chromosome 3 harbors 4 haplotypes (MA, MD, P1, P2). In our dataset, we detected all 10 possible genotype combinations. Note that we report only one haplotype for each male, because males are generally haploid, with rare exceptions (2 out of 387 males are diploid). The genotypes of 4 samples could not be determined (these are shown as NA). See also Figure 1 in the main paper.
GENOTYPES CHR9: the supergene on chromosome 9 harbors 2 haplotypes (9a and 9r). We detected all the 3 possible genotypes. For haploid males, we only reported a single haplotype. The genotypes of 4 samples could not be determined (these are shown as NA). See also Figure 2 in the main paper.
HW(mm): the 9r haplotype on chromosome 9 is associated with queen miniaturization. Queens and gynes with at least one 9r copy are 16-20% smaller than queens without 9r. The 9r males are 8.6% smaller than 9a males. We measured the maximum width across the eyes in 282 gynes and queens and 373 males as a proxy for body size. We did not measure workers, or the gynes and males whose heads were not well preserved (these are shown as NA). See also Figure 4 in the main paper.
COLONY SOCIAL FORM: the haplotypes on chromosome 3 underlie colony social form, that is, whether a colony is monogyne (single-queen) or polygyne (multi-queen). To infer the social form, we selected only colonies with at least 5 diploid individuals and excluded haploid males. We used COANCESTRY 1.0.1.1075 to determine pairwise relatedness using workers and gynes (Wang estimator). We classified colonies with all pairwise relatedness estimates ≥ 0.6 as monogyne monandrous, colonies with a bimodal distribution of pairwise relatedness estimates with at least 40% ≥ 0.6 but none < 0.2 as monogyne polyandrous, and colonies with at least one pairwise relatedness estimate ≤ 0.1 as polygyne. Colonies with fewer than 5 diploid individuals, for which the social form could not be inferred, are shown as NA. See also Figure 3 and Figure S2 in the main paper.
SEX RATIO: while inspecting F. cinerea colonies during sampling, we took note of whether they exhibited a strongly skewed sex ratio, i.e. whether the colony preferentially produced gynes (gyne-producing) or males (male-producing), or both sexes (mixed). We attributed sex ratio to a total of 42 colonies (13 gyne-producing, 23 male-producing, 6 mixed). Colonies for which we do not have sufficient data on gyne and male production are referred to as NA. See also Figure S3 in the main paper.
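The relatedness thresholds listed under COLONY SOCIAL FORM can be sketched as a small decision rule. This is a hedged illustration, not the authors' script: the function name, the order in which the rules are checked, and the `"unclassified"` label for colonies that meet none of the stated criteria are assumptions, and the sketch does not test for bimodality.

```python
def classify_social_form(pairwise_relatedness, n_diploid):
    """Assign a social form from pairwise relatedness estimates of one colony.

    pairwise_relatedness: iterable of relatedness estimates between diploid nestmates.
    n_diploid: number of diploid individuals genotyped in the colony.
    """
    values = list(pairwise_relatedness)
    # Colonies with fewer than 5 diploid individuals are not classified.
    if n_diploid < 5 or not values:
        return "NA"
    lowest = min(values)
    high_fraction = sum(r >= 0.6 for r in values) / len(values)
    if lowest >= 0.6:
        return "monogyne monandrous"
    if lowest <= 0.1:
        return "polygyne"
    # Assumption: bimodality is not checked here, only the two thresholds.
    if high_fraction >= 0.4 and lowest >= 0.2:
        return "monogyne polyandrous"
    # Hypothetical label for colonies that match none of the stated rules.
    return "unclassified"


print(classify_social_form([0.70, 0.25, 0.30, 0.65, 0.72], n_diploid=5))
```

Because any estimate ≤ 0.1 is also < 0.2, the polygyne and monogyne polyandrous rules cannot both apply to the same colony.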
Sharing/Access information
- Raw Illumina sequencing reads are available at the National Center for Biotechnology Information Short Reads Archive, BioProject PRJNA966702.
Methods
Sample collection
Formica cinerea is a socially polymorphic species with a wide distribution across Europe [63]. This species nests preferentially along sand and gravel banks of rivers and open sand dunes. We collected F. cinerea workers and alates (gynes and males) from colonies in northern Italy (Aosta Valley and Piedmont) in June-July of 2014 and 2018-2021 (Table S1). Whenever possible, we sampled up to 10 gynes and males, and about 15 workers from each colony, and noted the observed sex ratio. When multiple mature queens were found within colonies, we also sampled a subset of them. During 2019-2021, we collected newly mated wingless queens that were either looking for suitable locations to start new colonies or were under stones in self-dug chambers with few or no eggs. We stored samples in 96-100% ethanol.
Library preparation
We extracted DNA from the head and thorax of workers, and only the head of gynes and males. For the 2014 and 2018-2020 samples, we used the QIAGEN DNeasy Blood & Tissue Kit with modifications described in McGuire et al. [44]. Briefly, we manually ground the tissue with sterile pestles in a 1.7 mL tube while immersed in liquid nitrogen, and left the pulverized samples overnight in buffer ATL and proteinase K at 56°C. We then used alternatively sourced spin columns (BPI-tech.com), 70% ethanol for DNA wash, and eluted the DNA in 30 µL of buffer EB. We extracted individuals collected in 2021 using the QIAamp 96 DNA QIAcube HT kit and protocol. We manually ground the ant tissues as described above, and, following the overnight digestion in buffer ATL and proteinase K, we transferred the supernatant to the QIAcube HT/QIAxtractor robot to complete the extraction. We eluted the DNA in 100 µL of buffer EB.
We sequenced all samples using a double-digest restriction site-associated DNA sequencing (RADseq) approach (protocol from Brelsford et al. [64]). We digested the DNA using restriction enzymes MseI and PstI and ligated a universal MseI adapter and uniquely barcoded PstI adapter to each sample. We then removed small DNA fragments using Serapure magnetic beads [65] or Omega magnetic beads (Omega Bio-tek, 2021) in a 0.8:1 ratio (beads:sample solution). We amplified each sample in four separate PCR reactions with indexed Illumina primers and then pooled the replicate PCR products for each sample for a final PCR cycle, with added primer and dNTP. Finally, we pooled all PCR products in a tube and did a final round of small fragment removal using the magnetic beads. Sample sizes in each batch are provided (Table S2).
Bioinformatics
We used Stacks 2.60 to demultiplex our data with default parameters [66], PEAR v0.9.10 [67] to merge paired-end reads and remove adaptor sequences, and BWA-mem2 [68] to align reads to the Formica selysi genome [18]. We called SNPs using BCFtools mpileup [69] and filtered the genotypes for a minimum read depth of 7 (--minDP), a minor allele frequency of 5% (--maf) and excluded indels (--remove-indels) and sites with over 80% missing data (--max-missing) using VCFtools 0.1.16-18 [70].
Excluding duplicated regions
Ant males are haploid, and this feature provides an opportunity to identify and omit duplicated genomic regions. Males are treated as diploid in our initial pipeline, and loci that are heterozygous in at least 5% of males are flagged for removal from the complete dataset, because these reflect variable sequences in duplicated regions instead of alternative alleles in a single region of the genome.
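The flagging rule above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the pipeline used in the paper: the input layout (a dictionary of VCF-style genotype strings per locus), the function name, and the choice to compute the heterozygous fraction over males with a called genotype are all assumptions.

```python
def flag_duplicated_loci(male_genotypes, min_het_fraction=0.05):
    """Return loci that are heterozygous in at least min_het_fraction of males.

    male_genotypes: dict mapping locus ID -> list of diploid-style genotype
    calls for males, e.g. "0/0", "0/1", "1/1", or "./." for missing data.
    """
    flagged = []
    for locus, calls in male_genotypes.items():
        # Assumption: missing calls are left out of the denominator.
        called = [g for g in calls if "." not in g]
        if not called:
            continue
        n_het = sum(1 for g in called if len(set(g.replace("|", "/").split("/"))) > 1)
        if n_het / len(called) >= min_het_fraction:
            flagged.append(locus)
    return flagged


example = {
    "locus_A": ["0/0"] * 18 + ["0/1"] * 2,   # 10% heterozygous males
    "locus_B": ["0/0"] * 10 + ["1/1"] * 10,  # no heterozygous males
    "locus_C": ["0/0"] * 39 + ["0/1"],       # 2.5% heterozygous males
    "locus_D": ["./."] * 5,                  # no called genotypes
}
flagged = flag_duplicated_loci(example)
print(flagged)
```

Flagged loci would then be removed from the complete dataset, including females and workers, since a truly haploid male cannot be heterozygous at a single-copy locus.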
Mitigating the batch effect
In order to have an adequate sample size for all supergene genotypes in all castes (particularly gynes and males, which are sampled opportunistically), we added data incrementally across years. Differences in extraction protocols and variation among sequencing lanes caused a batch effect (Figure S4 A). To mitigate this issue, we calculated Weir and Cockerham's FST between batch pairs at each locus. We then removed all SNPs showing FST values ≥ 0.3 in the comparison of at least one pair of batches (because the geographic scope of sampling was similar across years, we would not expect to find true changes in allele frequency of this magnitude) (Figure S4 B). Our final dataset comprised 15129 SNPs and 1415 individuals (Table S3). Workers, gynes, males and mature queens were collected from 172 colonies, and 95 newly mated queens were collected as they sought a suitable place to start their colony.
To assess whether polygyne Formica cinerea alates (gynes, queens and males) exhibit the reduction in size typical of polygyny syndrome [46], we measured the maximum width across the eyes in 282 gynes and queens and 373 males using a Leica DMC2900 camera mounted on a Leica S8APO at 25× magnification. We used head width because it is known to have a strong positive correlation with several body segment dimensions in Formica species [29,80], and thus serves as a good proxy for body size within caste.
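The SNP removal step can be illustrated with a short filter over per-locus FST values. This sketch assumes the pairwise Weir and Cockerham FST estimates have already been computed elsewhere; it does not compute FST itself. The input layout and the function name are hypothetical.

```python
def filter_by_batch_fst(pairwise_fst, threshold=0.3):
    """Keep SNPs whose FST is below the threshold in every batch pair.

    pairwise_fst: dict mapping SNP ID -> dict mapping (batch_a, batch_b)
    -> FST estimate, or None when no estimate is available for that pair.
    """
    kept = []
    for snp, by_pair in pairwise_fst.items():
        # Assumption: pairs without an estimate are ignored.
        values = [v for v in by_pair.values() if v is not None]
        if all(v < threshold for v in values):
            kept.append(snp)
    return kept


example_fst = {
    "snp1": {(1, 2): 0.05, (1, 3): 0.10},
    "snp2": {(1, 2): 0.31, (1, 3): 0.00},  # exceeds 0.3 in one pair
    "snp3": {(1, 2): 0.30},                # exactly at the threshold
    "snp4": {(1, 2): None, (1, 3): 0.20},
}
kept_snps = filter_by_batch_fst(example_fst)
print(kept_snps)
```

A SNP is dropped as soon as a single batch pair reaches the threshold, which matches the "at least one pair of batches" wording above.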
Sex ratio
While inspecting F. cinerea colonies during sampling, we took note of whether they exhibited a strongly skewed sex ratio, i.e. whether the colony preferentially produced gynes or males, or both sexes. We attributed the sex ratio to colonies observed with at least seven alates. Gyne-producing colonies had at least seven gynes and no more than two males, male-producing colonies had at least seven males and no more than two gynes, and mixed colonies were intermediate between the two. In total, 23 F. cinerea colonies were male-producing, 13 gyne-producing, and 6 were mixed.
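The counting rule described above can be written as a small function. This is a minimal sketch; the function name and the `"NA"` return value for colonies with fewer than seven alates are assumptions chosen to mirror the NA entries in the data table.

```python
def classify_sex_ratio(n_gynes, n_males):
    """Classify a colony's sex ratio from the number of gynes and males observed."""
    # Sex ratio is only attributed to colonies observed with at least seven alates.
    if n_gynes + n_males < 7:
        return "NA"
    if n_gynes >= 7 and n_males <= 2:
        return "gyne-producing"
    if n_males >= 7 and n_gynes <= 2:
        return "male-producing"
    # Everything in between is treated as mixed.
    return "mixed"


for counts in [(10, 0), (1, 9), (5, 5), (3, 2)]:
    print(counts, classify_sex_ratio(*counts))
```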