Variation at an adhesin locus suggests sociality in natural populations of the yeast Saccharomyces cerevisiae

Cite this dataset

Murphy, Helen; Oppler, Zachary; Parrish, Meadow (2019). Variation at an adhesin locus suggests sociality in natural populations of the yeast Saccharomyces cerevisiae [Dataset]. Dryad.


Microbes engage in numerous social behaviors that are critical for survival and reproduction, and that require individuals to act as a collective. Various mechanisms ensure that collectives are composed of related, cooperating cells, thus allowing for the evolution and stability of these traits, and for selection to favor traits beneficial to the collective. Since microbes are difficult to observe directly, sociality in natural populations can instead be investigated using evolutionary genetic signatures, as social loci can be evolutionary hotspots. The budding yeast has been studied for over a century, yet little is known about its social behavior in nature. Flo11 is a highly regulated cell adhesin required for most lab social phenotypes; studies suggest it may function in cell recognition and its heterogeneous expression may be adaptive for collectives such as biofilms. We investigated this locus and found positive selection in the areas implicated in cell-cell interaction, suggesting selection for kin discrimination. We also found balancing selection at an upstream activation site, suggesting selection on the level of variegated gene expression. Our results suggest this model yeast is surprisingly social in natural environments and is likely engaging in various forms of sociality. By using genomic data, this research provides a glimpse of otherwise unobservable interactions.

Usage notes

FLO11 Sequence Data, Length Variation, and Social Scores for Saccharomyces cerevisiae:

These files contain curated sequence data for the FLO11 locus from the 78 environmental isolates of the budding yeast Saccharomyces cerevisiae. The locus was sequenced using next-generation sequencing technology, and the reads were processed to create de novo assemblies. The FASTA files contain the inferred alleles and separate the data as follows: (1) the upstream regulatory region, (2) the downstream regulatory region, (3) the A-domain alone, (4) the C-domain alone, and (5) the A-part of B-C domains concatenated. The CSV file contains the length of the B domain (as determined by fragment analysis), the allele identity of the regulatory region, and raw social/biofilm scores.
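
As a starting point for working with these files, the following is a minimal sketch in Python (standard library only) of how the FASTA alleles and the CSV table might be loaded. The file names, record headers, and column names used in the dataset are not listed here, so the in-memory example records and the names `read_fasta` and `read_csv_rows` are hypothetical placeholders, not part of the dataset.

```python
import csv
import io


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {header: sequence}."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def read_csv_rows(handle):
    """Read a CSV with a header row into a list of dicts (values stay strings)."""
    return list(csv.DictReader(handle))


# Hypothetical records standing in for one of the FASTA files.
example_fasta = io.StringIO(">isolate_1\nACGT\nTTGA\n>isolate_2\nACGTTTGC\n")
alleles = read_fasta(example_fasta)

# Sequence length per record, e.g. as a quick sanity check after loading.
allele_lengths = {name: len(seq) for name, seq in alleles.items()}
```

For the real files, an opened file can be passed in place of the `io.StringIO` object, for example `with open(path, newline="") as fh: rows = read_csv_rows(fh)`, where `path` points to the downloaded CSV. The column names should be checked against the file's header row before any values are converted to numbers.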