Patterns of gene family evolution and selection across Daphnia
Data files
Sep 19, 2025 version files (1.40 MB total)
- CAFE_k3p_Gamma_report.cafe (1.35 MB)
- OG0002427_hsc.trans.clip.fa (4.94 KB)
- OG0003195_oxoglutarate.trans.clip.fa (11.68 KB)
- OG0005474_centrosomal.trans.clip.fa (24.14 KB)
- OG0007254_defense.trans.clip.fa (3.66 KB)
- OG0009038_glycoprotein.trans.clip.fa (2.83 KB)
- README.md (2.32 KB)
Abstract
Gene family expansion underlies a host of biological innovations across the tree of life. Understanding why specific gene families expand or contract requires comparative genomic investigations, clarifying further how species adapt in the wild. This study investigates the gene family change dynamics within several species of Daphnia, a group of freshwater microcrustaceans that are useful model systems for evolutionary genetics. We employ comparative genomics approaches to understand the forces driving gene evolution and draw upon candidate gene families that change gene numbers across Daphnia. Our results suggest that genes related to stress responses and glycoproteins generally expand across taxa, and we investigate evolutionary hypotheses of adaptation that may underpin expansions. Through these analyses, we shed light on the interplay between gene expansions and selection within other ecologically relevant stress response gene families. While we show generalities in gene family turnover in genes related to stress response (i.e., DNA repair mechanisms), most gene family evolution is driven in a species-specific manner. Additionally, while we show general trends towards positive selection within some expanding gene families, many genes are not undergoing selection, highlighting the complex nature of diversification and evolution within Daphnia. Our research enhances the understanding of individual gene family evolution within Daphnia and provides a case study of ecologically relevant genes prone to change.
Dataset DOI: 10.5061/dryad.gqnk98t02
Description of the data and file structure
CAFE5 report of the significant gene family evolution:
- CAFE_k3p_Gamma_report.cafe

Aligned protein family codons for recreating figures:
- OG0002427_hsc.trans.clip.fa
  - This fasta contains the sequences for the "hsc70-interacting" protein family.
- OG0003195_oxoglutarate.trans.clip.fa
  - This fasta contains the sequences for the "H2-oxoglutarate and iron-dependent oxygenase JMJD4-like" protein family.
- OG0005474_centrosomal.trans.clip.fa
  - This fasta contains the sequences for the "centrosomal protein of 164 kDa-like" protein family.
- OG0007254_defense.trans.clip.fa
  - This fasta contains the sequences for the "putative defense protein 3" protein family.
- OG0009038_glycoprotein.trans.clip.fa
  - This fasta contains the sequences for the "Glycoprotein-N-acetylgalactosamine 3-beta-galactosyltransferase" protein family.
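For readers who want to sanity-check one of the codon alignments before recreating a figure, the following is a minimal sketch in plain Python. It assumes the `.fa` files are standard FASTA text in which every sequence has the same aligned length and that length is a multiple of three; the helper names `read_fasta` and `summarize_alignment` are hypothetical and are not part of the deposited dataset or the authors' pipeline.

```python
from pathlib import Path


def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Start a new record; the header is everything after ">".
            header = line[1:]
            records[header] = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them later.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def summarize_alignment(records):
    """Summarize sequence count and basic codon-alignment properties."""
    lengths = {len(seq) for seq in records.values()}
    return {
        "n_sequences": len(records),
        "lengths": sorted(lengths),
        # An alignment should have one shared length across all sequences.
        "equal_length": len(lengths) == 1,
        # A codon alignment should have a length divisible by three.
        "codon_multiple": all(n % 3 == 0 for n in lengths),
    }


if __name__ == "__main__":
    # Assumed local path: the file must be downloaded from the dataset first.
    path = Path("OG0007254_defense.trans.clip.fa")
    if path.exists():
        print(summarize_alignment(read_fasta(path.read_text())))
```

If `equal_length` or `codon_multiple` comes back false, the file is probably not the clipped codon alignment this sketch assumes, and downstream selection analyses should not be run on it unchanged.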
Code/software
See the methodology section of the associated article.
Access information
Daphnia whole-genome dataset. Chromosome- and scaffold-level assemblies of seven species from the genus *Daphnia* were collected from the NCBI Genome search engine (https://www.ncbi.nlm.nih.gov/datasets/genome/), accessed in January 2025 (Kitts et al., 2016). We chose North American D. pulex (KAP4; RefSeq: GCF_021134715.1), European D. pulex (D84A; GenBank: GCA_023526725; Barnard-Kubow et al., 2022), North American D. pulicaria (RefSeq: GCF_021234035.1; Wersebe et al., 2023), D. sinensis (GenBank: GCA_013167095.2; Jia et al., 2022), D. carinata (RefSeq: GCF_022539665.2), D. galeata (GenBank: GCA_030770115.1; Nickel et al., 2021), and D. magna (RefSeq: GCF_020631705.1) for analyses because they are the most complete representatives of each species, were annotated for protein-coding genes, and were the highest-quality and newest genomes available for each species. We also included Artemia franciscana (i.e., brine shrimp; RefSeq: GCF_032884065.1) and Penaeus monodon (i.e., black tiger shrimp; RefSeq: GCF_015228065.2) as crustacean outgroups for our tree-based analyses.
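The assemblies listed above can be kept in a small lookup table for scripting downloads or labelling trees. The sketch below is illustrative only: the dictionary keys and the helper `source_database` are hypothetical names, and the accessions are copied as written in the paragraph above. It relies on the NCBI convention that assembly accessions beginning with `GCF_` are RefSeq records and those beginning with `GCA_` are GenBank records.

```python
# Assembly accessions as listed in the access information above.
ASSEMBLIES = {
    "Daphnia pulex (North American, KAP4)": "GCF_021134715.1",
    "Daphnia pulex (European, D84A)": "GCA_023526725",
    "Daphnia pulicaria (North American)": "GCF_021234035.1",
    "Daphnia sinensis": "GCA_013167095.2",
    "Daphnia carinata": "GCF_022539665.2",
    "Daphnia galeata": "GCA_030770115.1",
    "Daphnia magna": "GCF_020631705.1",
    "Artemia franciscana (outgroup)": "GCF_032884065.1",
    "Penaeus monodon (outgroup)": "GCF_015228065.2",
}


def source_database(accession):
    """Infer the NCBI source database from the accession prefix."""
    prefix = accession.split("_", 1)[0]
    return {"GCF": "RefSeq", "GCA": "GenBank"}.get(prefix, "unknown")


if __name__ == "__main__":
    for name, accession in ASSEMBLIES.items():
        print(f"{name}\t{accession}\t{source_database(accession)}")
```

Deriving the database from the prefix, rather than storing it by hand, avoids mismatches between a label and its accession.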
