Comparative genomics of sex-determination-related genes reveals shared evolutionary patterns between bivalves and mammals, but not fruit flies
Data files
Oct 03, 2025 version files (1.74 GB):
- `bivalve_results.tar.gz` (987.44 MB)
- `drosophila_results.tar.gz` (200.08 MB)
- `mammal_results.tar.gz` (548.89 MB)
- `README.md` (5.86 KB)
Abstract
The molecular basis of sex determination (SD), while being extensively studied in model organisms, remains poorly understood in many animal groups. Bivalves, a diverse class of molluscs with a variety of reproductive modes, represent an ideal yet challenging clade for investigating SD and the evolution of sexual systems. However, the absence of a comprehensive framework has limited progress in this field, particularly regarding the study of sex-determination-related genes (SRGs). In this study, we performed a genome-wide sequence evolutionary analysis of the Dmrt, Sox, and Fox gene families in more than 40 bivalve species. For the first time, we provide an extensive and phylogeny-aware dataset of these SRGs, and we find support for the hypothesis that Dmrt-1L and Sox-H may act as primary sex-determining genes, by showing their high levels of sequence diversity within the bivalve genomic context. To validate our findings, we studied the same gene families in two well-characterized systems, mammals and fruit flies (genus Drosophila). In the former, we found that the male sex-determining gene Sry exhibits a pattern of amino acid sequence diversity similar to that of Dmrt-1L and Sox-H in bivalves, consistent with its role as master SD regulator. In contrast, no such pattern was observed among genes of the fruit fly SD cascade, which is controlled by a chromosomal mechanism. Overall, our findings highlight similarities in the sequence evolution of some mammal and bivalve SRGs, possibly driven by a comparable architecture of SD cascades. This work underscores once again the importance of employing a comparative approach when investigating understudied and non-model systems.
Dataset DOI: 10.5061/dryad.0cfxpnwfj
In this repository you will find the main result files generated by the analyses in the research paper:
Nicolini F, Nuzhdin S, Ghiselli F, Luchetti A, Milani L. Comparative genomics of sex-determination-related genes reveals shared evolutionary patterns between bivalves and mammals, but not fruit flies.
Visit our research group website, EVO·COM!
What you can find here
This repository contains three compressed directories, one for each analysed dataset:
- in `bivalve_results.tar.gz` you can find results for the dataset of bivalve genomes and transcriptomes;
- in `mammal_results.tar.gz` you can find results for the dataset of mammal reference genomes;
- in `drosophila_results.tar.gz` you can find results for the dataset of fruit fly reference genomes.
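Before working with the results, each archive has to be unpacked. The following is a minimal sketch using only Python's standard library; the helper names `top_level_entries` and `extract_archive` are illustrative and not part of the repository, and the internal layout of each archive (for example, whether the numbered folders sit under a single parent directory) is not stated here, so it is worth inspecting the archive first.

```python
import tarfile
from pathlib import Path, PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names stored in a .tar.gz archive."""
    names = set()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Member names use "/" separators; PurePosixPath also drops a leading "./".
            parts = PurePosixPath(member.name).parts
            if parts:
                names.add(parts[0])
    return sorted(names)


def extract_archive(archive_path, destination):
    """Extract a .tar.gz archive into `destination`, creating it if needed."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Only extract archives obtained from a source you trust.
        archive.extractall(destination)
    return destination
```

For instance, `top_level_entries("bivalve_results.tar.gz")` lists what the archive contains at its root without unpacking the full 987.44 MB, and `extract_archive("bivalve_results.tar.gz", "results")` unpacks it into a `results/` folder (a hypothetical destination name).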
Detailed file descriptions
File: bivalve_results.tar.gz
Description: this directory contains result files for the bivalve dataset:
- `01_FINAL_dataset/` contains the amino acid and nucleotide sequences of annotated genes from the bivalve dataset, after their processing;
- `02_SRG_sequences_phylotree/` contains the sequences and the ML phylogenetic trees of annotated Dmrt, Sox, and Fox genes in bivalves;
- `03_possvm_orthology/` contains the result of the possvm orthology inference in Dmrt, Sox, and Fox genes;
- `04_OrthoFinder_orthogroups/` contains some of the major result files of OrthoFinder;
- `05_decomposed_orthogroups/` contains the results of the orthogroup decomposition (with the list of the obtained decomposed orthogroups and their genes), as well as the annotation of Dmrt, Sox, and Fox decomposed orthogroups;
- `06_SRGs_occurence/` contains the presence/absence matrix of Dmrt, Sox, and Fox genes in the analysed species, as well as the annotation of possvm orthogroups;
- `07_distribution_divergence/` contains the median values of the amino acid sequence divergence per decomposed orthogroup, as well as the amino acid substitution models and other metadata;
- `08_additional_trees/` contains additional ML phylogenetic trees of Dmrt, Sox, and Fox genes, inferred to better establish the identity of certain groups;
- `09_GO_enrichment/` contains the GO-enrichment results, divided per category (Biological Process, Cellular Component, or Molecular Function) and per enrichment method (classic or elim, as implemented in topGO);
- `10_selection_analyses/` contains the results of selection analyses for both RELAX and BUSTED; files are in the JSON format, so that they can be uploaded and viewed on HyPhy Vision.
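Besides viewing them on HyPhy Vision, the JSON files in `10_selection_analyses/` can be summarised programmatically. The sketch below assumes that each file stores its likelihood-ratio test under a top-level `"test results"` object with `"LRT"` and `"p-value"` fields, as HyPhy's RELAX and BUSTED outputs commonly do; this should be checked against the actual files, whose names and subfolder layout are not described here. The function name `collect_test_results` is illustrative.

```python
import json
from pathlib import Path


def collect_test_results(results_dir):
    """Map each JSON file (path relative to results_dir) to its (LRT, p-value) pair."""
    results_dir = Path(results_dir)
    summary = {}
    for json_path in sorted(results_dir.rglob("*.json")):
        with open(json_path) as handle:
            result = json.load(handle)
        # Assumption: the test statistics live under the "test results" key.
        test = result.get("test results", {})
        key = json_path.relative_to(results_dir).as_posix()
        # Missing fields are reported as None rather than raising an error.
        summary[key] = (test.get("LRT"), test.get("p-value"))
    return summary
```

Files that lack the assumed fields show up with `None` values, which makes it easy to spot outputs that follow a different schema.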
File: mammal_results.tar.gz
Description: this directory contains result files for the mammal dataset:
- `01_FINAL_dataset/` contains the amino acid and nucleotide sequences of annotated genes from the mammal dataset, after their processing;
- `02_SRG_sequences_phylotree/` contains the sequences and the ML phylogenetic trees of annotated Dmrt, Sox, and Fox genes in mammals;
- `03_possvm_orthology/` contains the result of the possvm orthology inference in Dmrt, Sox, and Fox genes;
- `04_OrthoFinder_orthogroups/` contains some of the major result files of OrthoFinder;
- `05_decomposed_orthogroups/` contains the results of the orthogroup decomposition (with the list of the obtained decomposed orthogroups and their genes), as well as the annotation of Dmrt, Sox, and Fox decomposed orthogroups;
- `06_SRGs_occurence/` contains the presence/absence matrix of Dmrt, Sox, and Fox genes in the analysed species, as well as the annotation of possvm orthogroups;
- `07_distribution_divergence/` contains the median values of the amino acid sequence divergence per decomposed orthogroup, as well as the amino acid substitution models and other metadata;
- `08_GO_enrichment/` contains the GO-enrichment results, divided per category (Biological Process, Cellular Component, or Molecular Function) and per enrichment method (classic or elim, as implemented in topGO).
File: drosophila_results.tar.gz
Description: this directory contains result files for the fruit fly dataset:
- `01_FINAL_dataset/` contains the amino acid and nucleotide sequences of annotated genes from the fruit fly dataset, after their processing;
- `02_SRG_sequences_phylotree/` contains the sequences and the ML phylogenetic trees of annotated Dmrt, Sox, and Fox genes in fruit flies;
- `03_possvm_orthology/` contains the result of the possvm orthology inference in Dmrt, Sox, and Fox genes;
- `04_OrthoFinder_orthogroups/` contains some of the major result files of OrthoFinder;
- `05_decomposed_orthogroups/` contains the results of the orthogroup decomposition (with the list of the obtained decomposed orthogroups and their genes), as well as the annotation of Dmrt, Sox, and Fox decomposed orthogroups;
- `06_SRGs_occurence/` contains the presence/absence matrix of Dmrt, Sox, and Fox genes in the analysed species, as well as the annotation of possvm orthogroups;
- `07_distribution_divergence/` contains the median values of the amino acid sequence divergence per decomposed orthogroup, as well as the amino acid substitution models and other metadata;
- `08_GO_enrichment/` contains the GO-enrichment results, divided per category (Biological Process, Cellular Component, or Molecular Function) and per enrichment method (classic or elim, as implemented in topGO).
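The three datasets share the first seven subdirectories, while the bivalve dataset has additional tree and selection-analysis folders. As a quick sanity check after unpacking, the sketch below compares an extracted folder against the subdirectory names listed above. The names `EXPECTED` and `missing_subdirs` are illustrative, and `root` is assumed to be the folder that directly contains the numbered subdirectories.

```python
from pathlib import Path

# Subdirectories described for all three datasets.
COMMON_DIRS = [
    "01_FINAL_dataset",
    "02_SRG_sequences_phylotree",
    "03_possvm_orthology",
    "04_OrthoFinder_orthogroups",
    "05_decomposed_orthogroups",
    "06_SRGs_occurence",
    "07_distribution_divergence",
]

# Dataset-specific subdirectories, as listed in the file descriptions.
EXPECTED = {
    "bivalve": COMMON_DIRS
    + ["08_additional_trees", "09_GO_enrichment", "10_selection_analyses"],
    "mammal": COMMON_DIRS + ["08_GO_enrichment"],
    "drosophila": COMMON_DIRS + ["08_GO_enrichment"],
}


def missing_subdirs(root, dataset):
    """Return the expected subdirectories of `dataset` that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED[dataset] if not (root / name).is_dir()]
```

An empty list means that the extracted folder matches the documented layout for that dataset.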
Code
Documented code and data can be accessed in the GitHub repository: github.com/filonico/bivalvia_SRGs
