Supporting data for: Gene-rich UV sex chromosomes harbor conserved regulators of sexual development (Carey et al., 2021)
Data files
Sep 10, 2021 version files 3.60 GB
- all_cds.fa.gz
- all_pep_files_for_orthofinder.tar.gz
- all_pep.fa.gz
- liverwort_trinity_assemblies.tar.gz
- moss_trinity_assemblies.tar.gz
- Orthogroups.txt
- sexlinked_liverwort_alignments.tar.gz
- sexlinked_liverwort_trees.tar.gz
- sexlinked_moss_alignments.tar.gz
- sexlinked_moss_trees.tar.gz
Abstract
Non-recombining sex chromosomes, like the mammalian Y, often lose genes and accumulate transposable elements, a process termed degeneration. The correlation between suppressed recombination and degeneration is clear in animal XY systems, but the absence of recombination is confounded with other asymmetries between the X and Y. In contrast, UV sex chromosomes, like those found in bryophytes, experience symmetrical population genetic conditions. Here we generate and use nearly gapless female and male chromosome-scale reference genomes of the moss Ceratodon purpureus to test for degeneration in the bryophyte UV sex chromosome system. We show the moss sex chromosomes evolved over 300 million years ago and expanded via two chromosomal fusions. Although the sex chromosomes show signs of weaker purifying selection than autosomes, we find suppressed recombination alone is insufficient to drive gene loss on sex-specific chromosomes. Instead, the U and V sex chromosomes harbor thousands of broadly-expressed genes, including numerous key regulators of sexual development across land plants.
Methods
All methods can be found in the Materials and Methods or Supplementary Materials and Methods sections of Carey et al., 2021.
Usage notes
- novaseq_FASTQ_de_interlacer.pl -- splits paired-end Illumina NovaSeq data into forward and reverse files
- liverwort_trinity_assemblies.tar.gz -- contains all de novo Trinity assemblies for liverworts used in this study
- moss_trinity_assemblies.tar.gz -- contains all de novo Trinity assemblies for mosses used in this study
- all_pep_files_for_orthofinder.tar.gz -- all peptide files for all species used in the OrthoFinder run in this study
- Orthogroups.txt -- all orthogroups identified by OrthoFinder clustering
- orthogroup_filter.pl -- perl script to filter orthogroups ("clusters") output by OrthoFinder for a minimum number of species
- all_cds.fa.gz and all_pep.fa.gz -- fasta files containing all cds and peptide sequences, respectively, for all species combined; used to write a fasta file for each OrthoFinder gene cluster
- fasta_from_OrthoFinder.pl -- perl script to write a separate fasta file for each orthogroup ("cluster") output by OrthoFinder
- alignment_length_filter.pl -- perl script to filter fasta files by a user-specified minimum number of nucleotides or amino acids
- sexlinked_liverwort_alignments.tar.gz -- final, filtered cds alignments used to build gene trees of sex-linked genes in Marchantia polymorpha
- sexlinked_moss_alignments.tar.gz -- final, filtered cds alignments used to build gene trees of sex-linked genes in Ceratodon purpureus
- sexlinked_liverwort_trees.tar.gz -- RAxML gene trees with bootstrap support of sex-linked genes in Marchantia polymorpha
- sexlinked_moss_trees.tar.gz -- RAxML gene trees with bootstrap support of sex-linked genes in Ceratodon purpureus
- edlwtre2.pl -- perl script that roots gene trees and reduces isoforms of the same sample (within a clade) down to the longest isoform
- physco_outgroup.py -- python script that uses ETE3 to identify C. purpureus sex-linked genes and the closest Physcomitrium patens outgroup
- prune_tree.py -- python script that uses ETE3 to identify C. purpureus sex-linked genes and prune at the closest Physcomitrium patens outgroup. The script also randomly selects one isoform/homolog for each other species in the tree
- array_hash_extractor_fasta_unlock_tree_mod.pl -- perl script that filters the original fasta file for those left after prune_tree.py
- paml_header_prep.pl -- perl script for prepping the headers in gene trees and fasta files for PAML
- paml_tree_prep.pl -- perl script for generating alternatively labeled trees for PAML, to test whether sex-linked genes evolve differently than autosomal genes
- paml_bash.sh -- bash script to run PAML on multiple genes and report the results of dN, dS, and dN/dS for C. purpureus sex-linked genes
- paml_AIC.pl -- perl script necessary to run PAML in paml_bash.sh
- array_hash_extractor_fasta_unlock_ks.pl -- perl script that searches multiple alignments for a user-supplied list of C. purpureus one-to-one orthologous UV genes. The output is an individual alignment for each pair of U- and V-linked orthologous genes
- aln_to_axt.pl -- perl script that converts an alignment of one-to-one UV genes into axt format for KaKs Calculator
- ceratodon_genome_plots.R -- R script for generating gene tree plots, density plots, Ks on UV chromosome plot, codon metrics and dN/dS plots, and gene expression heatmaps
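
As an illustration of the orthogroup filtering step described for orthogroup_filter.pl, the following is a minimal Python sketch, not the authors' Perl script. It reads lines in the Orthogroups.txt layout written by OrthoFinder ("OG0000000: gene1 gene2 ...") and keeps orthogroups that contain at least a minimum number of species. The species labels (Cpur, Ppat, Mpol), the "species|gene" identifier convention, and the function names are all assumptions made for the example; the sequence IDs in this dataset may follow a different scheme, in which case species_of would need to be adapted.

```python
from typing import Callable, Dict, Iterable, List


def parse_orthogroups(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Parse lines of the form 'OG0000001: geneA geneB' into a dict."""
    groups: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, _, members = line.partition(":")
        groups[name.strip()] = members.split()
    return groups


def species_of(seq_id: str) -> str:
    # ASSUMPTION: the species label is the text before the first '|'.
    # The real sequence IDs in this dataset may be formatted differently.
    return seq_id.split("|", 1)[0]


def filter_by_min_species(
    groups: Dict[str, List[str]],
    min_species: int,
    species_fn: Callable[[str], str] = species_of,
) -> Dict[str, List[str]]:
    """Keep orthogroups whose members span at least min_species species."""
    kept: Dict[str, List[str]] = {}
    for name, members in groups.items():
        n_species = len({species_fn(m) for m in members})
        if n_species >= min_species:
            kept[name] = members
    return kept


# Hypothetical example records, not taken from the dataset.
example = [
    "OG0000000: Cpur|g1 Cpur|g2 Ppat|g7 Mpol|g3",
    "OG0000001: Cpur|g4 Cpur|g5",
    "OG0000002: Ppat|g9 Mpol|g8",
]
groups = parse_orthogroups(example)
kept = filter_by_min_species(groups, min_species=3)
print(sorted(kept))
```

To apply the same sketch to the real file, the lines could be read with open("Orthogroups.txt") and passed to parse_orthogroups. The species threshold here is arbitrary; the value used in the study is given in the paper's methods.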