Sex chromosome turnover in African annual killifishes of the genus Nothobranchius
Data files
Version of Aug 11, 2025; total size of files: 8.81 GB
female_poolseq_amhr2_reads_sorted.bam (206.64 KB)
FTHI_NFU_ragoo.fasta (983.38 MB)
Fundulosoma-thierryi-v1.0.a1.654296049b4ef-repeatmasker.repeats.fna (226.52 MB)
Fundulosoma-thierryi-v1.0.a1.654296049b4ef-repeatmasker.repeats.gff3 (142.83 MB)
Fundulosoma-thierryi-v1.0.a1.65663ba64fbb4-repeatmodeler.repeats.fna (1.09 GB)
Fundulosoma-thierryi-v1.0.a1.65663ba64fbb4-repeatmodeler.repeats.gff3 (503.17 MB)
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.CDS.fna (49.82 MB)
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.genes.fna (456.12 MB)
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.genes.gff3 (72.03 MB)
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.protein.faa (21.10 MB)
gdf6_female_NTHI_poolseq_reads.bam (269.23 KB)
GDF6_female_NTHI.vcf (9.94 KB)
gdf6_male_NTHI_poolseq_reads.bam (406.95 KB)
GDF6_male_NTHI.vcf (22.17 KB)
hsd17b_female_NTHI_poolseq_reads.bam (289.24 KB)
HSD17b_female_NTHI.vcf (9.09 KB)
hsd17b_male_NTHI_poolseq_reads.bam (422.65 KB)
HSD17b_male_NTHI.vcf (9.17 KB)
male_poolseq_amhr2_reads_sorted.bam (220.27 KB)
Ngu_AMHR2_XY_alleles.fas (14.78 KB)
NGU_female_poolseq_amhr2_extended.vcf (8.69 KB)
NGU_male_poolseq_amhr2_extended_curated.vcf (14.66 KB)
NGU_male_poolseq_amhr2_extended.vcf (14.67 KB)
NGU_NFU_ragoo.fasta (939.69 MB)
Ngue_Flye_run2-2.8_assembly.fasta (955.25 MB)
Nothobranchius-guenterii-v1.0.a1.654b88373daeb-repeatmodeler.repeats.fna (984.48 MB)
Nothobranchius-guenterii-v1.0.a1.654b88373daeb-repeatmodeler.repeats.gff3 (432.81 MB)
Nothobranchius-guenterii-v1.0.a1.654b884662ddd-repeatmasker.repeats.fna (185.49 MB)
Nothobranchius-guenterii-v1.0.a1.654b884662ddd-repeatmasker.repeats.gff3 (123.05 MB)
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.CDS.fna (80.70 MB)
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.genes.fna (410.66 MB)
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.genes.gff3 (125.91 MB)
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.protein.faa (38.96 MB)
Nthi_Corr-Uncor_Strict_Meta_Min5000_assembly_Medaka_Long_Next_Short_Purged.fasta (983.26 MB)
Nthi_GDF6_XY_alleles.fas (11.96 KB)
README.md (5.88 KB)
Abstract
Sex chromosomes of teleost fishes undergo frequent turnovers. Annual Nothobranchius killifishes provide a suitable system to study sex chromosome turnover as they comprise the XY sex chromosome system in the model turquoise killifish, N. furzeri, and X1X2Y multiple sex chromosomes in six other representatives scattered across the Nothobranchius phylogeny, nested within species without cytologically detectable sex chromosomes. We combined molecular cytogenetics and genomic analyses to examine the X1X2Y systems in four Nothobranchius spp. and their outgroup Fundulosoma thierryi. Fluorescence in situ hybridization with painting probes specific for three sex chromosome systems and N. furzeri bacterial artificial chromosomes (BAC) bearing orthologues of eight genes repeatedly co-opted as master sex determining (MSD) genes in fishes suggests at least four independent origins of sex chromosomes in the genus Nothobranchius. The synteny block carrying the amhr2 gene was shared by the X1X2Y systems of N. brieni, N. guentheri, and N. lourensi, which, however, differ in their fused autosomes. The gdf6 gene is sex-linked in F. thierryi. None of the mapped MSD gene candidates was sex-linked in N. ditte. We further sequenced the F. thierryi and N. guentheri genomes and analysed male and female Pool-seq and coverage data to determine their non-recombining regions and their differentiation. The level of sex chromosome differentiation was low in F. thierryi, but we identified two distinct sex-linked evolutionary strata in N. guentheri. While the amhr2 gene represents a candidate for MSD in N. guentheri, its localization in the younger stratum and its low allelic variation question its role in sex determination in a common ancestor of N. brieni, N. guentheri, and N. lourensi. Recombination cold spots such as fusion breakpoints could have contributed to the formation of sex chromosome evolutionary strata.
README for the Dryad repository: https://doi.org/10.1111/mec.70029
Study:
Hospodářská M, Mora P, Chung Voleníková A, Al-Rikabi A, Altmanová M, Simanovsky SA, Tolar N, Pavlica T, Janečková K, Štundlová J, Bobryshava K, Jankásek M, Hiřman M, Liehr T, Reichard M, Krysanov EY, Ráb P, Englert C, Nguyen P, Sember A (2025) Multiple Origins of Sex Chromosomes in Nothobranchius Killifishes. Molecular Ecology, 34:e70029. https://doi.org/10.1111/mec.70029
Last updated: August 11, 2025
Data provided
Fundulosoma thierryi
Genome assembly
Nthi_Corr-Uncor_Strict_Meta_Min5000_assembly_Medaka_Long_Next_Short_Purged.fasta: FASTA format of the Fundulosoma thierryi female genome assembly.
FTHI_NFU_ragoo.fasta: FASTA of Fundulosoma thierryi pseudochromosomes scaffolded by RaGOO using Nothobranchius furzeri as reference.
Gene annotation
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.CDS.fna: Nucleotide sequences of all CDS features annotated on the Fundulosoma thierryi assembly.
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.genes.fna: Gene models annotated on the Fundulosoma thierryi assembly.
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.genes.gff3: Annotation of genomic features (GFF3) in the Fundulosoma thierryi genome assembly (BRAKER v2.1.5 via GenSAS v6.0).
Fundulosoma-thierryi-v1.0.a1.65cb8ad34d6d8-publish.protein.faa: FASTA sequences of protein products annotated on the Fundulosoma thierryi assembly.
Repeat annotation
Fundulosoma-thierryi-v1.0.a1.65663ba64fbb4-repeatmodeler.repeats.gff3: Generic Feature Format Version 3 (GFF3) of transposable elements annotated in the F. thierryi genome.
Fundulosoma-thierryi-v1.0.a1.65663ba64fbb4-repeatmodeler.repeats.fna: FASTA format of the annotated F. thierryi transposable elements.
Fundulosoma-thierryi-v1.0.a1.654296049b4ef-repeatmasker.repeats.gff3: Generic Feature Format Version 3 (GFF3) of tandem repeats annotated in the F. thierryi genome.
Fundulosoma-thierryi-v1.0.a1.654296049b4ef-repeatmasker.repeats.fna: FASTA format of the annotated F. thierryi tandem repeats.
Gene-specific data
Nthi_GDF6_XY_alleles.fas: Manually curated sequences of X- and Y-linked alleles of the Fundulosoma thierryi gdf6 gene.
gdf6_female_NTHI_poolseq_reads.bam: gdf6 gene alignment file of female Pool-seq data (F. thierryi) mapped to pseudochromosomes (RaGOO, N. furzeri reference).
gdf6_male_NTHI_poolseq_reads.bam: gdf6 gene alignment file of male Pool-seq data (F. thierryi) mapped to pseudochromosomes (RaGOO, N. furzeri reference).
hsd17b_female_NTHI_poolseq_reads.bam: hsd17b gene alignment file of female Pool-seq data (F. thierryi) mapped similarly.
hsd17b_male_NTHI_poolseq_reads.bam: hsd17b gene alignment file of male Pool-seq data (F. thierryi) mapped similarly.
GDF6_female_NTHI.vcf: Fundulosoma thierryi gdf6 female variant calls (freebayes).
GDF6_male_NTHI.vcf: Fundulosoma thierryi gdf6 male variant calls (freebayes).
HSD17b_female_NTHI.vcf: Fundulosoma thierryi hsd17b female variant calls (freebayes).
HSD17b_male_NTHI.vcf: Fundulosoma thierryi hsd17b male variant calls (freebayes).
Nothobranchius guentheri
Genome assembly
Ngue_Flye_run2-2.8_assembly.fasta: FASTA of the Nothobranchius guentheri male genome assembly.
NGU_NFU_ragoo.fasta: Nothobranchius guentheri pseudochromosomes scaffolded by RaGOO using Nothobranchius furzeri as reference.
Gene annotation
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.CDS.fna: Nucleotide sequences of all CDS features annotated on the Nothobranchius guentheri assembly.
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.genes.fna: Gene models annotated on the Nothobranchius guentheri assembly.
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.genes.gff3: Annotation of genomic features (GFF3) in the Nothobranchius guentheri genome assembly (BRAKER v2.1.5 via GenSAS v6.0).
Nothobranchius-guentheri-v1.0.a1.66266468babb3-publish.protein.faa: FASTA sequences of protein products annotated on the Nothobranchius guentheri assembly.
Repeat annotation
Nothobranchius-guenterii-v1.0.a1.654b88373daeb-repeatmodeler.repeats.gff3: Generic Feature Format Version 3 (GFF3) of transposable elements annotated in the N. guentheri genome.
Nothobranchius-guenterii-v1.0.a1.654b88373daeb-repeatmodeler.repeats.fna: FASTA format of the annotated N. guentheri transposable elements.
Nothobranchius-guenterii-v1.0.a1.654b884662ddd-repeatmasker.repeats.gff3: Generic Feature Format Version 3 (GFF3) of tandem repeats annotated in the N. guentheri genome.
Nothobranchius-guenterii-v1.0.a1.654b884662ddd-repeatmasker.repeats.fna: FASTA format of the annotated N. guentheri tandem repeats.
Gene-specific data
Ngu_AMHR2_XY_alleles.fas: Manually curated sequences of X- and Y-linked alleles of the Nothobranchius guentheri amhr2 gene.
NGU_female_poolseq_amhr2_extended.vcf: Nothobranchius guentheri amhr2 female variant calls (freebayes).
NGU_male_poolseq_amhr2_extended.vcf: Nothobranchius guentheri amhr2 male variant calls (freebayes).
NGU_male_poolseq_amhr2_extended_curated.vcf: Nothobranchius guentheri amhr2 male variant calls (freebayes), manually curated, with unsupported variants assigned QUAL 0.
male_poolseq_amhr2_reads_sorted.bam: amhr2 gene alignment file of male Pool-seq data (N. guentheri) mapped to pseudochromosomes (RaGOO, N. furzeri reference).
female_poolseq_amhr2_reads_sorted.bam: amhr2 gene alignment file of female Pool-seq data (N. guentheri) mapped to pseudochromosomes (RaGOO, N. furzeri reference).
Genome de novo assembly and its quality evaluation
The F. thierryi long reads were first corrected using the Illumina short reads. Several settings of Flye v2.8 (Kolmogorov et al. 2019) were then tested for genome assembly, with the "--nano-raw" and "--min-overlap 5000" options giving the best assembly in terms of contiguity and BUSCO (Benchmarking Universal Single-Copy Orthologs) completeness. The assembly underwent one round of long-read polishing via medaka (https://github.com/nanoporetech/medaka) followed by one round of short-read polishing with Illumina reads using NextPolish (Hu et al. 2020). Haplotypic duplicates were then removed using purge_dups v1.0.1 (Guan et al. 2020). Assembly quality was assessed using BUSCO v5 with the Actinopterygii database (odb_10) (Manni et al. 2021). Contamination checks were performed using BlobTools v1.0 (Laetsch & Blaxter 2017), and any contigs associated with non-target organisms were removed using the "seqfilter" function. The best N. guentheri assembly was obtained using Flye v2.8 (Kolmogorov et al. 2019) with the "--pacbio-raw" option, without a defined genome size, and without further polishing or de-duplication. Assembly quality was again verified using BUSCO v5 with the Actinopterygii database (odb_10).
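The README does not say which contiguity statistic was used to compare candidate assemblies. As an illustration only, the sketch below assumes N50 computed from contig lengths in a FASTA file; the function names `read_fasta_lengths` and `n50` are hypothetical and are not part of the deposited workflow.

```python
from typing import Dict, Iterable


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: sequence length} from FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input


if __name__ == "__main__":
    demo = [">contig1 demo", "ACGTAC", ">contig2", "GG"]
    print(n50(read_fasta_lengths(demo).values()))
```

To compare the deposited assemblies, the same functions could be applied to an open file handle, for example `read_fasta_lengths(open("Ngue_Flye_run2-2.8_assembly.fasta"))`.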
Genome annotation
Functional and structural annotations were conducted using the GenSAS v6.0 pipeline (Humann et al. 2019). Repetitive sequences were identified using RepeatModeler2 (Flynn et al. 2020) with the RMBlast search engine and modules including TRF v4.09, RECON, and RepeatScout v1.0.5. Additionally, TAREAN (Novák et al. 2017) was used for satDNA annotation. All consensus sequences marked as satellites by TAREAN, regardless of confidence level, were added as dimers to a custom database to improve the satellite DNA annotation. RepeatMasker v4.1.1 (Smit et al. 2013–2015, available at http://www.repeatmasker.org) with the NCBI/RMBlast search engine was used for repeat annotation, combining the newly identified repeats from RepeatModeler2 with the custom database of satDNA sequences from TAREAN.
RNA-seq data were used for N. guentheri genome annotation. To that end, total RNA was extracted from brain tissue dissected from N. guentheri. The 150 bp Illumina reads were then aligned to the genome using STAR v2.7.7 (Dobin & Gingeras 2015). The genome index required for mapping was generated using the following command: STAR --runThreadN 9 --runMode genomeGenerate --genomeDir ./genomedir --genomeFastaFiles <genome> --genomeSAindexNbases 13. Following index generation, mapping was executed using the command: STAR --runThreadN 9 --genomeDir ./genomedir --readFilesIn ./Forward.fq ./Reverse.fq. The resulting SAM file was converted to BAM format using the SAMtools suite (v1.11) (Li et al. 2009). The BAM file was then used for gene prediction with BRAKER2 under default settings, which incorporates Augustus and GeneMark-EP (Lomsadze et al. 2014; Brůna et al. 2021). For the annotation of the F. thierryi genome assembly, Augustus v3.3.1 (Stanke & Morgenstern 2005) and GeneMark-ES were used directly, without RNA-seq evidence to guide the process. Furthermore, tRNA and rRNA sequences were identified in all assemblies using tRNAscan-SE v2.0.7 (Chan & Lowe 2019) and RNAmmer v1.2 (Lagesen et al. 2007), respectively.
Reference guided scaffolding
To obtain pseudochromosome-level assemblies of both F. thierryi and N. guentheri, we aligned the genome assembly contigs to the chromosomes of the N. furzeri reference (GenBank acc. no. GCA_014300015.1) using Minimap2 v2.17 (Li 2018) and sorted and oriented the contigs into pseudomolecules using RaGOO (Alonge et al. 2019) with the “-b -s” options.
Pool-seq analysis
To identify male-specific regions of the Y chromosome (MSY), we generated pooled samples of genomic DNA from N. guentheri (28 males, 29 females) and F. thierryi (20 males, 11 females) and sequenced them in paired-end mode on the Illumina platform. The Illumina pools were quality checked with FastQC v0.11.5 (Andrews 2010), filtered with the “--nextseq-trim=20 --minimum-length=100” options using cutadapt v1.15 (Martin 2011), and trimmed with Trimmomatic v0.36 (Bolger et al. 2014) with the following parameters: “SLIDINGWINDOW:4:25 MINLEN:100 HEADCROP:4 CROP:140”. The trimmed and filtered reads from the female and male pools were mapped separately to the reference genomes using BWA-MEM v0.7.17 (Li & Durbin 2009) with default parameters. Using Picard Toolkit (https://broadinstitute.github.io/picard/), we sorted the resulting BAM files by coordinate (“SortSam SORT_ORDER=coordinate”) and removed PCR duplicates (“MarkDuplicates REMOVE_DUPLICATES=true REMOVE_SEQUENCING_DUPLICATES=true”). Subsequently, we generated a file with the nucleotide composition of all genomic positions using the pileup function of the software Pooled Sequencing Analysis for Sex Signal (PSASS; Feron & Jaron 2021). We used PSASS to identify non-overlapping 50 kb windows enriched in sex-specific SNPs, using the following parameters: “--min-depth 10 --freq-het 0.5 --range-het 0.15 --freq-hom 1 --range-hom 0.05 --window-size 50000 --output-resolution 50000 --group-snps”.
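The thresholds above can be read as a per-position rule: an allele is treated as sex-specific when its frequency is close to 0.5 in one pool and close to 1 in the other, provided both pools reach the minimum depth. The sketch below is one interpretation of that rule, not PSASS code; the function names are hypothetical, and the inclusive boundary handling is an assumption.

```python
from typing import Optional


def looks_het_and_hom(count_a: int, depth_a: int, count_b: int, depth_b: int,
                      min_depth: int = 10,
                      freq_het: float = 0.5, range_het: float = 0.15,
                      freq_hom: float = 1.0, range_hom: float = 0.05) -> bool:
    """True if the allele looks heterozygous in pool A and fixed in pool B.

    Defaults mirror the parameters quoted in the README; inclusive bounds
    are an assumption of this sketch.
    """
    if depth_a < min_depth or depth_b < min_depth:
        return False
    freq_a = count_a / depth_a
    freq_b = count_b / depth_b
    return (abs(freq_a - freq_het) <= range_het
            and abs(freq_b - freq_hom) <= range_hom)


def classify_allele(male_count: int, male_depth: int,
                    female_count: int, female_depth: int) -> Optional[str]:
    """Return 'male', 'female', or None for one allele at one position."""
    if looks_het_and_hom(male_count, male_depth, female_count, female_depth):
        return "male"    # heterozygous-like in males, fixed in females
    if looks_het_and_hom(female_count, female_depth, male_count, male_depth):
        return "female"  # heterozygous-like in females, fixed in males
    return None


if __name__ == "__main__":
    # Allele seen in 12 of 24 male reads and 30 of 30 female reads.
    print(classify_allele(12, 24, 30, 30))
```

Counting such positions per non-overlapping 50 kb window would then give the per-window signal described in the text.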
Coverage analysis
The short Illumina reads from three male and three female samples of both F. thierryi and N. guentheri were quality checked with FastQC v0.11.5 (Andrews 2010), filtered with the “--nextseq-trim=20 --minimum-length=100” options using cutadapt v1.15 (Martin 2011), and trimmed with Trimmomatic v0.36 (Bolger et al. 2014) with the following parameters: “SLIDINGWINDOW:4:25 MINLEN:100 HEADCROP:10 CROP:130”. Repetitive sequences in the pseudochromosome-level assemblies were identified by RepeatModeler v1.0.11 (Smit et al. 2008–2015, available at http://www.repeatmasker.org) and annotated by RepeatMasker v4.0.7 (Smit et al. 2013–2015, available at http://www.repeatmasker.org) with the NCBI search engine. Filtered reads were mapped to the corresponding masked reference genome via Bowtie2 v2.2.9 (Langmead & Salzberg 2012) with the “--very-sensitive-local --no-discordant --no-mixed” parameters, and the outputs were compressed to BAM format using SAMtools view (v1.3.1; Li et al. 2009) and merged according to sex using SAMtools merge. The resulting BAM files were then parsed using utilities from the Bedtools suite v2.25.0 (Quinlan & Hall 2010). A genome file was parsed from the BAM files using SAMtools view and divided into 50 kbp windows using Bedtools makewindows with the “-w 50000 -s 50000” parameters (step equal to window size, so the windows do not overlap). The merged BAM files were sorted with SAMtools sort and converted to BED format using Bedtools bamtobed “-split”. Finally, the per-base coverage of aligned sequences within 50 kbp windows spanning the genome was computed using Bedtools coverage. In both sexes, coverage depths for each scaffold were normalized by the mean coverage across scaffolds and compared between sexes, expressed as the log2 of the male:female (M:F) coverage ratio. The resulting data, together with those from the Pool-seq analysis, were visualised using the “SexGenomicsToolkit/sgtr” R package (https://github.com/SexGenomicsToolkit/sgtr).
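The windowing and normalisation steps can be illustrated with a small sketch. It assumes per-window mean coverage values have already been extracted for each sex (for example from Bedtools coverage output); the function names are hypothetical, and returning None for windows with zero coverage in either sex is a choice made here, not something stated in the README.

```python
import math
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def make_windows(chrom_length: int, size: int = 50_000,
                 step: int = 50_000) -> List[Tuple[int, int]]:
    """Half-open (start, end) windows; the last window is cut at the chromosome end."""
    return [(start, min(start + size, chrom_length))
            for start in range(0, chrom_length, step)]


def log2_mf_ratio(male_cov: Sequence[float],
                  female_cov: Sequence[float]) -> List[Optional[float]]:
    """Per-window log2(male/female) after dividing each sex by its own mean coverage."""
    if len(male_cov) != len(female_cov):
        raise ValueError("male and female coverage vectors differ in length")
    male_mean = mean(male_cov)
    female_mean = mean(female_cov)
    if male_mean <= 0 or female_mean <= 0:
        raise ValueError("mean coverage must be positive")
    ratios: List[Optional[float]] = []
    for m, f in zip(male_cov, female_cov):
        m_norm = m / male_mean
        f_norm = f / female_mean
        if m_norm <= 0 or f_norm <= 0:
            ratios.append(None)  # log2 undefined for zero coverage
        else:
            ratios.append(math.log2(m_norm / f_norm))
    return ratios


if __name__ == "__main__":
    print(make_windows(120_000))
    print(log2_mf_ratio([20, 20, 10, 30], [10, 10, 10, 10]))
```

In an XY system, a window that is present on the X but missing or strongly diverged on the Y would be expected to show a ratio near -1, since males carry one copy and females two.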
Variant detection
To analyse allelic variants in the amhr2 gene, we mapped paired-end Illumina short-read (Bentley et al. 2008) Pool-seq data to the reference N. guentheri genome and called variants using Freebayes (Garrison & Marth 2012). Male and female Pool-seq data were trimmed and filtered using TrimGalore v0.6.2 (Krueger et al. 2023) with the following flags: “--fastqc --clip_R1 10 --clip_R2 10 --three_prime_clip_R1 10 --three_prime_clip_R2 10 --paired”. The effectiveness of the trimming step was verified using FastQC v0.11.9 (Andrews 2010). Trimmed reads were then mapped using Bowtie2 v2.3.5.1 (Langmead & Salzberg 2012) with the “--very-sensitive-local --no-mixed --no-discordant” flags. The resulting alignment files were used as input for Freebayes v9.9.2 (Garrison & Marth 2012) in default mode. By visualising the alignment files in Geneious Prime v2023.2.1, we obtained the SNP compositions for males and females.
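The deposited male and female VCF files can be compared directly to list variants called only in the male pool. The sketch below is a minimal illustration that parses VCF text with the standard library; the function name is hypothetical, and skipping records with QUAL 0 is an assumption based on the description of the curated file, in which unsupported variants were assigned QUAL 0.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, REF, ALT)


def read_variants(vcf_lines: Iterable[str], min_qual: float = 0.0) -> Set[Variant]:
    """Collect variants from VCF text lines, dropping records with QUAL <= min_qual."""
    variants: Set[Variant] = set()
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # not a complete VCF record
        chrom, pos, _id, ref, alts, qual = fields[:6]
        if qual != "." and float(qual) <= min_qual:
            continue  # e.g. variants set to QUAL 0 during manual curation
        for alt in alts.split(","):  # one entry per alternative allele
            variants.add((chrom, int(pos), ref, alt))
    return variants


if __name__ == "__main__":
    male = ["#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
            "chr13\t100\t.\tA\tG\t50\t.\t.",
            "chr13\t200\t.\tC\tT\t0\t.\t."]
    female = ["chr13\t300\t.\tG\tA\t40\t.\t."]
    print(sorted(read_variants(male) - read_variants(female)))
```

With the deposited files, the inputs would be open handles to NGU_male_poolseq_amhr2_extended_curated.vcf and NGU_female_poolseq_amhr2_extended.vcf; the set difference then gives candidate male-limited variants that still need to be checked against the alignments.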
Haplotype-specific assembly of chr 13
To separately assemble the male and female haplotypes of the putative sex-determining region on N. guentheri chromosome 13 (NguChr13), we reassembled this chromosome using the trio binning mode. The original PacBio long reads used for the N. guentheri reference genome assembly were mapped back to it using Minimap2 v2.22 (Li 2018) with the flags “-a -x map-pb”. From the resulting alignment file, only reads mapping to chr13 were selected using Samtools v1.11 (Danecek et al. 2021). The alignment file was then converted into FASTQ format using Bedtools bamtofastq with default settings (Quinlan & Hall 2010). The resulting reads were used as the sequence set for the new NguChr13 assembly performed using Canu v2.2 (Koren et al. 2017) with the following flags enabled: “genomeSize=30m --pacbio [filename] -haplotypePAT [patname] -haplotypeMAT [matname]”. Parental Illumina short reads were used to enable the binning. The assembly process yielded two chr13 haplotypes, one maternal and one paternal.
##############################################################################################################################################
Alonge M, Soyk S, Ramakrishnan S, Wang X, Goodwin S, Sedlazeck FJ, Lippman ZB, Schatz MC. (2019). RaGOO: fast and accurate reference-guided scaffolding of draft genomes. Genome Biology 20: 224. https://doi.org/10.1186/s13059-019-1829-6
Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. http://www.bioinformatics.babraham.ac.uk/projects/fastqc/. Accessed 25 Sept 2019
Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, et al. (2008). Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456: 53–59. https://doi.org/10.1038/nature07517
Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. (2021). BRAKER2: Automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genomics and Bioinformatics 3: lqaa108. https://doi.org/10.1093/nargab/lqaa108
Bolger AM, Lohse M, Usadel B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30: 2114–2120. https://doi.org/10.1093/bioinformatics/btu170
Chan PP, Lowe TM. (2019). tRNAscan-SE: searching for tRNA genes in genomic sequences. In M. Kollmar (Ed.), Gene prediction. Methods in molecular biology, vol. 1962 (pp. 1–14). Humana. https://doi.org/10.1007/978-1-4939-9173-0_1
Daneček P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, et al. (2021). Twelve years of SAMtools and BCFtools. GigaScience 10: giab008. https://doi.org/10.1093/gigascience/giab008
Dobin A, Gingeras TR. (2015). Mapping RNA-seq reads with STAR. Current Protocols in Bioinformatics 51: 11–14. https://doi.org/10.1002/0471250953.bi1114s51
Feron R, Jaron KS. (2021). SexGenomicsToolkit/PSASS: 3.1.0. https://doi.org/10.5281/zenodo.4442702
Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. (2020). RepeatModeler2 for automated genomic discovery of transposable element families. Proceedings of the National Academy of Sciences USA 117: 9451–9457. https://doi.org/10.1073/pnas.1921046117
Garrison E, Marth G (2012). Haplotype-based variant detection from short-read sequencing (Version 2). arXiv. https://doi.org/10.48550/ARXIV.1207.3907
Guan D, McCarthy SA, Wood J, Howe K, Wang Y, Durbin R. (2020). Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics 36: 2896–2898. https://doi.org/10.1093/bioinformatics/btaa025
Hu J, Fan J, Sun Z, Liu S. (2020). NextPolish: A fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36: 2253–2255. https://doi.org/10.1093/bioinformatics/btz891
Humann JL, Lee T, Ficklin S, Main D. (2019). Structural and functional annotation of eukaryotic genomes with GenSAS. In M. Kollmar (Ed.), Gene Prediction. Methods in Molecular Biology, vol. 1962. Humana. https://doi.org/10.1007/978-1-4939-9173-0_3
Kolmogorov M, Bickhart DM, Behsaz B, Gurevich A, Rayko M, Shin SB, Kuhn K, Yuan J, Polevikov E, Smith TPL, et al. (2020). metaFlye: scalable long-read metagenome assembly using repeat graphs. Nature Methods 17: 1103–1110. https://doi.org/10.1038/s41592-020-00971-x
Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. (2017). Canu: Scalable and accurate long-read assembly via adaptive k -mer weighting and repeat separation. Genome Research 27: 722–736. https://doi.org/10.1101/gr.215087.116
Krueger F, James F, Ewels P, Afyounian E, Weinstein M, Schuster-Boeckler B, Hulselmans G, sclamons. (2023). FelixKrueger/TrimGalore: V0.6.10 - add default decompression path (0.6.10) [Computer software]. Zenodo. https://doi.org/10.5281/ZENODO.7598955
Laetsch DR, Blaxter ML. (2017). BlobTools: Interrogation of genome assemblies. F1000Research 6: 1287. https://doi.org/10.12688/f1000research.12232.1
Lagesen K, Hallin P, Rødland EA, Stærfeldt H-H, Rognes T, Ussery DW. (2007). RNAmmer: Consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Research 35: 3100–3108. https://doi.org/10.1093/nar/gkm160
Langmead B, Salzberg SL. (2012). Fast gapped-read alignment with bowtie 2. Nature Methods 9: 357–359. https://doi.org/10.1038/nmeth.1923
Li H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34: 3094–3100. https://doi.org/10.1093/bioinformatics/bty191
Li H, Durbin R. (2009). Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25: 1754–1760. https://doi.org/10.1093/bioinformatics/btp324
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics 25: 2078–2079. https://doi.org/10.1093/bioinformatics/btp352
Lomsadze A, Burns PD, Borodovsky M. (2014). Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Research 42: e119. https://doi.org/10.1093/nar/gku557
Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. (2021). BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Molecular Biology and Evolution, 38: 4647–4654. https://doi.org/10.1093/molbev/msab199
Martin M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: 10–12. https://doi.org/10.14806/ej.17.1.200
Novák P, Ávila Robledillo L, Koblížková A, Vrbová I, Neumann P, Macas J. (2017). TAREAN: A computational tool for identification and characterization of satellite DNA from unassembled short reads. Nucleic Acids Res 45: e111–e111. https://doi.org/10.1093/nar/gkx257
Quinlan AR, Hall IM. (2010). BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26: 841–842. https://doi.org/10.1093/bioinformatics/btq033
Smit AF, Hubley R. (2008-2015). RepeatModeler-1.0. Retrieved from http://www.repeatmasker.org
Smit AFA, Hubley R, Green P. (2013–2015). RepeatMasker Open-4.0. Retrieved from http://www.repeatmasker.org. Accessed 2 Jan 2022
Stanke M, Morgenstern B. (2005). AUGUSTUS: A web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Research 33: W465–W467. https://doi.org/10.1093/nar/gki458
