High quality genomes produced from single MinION flow cells clarify polyploid and demographic histories of critically endangered Fraxinus (ash) species
Data files
Dec 18, 2023 version files 14.54 GB
- Figure_1_Data.zip
- Figure_2_Data.zip
- Figure_3_S13_Data.zip
- Figure_S1_Data.zip
- Figure_S10_Data.zip
- Figure_S11_Data.zip
- Figure_S12_Data.zip
- Figure_S14_Data.zip
- Figure_S15_Data.zip
- Figure_S16_Data.zip
- Figure_S2_Data.zip
- Figure_S3_Data.zip
- Figure_S4_Data.zip
- Figure_S5_Data.zip
- Figure_S6_Data.zip
- Figure_S7_Data.zip
- Figure_S8_Data.zip
- Figure_S9_Data.zip
- Forsythia_suspensa_v1.1.cds.fasta.gz
- Forsythia_suspensa_v1.1.fasta.gz
- Forsythia_suspensa_v1.1.gff.gz
- Forsythia_suspensa_v1.1.longest_iso.gff.gz
- Forsythia_suspensa_v1.1.longest_iso.protein.fasta.gz
- Forsythia_suspensa_v1.1.protein.fasta.gz
- Fraxinus_americana_v0.2.1.cds.fasta.gz
- Fraxinus_americana_v0.2.1.fasta.gz
- Fraxinus_americana_v0.2.1.gff.gz
- Fraxinus_americana_v0.2.1.longest_iso.gff.gz
- Fraxinus_americana_v0.2.1.longest_iso.protein.fasta.gz
- Fraxinus_americana_v0.2.1.protein.fasta.gz
- Fraxinus_americana_v0.2.1.repeats.gff.gz
- Fraxinus_americana_v0.fasta.gz
- Fraxinus_americana_v0.gff.gz
- Fraxinus_americana_v0.proteins.fasta.gz
- Fraxinus_americana_v0.repeats.gff.gz
- Fraxinus_americana_v1.fasta.gz
- Fraxinus_americana_v1.gff.gz
- Fraxinus_americana_v1.protein.fasta.gz
- Fraxinus_americana_v1.repeats.gff.gz
- Fraxinus_americana_v1Ragtag.cds.fasta.gz
- Fraxinus_americana_v1Ragtag.fasta.gz
- Fraxinus_americana_v1Ragtag.gff.gz
- Fraxinus_americana_v1Ragtag.longest_iso.gff.gz
- Fraxinus_americana_v1Ragtag.longest_iso.protein.fasta.gz
- Fraxinus_americana_v1Ragtag.protein.fasta.gz
- Fraxinus_nigra_v0.2.1.cds.fasta.gz
- Fraxinus_nigra_v0.2.1.fasta.gz
- Fraxinus_nigra_v0.2.1.gff.gz
- Fraxinus_nigra_v0.2.1.longest_iso.gff.gz
- Fraxinus_nigra_v0.2.1.longest_iso.protein.fasta.gz
- Fraxinus_nigra_v0.2.1.protein.fasta.gz
- Fraxinus_nigra_v0.2.1.repeats.gff.gz
- Fraxinus_nigra_v0.fasta.gz
- Fraxinus_nigra_v0.gff.gz
- Fraxinus_nigra_v0.protein.fasta.gz
- Fraxinus_nigra_v0.repeats.gff.gz
- Fraxinus_nigra_v1.fasta.gz
- Fraxinus_nigra_v1.gff.gz
- Fraxinus_nigra_v1.protein.fasta.gz
- Fraxinus_nigra_v1.repeats.gff.gz
- Fraxinus_nigra_v1Ragtag.cds.fasta.gz
- Fraxinus_nigra_v1Ragtag.fasta.gz
- Fraxinus_nigra_v1Ragtag.gff.gz
- Fraxinus_nigra_v1Ragtag.longest_iso.gff.gz
- Fraxinus_nigra_v1Ragtag.longest_iso.protein.fasta.gz
- Fraxinus_nigra_v1Ragtag.protein.fasta.gz
- Fraxinus_pennsylvanica_v0.fasta.gz
- Fraxinus_pennsylvanica_v0.gff.gz
- Fraxinus_pennsylvanica_v0.protein.fasta.gz
- Fraxinus_pennsylvanica_v0.repeats.gff.gz
- Fraxinus_pennsylvanica_v1.4.1.cds.fasta.gz
- Fraxinus_pennsylvanica_v1.4.1.fasta.gz
- Fraxinus_pennsylvanica_v1.4.1.gff.gz
- Fraxinus_pennsylvanica_v1.4.1.longest_iso.gff.gz
- Fraxinus_pennsylvanica_v1.4.1.longest_iso.protein.fasta.gz
- Fraxinus_pennsylvanica_v1.4.1.protein.fasta.gz
- Fraxinus_pennsylvanica_v1.4.1.repeats.gff.gz
- Fraxinus_pennsylvanica_v1.fasta.gz
- Fraxinus_pennsylvanica_v1.gff.gz
- Fraxinus_pennsylvanica_v1.protein.fasta.gz
- Fraxinus_pennsylvanica_v1.repeats.gff.gz
- Fraxinus_pennsylvanica_v1Ragtag.cds.fasta.gz
- Fraxinus_pennsylvanica_v1Ragtag.fasta.gz
- Fraxinus_pennsylvanica_v1Ragtag.gff.gz
- Fraxinus_pennsylvanica_v1Ragtag.longest_iso.gff.gz
- Fraxinus_pennsylvanica_v1Ragtag.longest_iso.protein.fasta.gz
- Fraxinus_pennsylvanica_v1Ragtag.protein.fasta.gz
- Osmanthus_fragrans_v1.1.cds.fasta.gz
- Osmanthus_fragrans_v1.1.fasta.gz
- Osmanthus_fragrans_v1.1.gff.gz
- Osmanthus_fragrans_v1.1.longest_iso.gff.gz
- Osmanthus_fragrans_v1.1.longest_iso.protein.fasta.gz
- Osmanthus_fragrans_v1.1.protein.fasta.gz
- README.md
Abstract
With populations of threatened and endangered species declining worldwide, efforts are being made to generate high-quality genomic records of these species before they are lost forever. Here, we demonstrate that data from single Oxford Nanopore Technologies (ONT) MinION flow cells can, even in the absence of highly accurate short DNA-read polishing, produce high-quality de novo plant genome assemblies adequate for downstream analyses, such as synteny and ploidy evaluations, paleodemographic analyses, and phylogenomics. This study focuses on three North American ash tree species in the genus Fraxinus (Oleaceae) that were recently added to the International Union for Conservation of Nature (IUCN) Red List as critically endangered. Our results support a whole genome triplication at the base of the Oleaceae as well as a subsequent whole genome duplication shared by Syringa, Osmanthus, Olea, and Fraxinus. Finally, we demonstrate the use of ONT long-read sequencing data to reveal patterns in demographic history.
README: High quality genomes produced from single MinION flow cells clarify polyploid and demographic histories of critically endangered Fraxinus (ash) species
The following data are genome assemblies and annotations generated in this study. For versions v0.2.1, v1.4.1, and v1.1, the genome assemblies were previously published; we generated new annotations for them using GeMoMa.
Description of the data and file structure
Figure_1_Data.zip #contains data for Figure 1
|-- Figure_1a #MCScan files for generating the karyotype plot in Figure 1a
| |-- *.bck #file generated by MCScan for pairwise synteny search
| |-- *.bed #CDS sequence coordinates on assembly
| |-- *.cds #CDS sequence fasta file
| |-- *.des #file generated by MCScan for pairwise synteny search
| |-- *.prj #file generated by MCScan for pairwise synteny search
| |-- *.sds #file generated by MCScan for pairwise synteny search
| |-- *.ssp #file generated by MCScan for pairwise synteny search
| |-- *.suf #file generated by MCScan for pairwise synteny search
| |-- *.tis #file generated by MCScan for pairwise synteny search
| |-- *.anchors #seed synteny blocks
| |-- *.anchors.new #genes within each synteny block in *.anchors.simple file
| |-- *.anchors.simple #more succinct and editable form of the .anchors file
| |-- {sp1}.{sp2}.depth.pdf #syntenic gene depth between sp1 and sp2
| |-- *.last #raw LAST output
| |-- *.last.filtered #filtered LAST output
| |-- *.lifted.anchors #additional anchors to form the final synteny blocks
| |-- {sp1}.{sp2}.pdf #syntenic dot plot between sp1 and sp2
| |-- karyotype.*.pdf #karyotype plot
| |-- layout.* #figure layout file for karyotype plot
| |-- seqids.* #sequence identification file for karyotype plot
|-- *.xlsx #fractionation bias plot source data for each Fraxinus assembly against Vitis vinifera
Figure_2_Data.zip #RDS data for recreating PSMCR plot (The following variables derive from manually set parameters: niters = the number of iterations; n = number of time intervals; maxT = the largest possible value for time to the most recent common ancestor. The following variables are automatically calculated and are not meant to be manually utilized/accessed: n_free_lambdas = Number of free parameters; logLik = log-likelihood; lk = log-likelihood; EMQ = EM-Q scores before and after; RI = relative information; Cpi = normalization factor; theta0 = per-site mutation rate; rho0 = per-site recombination rate; t_k = time scaled to generations; lambda_k = relative population size. RDS files contain 100 bootstrap replicates)
Figure_3_S13_Data.zip #contains data for Figure 3 and Figure S13
|-- Figure_3a_S13a #contains full orthofinder output for Figure 3a and Figure S13a
| |-- Citation.txt #citation info
| |-- Comparative_Genomics_Statistics.tar.gz #run statistics
| |-- Gene_Duplication_Events.tar.gz #summary of gene duplication events
| |-- Gene_Trees.tar.gz #original gene trees for each orthogroup
| |-- Log.txt #run log
| |-- Orthogroup_Sequences.tar.gz #sequences for the genes in each orthogroup
| |-- Orthogroups.tar.gz #info on each orthogroup
| |-- Orthologues.tar.gz #orthologues for every Sp1 vs Sp2 comparison
| |-- Phylogenetic_Hierarchical_Orthogroups.tar.gz #genes represented at each node of the species/gene duplication tree
| |-- Phylogenetically_Misplaced_Genes.tar.gz #genes that appear to be out of place in the gene tree
| |-- Putative_Xenologs.tar.gz #putative xenologs for each species
| |-- Resolved_Gene_Trees.tar.gz #gene trees with nodes labelled
| |-- Single_Copy_Orthologue_Sequences.tar.gz #orthogroups that contain exactly one gene sequence per species
| |-- Species_Tree.tar.gz #txt files for OrthoFinder species tree
|-- Figure_3b #contains Ksrates output data for Figure 3b
| |-- config_Fa.txt #configuration file for ks rate correction using Fraxinus americana
| |-- ortholog_distributions.tar.gz #orthologous gene data between each pair of species (Fa: Fraxinus_americana.fasta, Fp: Fraxinus_pennsylvanica.fasta, Fn: Fraxinus_nigra.fasta, Oe: Olea_europaea.fasta, Of: Osmanthus_fragrans.fasta, So: Syringa_oblata.fasta, Fs: Forsythia_suspensa.fasta, Js: Jasminum_sambac.fasta, Ca: Callicarpa_americana.fasta)
| |-- ortholog_ks_list_db.tsv #file storing the ortholog KS value lists between species pairs
| |-- ortholog_peak_db.tsv #file storing the KS mode estimate between species pairs
| |-- paralog_distributions #files generated during the wgd paralog KS estimation run for the focal species
| |-- rate_adjustment #collects the output files of the substitution rate-adjustment for each species relative to the focal species
|-- Figure_13b #single copy orthologue sequences tree for Figure S13b
| |-- *.log #ASTRAL log files
| |-- *_cat_t1.tre #ASTRAL tree with quartet branch support values
| |-- *_cat_t3.tre #ASTRAL tree with local posterior probability branch support values
| |-- *_cat.tre #concatenated single copy orthologue sequences trees
| |-- *_tree.txt #individual single copy orthologue sequences trees
Figure_S1_Data.zip #NanoPlot reports for Oxford Nanopore Technologies reads
|-- Fraxinus_americana.summaryplots #NanoPlot output for Fraxinus americana
|-- Fraxinus_nigra.summaryplots #NanoPlot output for Fraxinus nigra
|-- Fraxinus_pennsylvanica.summaryplots #NanoPlot output for Fraxinus pennsylvanica
Figure_S2_Data.zip #read coverage histograms for primary (v0; a,c,e) and purged (v1; b,d,f) assemblies
|-- * #directories for assemblies
| |-- *.aln_stats.plot.png #read coverage histogram figure
| |-- *.aln_stats.stats.txt #haploidy stats
| |-- *.aln.sorted.bam.hist #histogram source data
Figure_S3_Data.zip #synonymous substitution (ks) data for primary assemblies (v0)
Figure_S4_Data.zip #ks data for purged assemblies (v1) and Huff et al. (2022) assemblies (v0.2.1)
Figure_S5_Data.zip #ks data for Osmanthus fragrans (v1.1) and Olea europaea (v1.1)
Figure_S6_Data.zip #MCScan files for generating the karyotype plot in Figure S6
|-- *.bck #file generated by MCScan for pairwise synteny search
|-- *.bed #CDS sequence coordinates on assembly
|-- *.cds #CDS sequence fasta file
|-- *.des #file generated by MCScan for pairwise synteny search
|-- *.prj #file generated by MCScan for pairwise synteny search
|-- *.sds #file generated by MCScan for pairwise synteny search
|-- *.ssp #file generated by MCScan for pairwise synteny search
|-- *.suf #file generated by MCScan for pairwise synteny search
|-- *.tis #file generated by MCScan for pairwise synteny search
|-- *.anchors #seed synteny blocks
|-- *.anchors.new #genes within each synteny block in *.anchors.simple file
|-- *.anchors.simple #more succinct and editable form of the .anchors file
|-- {sp1}.{sp2}.depth.pdf #syntenic gene depth between sp1 and sp2
|-- *.last #raw LAST output
|-- *.last.filtered #filtered LAST output
|-- *.lifted.anchors #additional anchors to form the final synteny blocks
|-- {sp1}.{sp2}.pdf #syntenic dot plot between sp1 and sp2
|-- karyotype.*.pdf #karyotype plot
|-- layout.* #figure layout file for karyotype plot
|-- seqids.* #sequence identification file for karyotype plot
Figure_S7_Data.zip #contains data for Figure S7
|-- Figure_S7ad #MCScan files for generating the plots in Figure S7a and S7d
| |-- *.bck #file generated by MCScan for pairwise synteny search
| |-- *.bed #CDS sequence coordinates on assembly
| |-- *.cds #CDS sequence fasta file
| |-- *.des #file generated by MCScan for pairwise synteny search
| |-- *.prj #file generated by MCScan for pairwise synteny search
| |-- *.sds #file generated by MCScan for pairwise synteny search
| |-- *.ssp #file generated by MCScan for pairwise synteny search
| |-- *.suf #file generated by MCScan for pairwise synteny search
| |-- *.tis #file generated by MCScan for pairwise synteny search
| |-- *.anchors #seed synteny blocks
| |-- *.anchors.new #genes within each synteny block in *.anchors.simple file
| |-- *.anchors.simple #more succinct and editable form of the .anchors file
| |-- {sp1}.{sp2}.depth.pdf #syntenic gene depth between sp1 and sp2
| |-- *.last #raw LAST output
| |-- *.last.filtered #filtered LAST output
| |-- *.lifted.anchors #additional anchors to form the final synteny blocks
| |-- {sp1}.{sp2}.pdf #syntenic dot plot between sp1 and sp2
| |-- karyotype.*.pdf #karyotype plot
|-- *.xlsx #fractionation bias plot source data for grape against jasmine
Figure_S8_Data.zip #MCScan files for generating Figure S8
|-- *.bck #file generated by MCScan for pairwise synteny search
|-- *.bed #CDS sequence coordinates on assembly
|-- *.cds #CDS sequence fasta file
|-- *.des #file generated by MCScan for pairwise synteny search
|-- *.prj #file generated by MCScan for pairwise synteny search
|-- *.sds #file generated by MCScan for pairwise synteny search
|-- *.ssp #file generated by MCScan for pairwise synteny search
|-- *.suf #file generated by MCScan for pairwise synteny search
|-- *.tis #file generated by MCScan for pairwise synteny search
|-- *.anchors #seed synteny blocks
|-- *.anchors.new #genes within each synteny block in *.anchors.simple file
|-- *.anchors.simple #more succinct and editable form of the .anchors file
|-- {sp1}.{sp2}.depth.pdf #syntenic gene depth between sp1 and sp2
|-- *.last #raw LAST output
|-- *.last.filtered #filtered LAST output
|-- *.lifted.anchors #additional anchors to form the final synteny blocks
|-- {sp1}.{sp2}.pdf #syntenic dot plot between sp1 and sp2
|-- karyotype.*.pdf #karyotype plot
|-- layout.* #figure layout file for karyotype plot
|-- seqids.* #sequence identification file for karyotype plot
Figure_S9_Data.zip #contains data for Figure S9
|-- Figure_S9a #MCScan files for generating the plots in Figure S9a
| |-- *.bck #file generated by MCScan for pairwise synteny search
| |-- *.bed #CDS sequence coordinates on assembly
| |-- *.cds #CDS sequence fasta file
| |-- *.des #file generated by MCScan for pairwise synteny search
| |-- *.prj #file generated by MCScan for pairwise synteny search
| |-- *.sds #file generated by MCScan for pairwise synteny search
| |-- *.ssp #file generated by MCScan for pairwise synteny search
| |-- *.suf #file generated by MCScan for pairwise synteny search
| |-- *.tis #file generated by MCScan for pairwise synteny search
| |-- *.anchors #seed synteny blocks
| |-- {sp1}.{sp2}.depth.pdf #syntenic gene depth between sp1 and sp2
| |-- *.last #raw LAST output
| |-- *.last.filtered #filtered LAST output
| |-- *.lifted.anchors #additional anchors to form the final synteny blocks
| |-- {sp1}.{sp2}.pdf #syntenic dot plot between sp1 and sp2
|-- *.xlsx #fractionation bias plot source data for Jasminum sambac against Fraxinus americana
Figure_S10_Data.zip #RDS data for recreating PSMCR plot (The following variables derive from manually set parameters: niters = the number of iterations; n = number of time intervals; maxT = the largest possible value for time to the most recent common ancestor. The following variables are automatically calculated and are not meant to be manually utilized/accessed: n_free_lambdas = Number of free parameters; logLik = log-likelihood; lk = log-likelihood; EMQ = EM-Q scores before and after; RI = relative information; Cpi = normalization factor; theta0 = per-site mutation rate; rho0 = per-site recombination rate; t_k = time scaled to generations; lambda_k = relative population size. RDS files contain 100 bootstrap replicates)
Figure_S11_Data.zip #RDS data for recreating PSMCR plot (The following variables derive from manually set parameters: niters = the number of iterations; n = number of time intervals; maxT = the largest possible value for time to the most recent common ancestor. The following variables are automatically calculated and are not meant to be manually utilized/accessed: n_free_lambdas = Number of free parameters; logLik = log-likelihood; lk = log-likelihood; EMQ = EM-Q scores before and after; RI = relative information; Cpi = normalization factor; theta0 = per-site mutation rate; rho0 = per-site recombination rate; t_k = time scaled to generations; lambda_k = relative population size. RDS files contain 100 bootstrap replicates)
Figure_S12_Data.zip #RDS data for recreating PSMCR plot (The following variables derive from manually set parameters: niters = the number of iterations; n = number of time intervals; maxT = the largest possible value for time to the most recent common ancestor. The following variables are automatically calculated and are not meant to be manually utilized/accessed: n_free_lambdas = Number of free parameters; logLik = log-likelihood; lk = log-likelihood; EMQ = EM-Q scores before and after; RI = relative information; Cpi = normalization factor; theta0 = per-site mutation rate; rho0 = per-site recombination rate; t_k = time scaled to generations; lambda_k = relative population size. RDS files contain 100 bootstrap replicates)
Figure_S14_Data.zip #orthologous ks data between each Fraxinus assembly (v1) and Vitis vinifera
Figure_S15_Data.zip #orthologous ks data between each Fraxinus assembly (v1) and the F. pennsylvanica reference assembly (v1.4.1)
Figure_S16_Data.zip #contains data for Figure S16
|-- *.gencov #source data for read depth histogram plots
|-- *.png #read depth histograms generated from Purge Haplotigs for each primary Fraxinus assembly (v0)
Fraxinus_americana_v0.fasta.gz
Fraxinus_americana_v0.gff.gz
Fraxinus_americana_v0.proteins.fasta.gz
Fraxinus_americana_v0.repeats.gff.gz
Fraxinus_nigra_v0.fasta.gz
Fraxinus_nigra_v0.gff.gz
Fraxinus_nigra_v0.protein.fasta.gz
Fraxinus_nigra_v0.repeats.gff.gz
Fraxinus_pennsylvanica_v0.fasta.gz
Fraxinus_pennsylvanica_v0.gff.gz
Fraxinus_pennsylvanica_v0.protein.fasta.gz
Fraxinus_pennsylvanica_v0.repeats.gff.gz
Fraxinus_americana_v1.fasta.gz
Fraxinus_americana_v1.gff.gz
Fraxinus_americana_v1.protein.fasta.gz
Fraxinus_americana_v1.repeats.gff.gz
Fraxinus_nigra_v1.fasta.gz
Fraxinus_nigra_v1.gff.gz
Fraxinus_nigra_v1.protein.fasta.gz
Fraxinus_nigra_v1.repeats.gff.gz
Fraxinus_pennsylvanica_v1.fasta.gz
Fraxinus_pennsylvanica_v1.gff.gz
Fraxinus_pennsylvanica_v1.protein.fasta.gz
Fraxinus_pennsylvanica_v1.repeats.gff.gz
Fraxinus_americana_v1Ragtag.cds.fasta.gz
Fraxinus_americana_v1Ragtag.fasta.gz
Fraxinus_americana_v1Ragtag.gff.gz
Fraxinus_americana_v1Ragtag.longest_iso.gff.gz
Fraxinus_americana_v1Ragtag.longest_iso.protein.fasta.gz
Fraxinus_americana_v1Ragtag.protein.fasta.gz
Fraxinus_nigra_v1Ragtag.cds.fasta.gz
Fraxinus_nigra_v1Ragtag.fasta.gz
Fraxinus_nigra_v1Ragtag.gff.gz
Fraxinus_nigra_v1Ragtag.longest_iso.gff.gz
Fraxinus_nigra_v1Ragtag.longest_iso.protein.fasta.gz
Fraxinus_nigra_v1Ragtag.protein.fasta.gz
Fraxinus_pennsylvanica_v1Ragtag.cds.fasta.gz
Fraxinus_pennsylvanica_v1Ragtag.fasta.gz
Fraxinus_pennsylvanica_v1Ragtag.gff.gz
Fraxinus_pennsylvanica_v1Ragtag.longest_iso.gff.gz
Fraxinus_pennsylvanica_v1Ragtag.longest_iso.protein.fasta.gz
Fraxinus_pennsylvanica_v1Ragtag.protein.fasta.gz
Fraxinus_americana_v0.2.1.cds.fasta.gz
Fraxinus_americana_v0.2.1.fasta.gz
Fraxinus_americana_v0.2.1.gff.gz
Fraxinus_americana_v0.2.1.longest_iso.gff.gz
Fraxinus_americana_v0.2.1.longest_iso.protein.fasta.gz
Fraxinus_americana_v0.2.1.protein.fasta.gz
Fraxinus_americana_v0.2.1.repeats.gff.gz
Fraxinus_nigra_v0.2.1.cds.fasta.gz
Fraxinus_nigra_v0.2.1.fasta.gz
Fraxinus_nigra_v0.2.1.gff.gz
Fraxinus_nigra_v0.2.1.longest_iso.gff.gz
Fraxinus_nigra_v0.2.1.longest_iso.protein.fasta.gz
Fraxinus_nigra_v0.2.1.protein.fasta.gz
Fraxinus_nigra_v0.2.1.repeats.gff.gz
Fraxinus_pennsylvanica_v1.4.1.cds.fasta.gz
Fraxinus_pennsylvanica_v1.4.1.fasta.gz
Fraxinus_pennsylvanica_v1.4.1.gff.gz
Fraxinus_pennsylvanica_v1.4.1.longest_iso.gff.gz
Fraxinus_pennsylvanica_v1.4.1.longest_iso.protein.fasta.gz
Fraxinus_pennsylvanica_v1.4.1.protein.fasta.gz
Fraxinus_pennsylvanica_v1.4.1.repeats.gff.gz
Forsythia_suspensa_v1.1.cds.fasta.gz
Forsythia_suspensa_v1.1.fasta.gz
Forsythia_suspensa_v1.1.gff.gz
Forsythia_suspensa_v1.1.longest_iso.gff.gz
Forsythia_suspensa_v1.1.longest_iso.protein.fasta.gz
Forsythia_suspensa_v1.1.protein.fasta.gz
Osmanthus_fragrans_v1.1.cds.fasta.gz
Osmanthus_fragrans_v1.1.fasta.gz
Osmanthus_fragrans_v1.1.gff.gz
Osmanthus_fragrans_v1.1.longest_iso.gff.gz
Osmanthus_fragrans_v1.1.longest_iso.protein.fasta.gz
Osmanthus_fragrans_v1.1.protein.fasta.gz
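The `*.anchors` files in the MCScan figure archives can be inspected without rerunning MCScan. The following is a minimal Python sketch, assuming the usual JCVI layout in which each synteny block starts with a `###` line and each following line holds two gene identifiers and a score separated by whitespace. The gene names in the example are hypothetical and do not come from the dataset.

```python
from typing import Iterable, List, Tuple

# (gene in species 1, gene in species 2, score kept as text)
AnchorPair = Tuple[str, str, str]


def parse_anchors(lines: Iterable[str]) -> List[List[AnchorPair]]:
    """Group anchor lines into synteny blocks, splitting on '#' separator lines."""
    blocks: List[List[AnchorPair]] = []
    current: List[AnchorPair] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A separator closes the block collected so far, if any.
            if current:
                blocks.append(current)
                current = []
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines rather than guessing
        current.append((fields[0], fields[1], fields[2]))
    if current:
        blocks.append(current)
    return blocks


# Hypothetical example content, not taken from the archives.
example = [
    "###",
    "FaG001\tVvG101\t1200",
    "FaG002\tVvG102\t980",
    "###",
    "FaG050\tVvG300\t450",
]
blocks = parse_anchors(example)
print([len(block) for block in blocks])
```

To read a real file, pass an open file handle, for example `parse_anchors(open(path))`. The score is kept as text because some anchor files may carry non-numeric suffixes.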
Genome assemblies and annotations with versions v0, v1, and v1Ragtag were generated in this study. Version v0 files are the primary Flye assemblies. Version v1 files are the haploid assemblies produced from v0 with Purge Haplotigs. Version v1Ragtag files are the v1 assemblies scaffolded to chromosome level with RagTag, using the Huff et al. (2022) Fraxinus pennsylvanica assembly as the reference. Genome assemblies with versions 0.2.1 and 1.4.1 were not generated in this study; their annotations were publicly available, but in a format that could not be used in our analyses, so we re-annotated these genomes using GeMoMa. Genome assemblies with version 1.1 were also not generated in this study; we generated new annotations with GeMoMa because the originals were not publicly available when the analyses were done.
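The version labels above are encoded in the file names, so the provenance of each assembly or annotation file can be recovered programmatically. The sketch below is one possible way to do this in Python. The regular expression, the `AssemblyFile` type, and the provenance strings are illustrative assumptions, not part of the dataset.

```python
import re
from typing import NamedTuple, Optional


class AssemblyFile(NamedTuple):
    species: str
    version: str
    content: str      # e.g. "fasta", "gff", "longest_iso.protein.fasta"
    provenance: str


# Versions produced in this study, as described in the README text.
PROVENANCE = {
    "v0": "this study: primary Flye assembly",
    "v1": "this study: haploid assembly after Purge Haplotigs",
    "v1Ragtag": "this study: v1 assembly scaffolded with RagTag",
}
EXTERNAL = "previously published assembly, annotated here with GeMoMa"

# Assumed naming pattern: Genus_epithet_vX[.Y.Z][Ragtag].<content>.gz
PATTERN = re.compile(
    r"(?P<genus>[A-Z][a-z]+)_(?P<epithet>[a-z]+)_"
    r"(?P<version>v\d+(?:\.\d+)*(?:Ragtag)?)\."
    r"(?P<content>[a-z_.]+)\.gz"
)


def classify(filename: str) -> Optional[AssemblyFile]:
    """Return species, version, content type, and provenance, or None if no match."""
    match = PATTERN.fullmatch(filename)
    if match is None:
        return None  # e.g. figure archives or README.md
    version = match.group("version")
    return AssemblyFile(
        species=f"{match.group('genus')} {match.group('epithet')}",
        version=version,
        content=match.group("content"),
        provenance=PROVENANCE.get(version, EXTERNAL),
    )


print(classify("Fraxinus_nigra_v1Ragtag.longest_iso.protein.fasta.gz"))
print(classify("Fraxinus_pennsylvanica_v1.4.1.gff.gz"))
```

Any version label not listed in `PROVENANCE` is treated as an external assembly, which matches the v0.2.1, v1.4.1, and v1.1 files described here but would need revisiting if new versions were added.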
Sharing/Access information
Data were derived from the following sources:
- Huff, M., Seaman, J., Wu, D., Zhebentyayeva, T., Kelly, L.J., Faridi, N., Nelson, C.D., Cooper, E., Best, T., Steiner, K. and Koch, J., 2022. A high-quality reference genome for Fraxinus pennsylvanica for ash species restoration and research. Molecular Ecology Resources, 22(4), pp.1284-1302.
- Li, L.F., Cushman, S.A., He, Y.X. and Li, Y., 2020. Genome sequencing and population genomics modeling provide insights into the local adaptation of weeping forsythia. Horticulture Research, 7.
- Yang, X., Yue, Y., Li, H., Ding, W., Chen, G., Shi, T., Chen, J., Park, M.S., Chen, F. and Wang, L., 2018. The chromosome-level quality genome provides insights into the evolution of the biosynthesis genes for aroma compounds of Osmanthus fragrans. Horticulture Research, 5.
Methods
Fraxinus pennsylvanica vouchers and material for DNA extraction were collected from Point Gratiot Park in Dunkirk, New York, USA. F. americana and F. nigra vouchers and material for DNA extraction were collected from the College Lodge Forest in Brocton, New York, USA. Samples were sequenced on an Oxford Nanopore Technologies GridION instrument using MinION flow cells (version R9.4). Genome assemblies were generated using Flye and reduced to haploid assemblies using Purge Haplotigs. The haploid assemblies were then scaffolded using RagTag, with the Huff et al. (2022) Fraxinus pennsylvanica reference assembly as the scaffolding reference. All annotations were generated with GeMoMa.
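To compare contiguity across the assembly stages described above (for example v0 against v1), a summary statistic such as N50 can be computed directly from the gzipped FASTA files. The following Python sketch is an illustration only, not the procedure used in the study, which lists QUAST among its tools. The function names are hypothetical.

```python
import gzip
from typing import Iterable, Iterator


def sequence_lengths(lines: Iterable[str]) -> Iterator[int]:
    """Yield the length of each FASTA record from an iterable of text lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: Iterable[int]) -> int:
    """Return the length L such that sequences of length >= L cover half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for value in ordered:
        running += value
        if running >= half:
            return value
    return 0  # empty input


def assembly_n50(path: str) -> int:
    """Compute N50 for a gzipped FASTA file such as Fraxinus_nigra_v1.fasta.gz."""
    with gzip.open(path, "rt") as handle:
        return n50(sequence_lengths(handle))


# Small in-memory demonstration with made-up sequences.
demo = [">contig1", "ACGT", "AC", ">contig2", "GG"]
print(n50(sequence_lengths(demo)))
```

Because the records are streamed line by line, only the list of sequence lengths is held in memory, which keeps the sketch usable on multi-gigabase assemblies.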
Usage notes
- Guppy v5.0.11
- NanoPlot v1.38.0
- NanoStat v1.5.0
- Flye v2.8.3
- Flye v2.9
- QUAST v5.0.2
- BUSCO v5.4.4
- GeMoMa v1.9
- AGAT v1.0.0
- EvidentialGene
- RepeatModeler v2.0.1
- RepeatMasker v4.0.1
- RagTag v2.1.0
- MCScan (https://github.com/tanghaibao/jcvi/wiki/MCscan-(Python-version))
- HapPy (https://github.com/AntoineHo/HapPy)
- SAMtools v1.14
- Purge Haplotigs v1.1.1
- Minimap2 v2.20
- bcftools v1.14
- PSMCR (https://github.com/emmanuelparadis/psmcr)
- OrthoFinder v2.5.4
- ksrates v1.1.3