Genomic identification of direct seeding and evolutionary lineages by combining heterogeneous genomic resources
Data files
Aug 20, 2025 version (1.56 GB in total)
- CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.anno.emapper.annotations.xlsx (4.19 MB)
- CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.anno.gtf (47.15 MB)
- CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.fasta.gz (145.40 MB)
- EvoLin_SNPs.vcf (413.57 MB)
- Genotyping_Error_ddRAD_WGS.vcf (112.67 MB)
- Phylogenetic_SNPs.vcf (17.78 MB)
- README.md (3.68 KB)
- Seeding_detection_SNPs.vcf (496.38 MB)
- SNP.panel.vcf (321.78 MB)
Abstract
Background
Human-induced habitat changes threaten biodiversity, prompting large-scale restoration initiatives. Revegetation through direct seeding is common in agricultural and infrastructure construction projects, yet the provenance of seed material and its genetic impacts on natural populations remain underexplored. Introducing foreign ecotypes can lead to unintended consequences, as they may be adapted to different environmental conditions or represent distinct evolutionary lineages. In Switzerland, direct seeding is widely used to promote dry meadows, often using seeds of the Carthusian pink (Dianthus carthusianorum).
Results
To assess the extent and genetic effects of direct seeding and infer seed provenances, we combined genomic data from 446 samples collected in independent, smaller-scale studies. We assembled a chromosome-level reference genome to map reads and developed a panel of 48,299 representative single nucleotide polymorphisms (SNPs). We identified six evolutionarily significant units (ESUs) within the European distribution range of D. carthusianorum. As biodiversity promotion efforts are often coordinated nationally, we focused on populations in Switzerland, where we found five ESUs: four occur naturally, and one was introduced from Eastern Europe. Our combined genomic data revealed that 15 of 31 randomly sampled populations across Switzerland (48.4%) originated from direct seeding. Allochthonous seed material was detected in eight populations (25.8%), with six of these showing admixture involving two to three ESUs.
Conclusions
Our results demonstrate the effectiveness of genomic approaches for identifying direct seeding and clarifying seed provenance, thereby supporting decision-making in national revegetation projects and emphasising the importance of using autochthonous seed sources.
Dianthus carthusianorum Reference Genome (CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.fasta.gz) from the paper
Summary
High-quality genome assembly and annotation for Dianthus carthusianorum (haplotype 1),
generated using PacBio HiFi reads and Omni-C scaffolding. Annotation was performed with BRAKER3
using three sources of evidence: short-read RNA-seq from leaf and flower bud tissues of the
same individual (extracted from the BAM file sent by Dovetail/Cantata); short-read RNA-seq from
young bud, intermediate bud, young flower, open flower, young leaf, and pollinated flower of
an individual from Wallis; and the Viridiplantae partition of OrthoDB v11. More details can be found in the manuscript.
The version uploaded here is the one used for all analyses. It differs only in scaffold naming and orientation from ethDiCart_GR_1.1, which was submitted to NCBI (PRJNA1259412).
Reference genome
CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.fasta.gz
Structural annotation
CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.anno.gtf
Functional annotation
CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.anno.emapper.annotations.xlsx
Assembly Details
- File: CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.fasta.gz
- COMPOSITION A = 157153546 (31.5%), C = 92322674 (18.5%), G = 92285134 (18.5%), T = 156903352 (31.5%), N = 6900 (0.0%), CpG = 30358188 (6.1%)
- SCAFFOLD sum = 498671606, n = 62, mean = 8043090.41935484, largest = 39410985, smallest = 11396
- SCAFFOLD N50 = 33218876, L50 = 8
- SCAFFOLD N60 = 32392048, L60 = 9
- SCAFFOLD N70 = 31813274, L70 = 11
- SCAFFOLD N80 = 30730960, L80 = 12
- SCAFFOLD N90 = 30263282, L90 = 14
- SCAFFOLD N100 = 11396, L100 = 62
- CONTIG sum = 498664706, n = 131, mean = 3806600.80916031, largest = 25413205, smallest = 1324
- CONTIG N50 = 11463605, L50 = 15
- CONTIG N60 = 9809675, L60 = 20
- CONTIG N70 = 8828776, L70 = 26
- CONTIG N80 = 6382660, L80 = 32
- CONTIG N90 = 4014208, L90 = 42
- CONTIG N100 = 1324, L100 = 131
- GAP sum = 6900, n = 69, mean = 100, largest = 100, smallest = 100
All 15 chromosomes (n = 15) are represented.
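The N50/L50 figures above follow the usual definition: sort the sequences from longest to shortest and walk down the list until the running total reaches the given percentage of the assembly size. The sketch below recomputes scaffold-level values from the FASTA file, which may be useful as a sanity check after download. It is a minimal illustration, not the tool that produced the statistics above. The function names (`fasta_lengths`, `nx_lx`, `gc_fraction`, `scaffold_report`) are hypothetical, and contig-level values are not covered because they would additionally require splitting scaffolds at the runs of N.

```python
import gzip
from typing import Dict, Iterable, List, Optional, Tuple

# Base counts copied from the COMPOSITION line of this README (N excluded).
README_BASE_COUNTS: Dict[str, int] = {
    "A": 157153546,
    "C": 92322674,
    "G": 92285134,
    "T": 156903352,
}


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of every record in FASTA-formatted text lines."""
    lengths: List[int] = []
    current: Optional[int] = None  # None until the first header is seen
    for line in lines:
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line.strip())
    if current is not None:
        lengths.append(current)
    return lengths


def nx_lx(lengths: Iterable[int], percent: int = 50) -> Tuple[int, int]:
    """Return (Nx, Lx) for the given percentage of the total length.

    Nx is the length of the sequence at which the running total, taken over
    sequences sorted longest first, reaches `percent` % of the assembly size.
    Lx is the number of sequences needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length, count
    raise ValueError("no sequence lengths given")


def gc_fraction(counts: Dict[str, int]) -> float:
    """Return (G + C) / (A + C + G + T) from a dictionary of base counts."""
    acgt = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / acgt


def scaffold_report(path: str) -> None:
    """Print scaffold Nx/Lx values for a gzipped FASTA file."""
    with gzip.open(path, "rt") as handle:
        lengths = fasta_lengths(handle)
    print(f"SCAFFOLD sum = {sum(lengths)}, n = {len(lengths)}")
    for percent in (50, 60, 70, 80, 90, 100):
        nx, lx = nx_lx(lengths, percent)
        print(f"SCAFFOLD N{percent} = {nx}, L{percent} = {lx}")
```

Calling `scaffold_report("CARY_DIA_CAR_CH_CHANT_001_HiFi_OMNI-C_HAP1_COR.fasta.gz")` assumes the file sits in the working directory. Applied to the base counts listed above, `gc_fraction(README_BASE_COUNTS)` gives a GC content of roughly 37%.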
Dianthus carthusianorum SNP panel and SNP sets from the paper "Genomic identification of direct seeding and evolutionary lineages by combining heterogeneous genomic resources"
Summary
This dataset contains a curated SNP panel and four SNP sets from ddRAD and WGS sequencing data in
Dianthus carthusianorum and related taxa. Each dataset is tailored for a specific purpose:
evolutionary lineage inference, seeding detection, phylogenetics, and genotyping error estimation.
SNP.panel.vcf
- Purpose: Serves as the reference panel for downstream SNP set generation
- Samples: 136 individuals (42 ddRAD-SE, 63 ddRAD-PE, 31 ddRAD-WGS replicates)
EvoLin_SNPs.vcf
- Purpose: Assess evolutionary lineages, genetic structure and divergence within D. carthusianorum
- Samples: 198 individuals (42 ddRAD-SE, 63 ddRAD-PE, 31 ddRAD-WGS, 62 WGS)
Phylogenetic_SNPs.vcf
- Purpose: Resolve broader phylogenetic relationships among Dianthus species
- Samples: 37 individuals (including 5 related species + D. carthusianorum ESUs)
Seeding_detection_SNPs.vcf
- Purpose: Detect direct seeding and admixture in Swiss populations
- Samples: 310 WGS individuals (10 per population)
Genotyping_Error_ddRAD_WGS.vcf
- Purpose: Assess error between ddRAD and WGS genotyping
- Samples: 31 ddRAD-WGS replicated individuals
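
The sample counts listed above can be checked against the downloaded files, because a VCF file names its samples in the `#CHROM` header line, after the nine fixed columns (`CHROM` through `FORMAT`). The following is a minimal sketch under the assumption that the files are uncompressed, tab-delimited VCFs in a single directory. The names `EXPECTED_SAMPLES`, `vcf_sample_names`, and `check_sample_counts` are hypothetical, and the expected counts are simply those stated in this README.

```python
import os
from typing import Dict, Iterable, List

# Sample counts as stated in this README.
EXPECTED_SAMPLES: Dict[str, int] = {
    "SNP.panel.vcf": 136,
    "EvoLin_SNPs.vcf": 198,
    "Phylogenetic_SNPs.vcf": 37,
    "Seeding_detection_SNPs.vcf": 310,
    "Genotyping_Error_ddRAD_WGS.vcf": 31,
}


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample names from the #CHROM header line of VCF text."""
    for line in lines:
        if line.startswith("#CHROM"):
            # Columns 1-9 are fixed (CHROM ... INFO, FORMAT); samples follow.
            return line.rstrip("\r\n").split("\t")[9:]
        if not line.startswith("#"):
            break  # reached the data records without finding the header
    raise ValueError("no #CHROM header line found")


def check_sample_counts(directory: str = ".") -> Dict[str, bool]:
    """Compare the sample count in each VCF header with the README value."""
    results: Dict[str, bool] = {}
    for name, expected in EXPECTED_SAMPLES.items():
        with open(os.path.join(directory, name), "rt") as handle:
            results[name] = len(vcf_sample_names(handle)) == expected
    return results
```

Only the header is read, so the check stays fast even for the larger files. A `False` entry in the result of `check_sample_counts()` would point to a truncated download or a mismatch with the counts given here.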
