A turn in species conservation for hairpin banksias: Demonstration of oversplitting leads to a better management of diversity

Wilson, Trevor 1; Rossetto, Maurizio 1; Bain, David 2; Yap, Jia-Yee 1; Wilson, Peter 1; Stimpson, Margaret 3; Weston, Peter 1; Croft, Larry 4

Published Sep 13, 2022 on Dryad. https://doi.org/10.5061/dryad.69p8cz94x

Data files

Sep 13, 2022 version files (31.84 MB total)

README_WILSON_2022_DATA.txt (9.16 KB)
Wilson_2022_chloroplast_HairpinBanksia_alignment.nex (11.78 MB)
Wilson_2022_chloroplast_HairpinBanksia_metadata.csv (5.97 KB)
Wilson_2022_genotype_HairpinBanksia_Bcunninghamiiclade.csv (5.61 MB)
Wilson_2022_genotype_HairpinBanksia_Bspinulosaclade.csv (8.43 MB)
Wilson_2022_genotype_HairpinBanksia_metadata.csv (38.04 KB)
Wilson_2022_genotype_HairpinBanksia_total.csv (5.96 MB)

Abstract

We generated SNP genotype data and chloroplast genomic data to test the current taxonomy and infer a population-scale evolutionary scenario for the Hairpin Banksias (B. collina, B. cunninghamii, B. neoanglica, B. spinulosa and B. vincentia) and outgroups using a sample-set comprehensive in its representation of morphological diversity and a two-and-a-half thousand kilometer distribution. Here, we provide an archive of these SNP genotype and chloroplast sequence alignment data.

Leaf tissue was sampled from silica-dried material or from herbarium specimens lodged at the National Herbarium NSW, and each sample is identified by its institutional database number.

For nDNA analysis, DNA was extracted from each sample using the Plant DNA Extraction Protocol for DArT available from the Diversity Arrays Technology Pty Ltd (DArT PL) website (Rossetto et al. 2019). Samples were sent to DArT PL (Canberra, Australia) for the DArT PL genotype by sequencing analysis, according to the documented in-house procedure. All specimens were co-analysed first to create a total dataset. Following this, two additional datasets were created through separate reanalysis of the raw data for two separate subclades identified from network analysis of the total dataset.

For cpDNA analysis, leaf material was sent to Deakin Genomics Research and Discovery Facility at Deakin University (Geelong, Australia) for DNA extraction, preparation of genomic libraries consisting of paired-end reads (2 x 150 bp) and paired-end sequencing using the Illumina NextSeq 500 platform.

In silico assembly of chloroplast genomic DNA and SNP detection were performed consistently across all Illumina NextSeq libraries to generate a comparable dataset from the paired-end libraries. Library-specific chloroplast genome sequences were constructed by de novo assembly of the NGS libraries relevant to each species using Organelle Assembler (http://metabarcoding.org/org.asm; Coissac et al., 2016). From each library-specific chloroplast genomic sequence, a consensus sequence was then created from the relevant paired-end library using CLC Bio Genomics Workbench 8.0 (CLC; http://www.clcbio.com). This first involved trimming the raw paired-end reads with the Quality Trimming Tool, then mapping the trimmed reads to the library-specific chloroplast genomic sequence using default settings. The same quality-trimmed reads were then remapped to the resulting library-specific consensus sequence using more stringent mapping parameters (0.8 for the similarity fraction and 0.9 for the length fraction) to maintain a high-quality library-specific consensus sequence with read coverage greater than 5x. Each sequence was annotated using the gene prediction tool GeSeq (Tillich et al., 2017). The annotations were inspected manually in Geneious to confirm that the positions of the start and stop codons were correct.
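The remapping thresholds described above can be illustrated with a minimal Python sketch. This is not the CLC implementation: it assumes that the length fraction is the share of a read that aligns to the reference and that the similarity fraction is the identity within that aligned part. The names ReadAlignment, passes_thresholds and mask_low_coverage are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

MIN_SIMILARITY = 0.8       # similarity fraction quoted in the methods
MIN_LENGTH_FRACTION = 0.9  # length fraction quoted in the methods
MIN_COVERAGE = 5           # consensus retained only where coverage > 5x


@dataclass
class ReadAlignment:
    """Hypothetical summary of one read mapped to the reference."""
    read_length: int     # total bases in the trimmed read
    aligned_length: int  # bases of the read that align to the reference
    matches: int         # aligned bases identical to the reference


def passes_thresholds(aln: ReadAlignment) -> bool:
    """Return True if the read meets both assumed mapping thresholds."""
    if aln.read_length <= 0 or aln.aligned_length <= 0:
        return False
    length_fraction = aln.aligned_length / aln.read_length
    similarity = aln.matches / aln.aligned_length
    return length_fraction >= MIN_LENGTH_FRACTION and similarity >= MIN_SIMILARITY


def mask_low_coverage(consensus: str, depth: list[int]) -> str:
    """Replace consensus bases whose read depth is not above 5x with 'N'."""
    if len(consensus) != len(depth):
        raise ValueError("consensus and depth must have the same length")
    return "".join(
        base if d > MIN_COVERAGE else "N" for base, d in zip(consensus, depth)
    )
```

For example, a 150 bp read with 140 aligned bases and 120 matches passes both thresholds under these assumptions, whereas a read with only 120 aligned bases fails the length fraction.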

An alignment of the relevant library-specific consensus sequences was generated using MAUVE (Darling et al., 2004) in Geneious Pro 9.1.8. The alignment was then curated by removing areas of low coverage and areas containing dubious SNPs (i.e., SNPs in the first or second codons and repetitive regions).
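The curation step above was carried out by inspection in Geneious. As a rough illustration only, the following Python sketch lists variable sites in an alignment and drops those falling inside regions flagged for removal. The function names, the 0-based half-open interval convention and the treatment of '-', 'N' and '?' as missing data are assumptions made for this sketch, not part of the archived workflow.

```python
MISSING = {"-", "N", "?"}  # symbols treated as missing data (assumption)


def variable_sites(alignment: dict[str, str]) -> list[int]:
    """Return 0-based columns with more than one non-missing state."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for i in range(length):
        states = {s[i].upper() for s in seqs} - MISSING
        if len(states) > 1:
            sites.append(i)
    return sites


def drop_flagged(sites: list[int], flagged: list[tuple[int, int]]) -> list[int]:
    """Remove sites inside any flagged region (0-based, half-open intervals)."""
    return [
        i for i in sites
        if not any(start <= i < end for start, end in flagged)
    ]


# Toy alignment with hypothetical sample names.
toy = {"sample_a": "ACGTA", "sample_b": "ACCTA", "sample_c": "A-GTT"}
candidate_snps = variable_sites(toy)
retained_snps = drop_flagged(candidate_snps, [(4, 5)])
```

In practice the flagged intervals would come from the low-coverage and repetitive regions identified during manual inspection.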

REFERENCES:

Coissac, E., P. M. Hollingsworth, S. Lavergne and P. Taberlet. 2016. From barcodes to genomes: Extending the concept of DNA barcoding. Molecular Ecology 25: 1423–1428. https://doi.org/10.1111/mec.1354

Darling, A. C., B. Mau, F. R. Blattner and N. T. Perna. 2004. Mauve: multiple alignment of conserved genomic sequence with rearrangements. Genome Research 14: 1394–1403. https://doi.org/10.1101/gr.2289704

Tillich, M., P. Lehwark, T. Pellizzer, E. S. Ulbricht-Jones, A. Fischer, R. Bock, and S. Greiner. 2017. GeSeq–versatile and accurate annotation of organelle genomes. Nucleic Acids Research 45: W6–W11. https://doi.org/10.1093/nar/gkx391
