Ion Torrent data for the genome assembly and phylogenomic placement of mitochondrial genomes with a focus on houndsharks (Chondrichthyes: Triakidae)
Data files
Jan 31, 2024 version files 4.25 MB
- 1_IonTorrent_NGS_Filtered_RawData.zip
- README.md
Abstract
Here, we present the Ion Torrent® next-generation sequencing (NGS) data for five houndsharks (Chondrichthyes: Triakidae): Galeorhinus galeus (17,487 bp; GenBank accession number ON652874), Mustelus asterias (16,708 bp; ON652873), Mustelus mosis (16,755 bp; ON075077), Mustelus palumbes (16,708 bp; ON075076), and Triakis megalopterus (16,746 bp; ON075075). All assembled mitogenomes encode 13 protein-coding genes (PCGs), two ribosomal (r)RNA genes, and 22 transfer (t)RNA genes (tRNALeu and tRNASer are duplicated), except for G. galeus, which contains 23 tRNA genes because tRNAThr is duplicated. The data presented in this paper can assist other researchers in further elucidating the diversification of triakid species and the phylogenetic relationships within Carcharhiniformes (groundsharks) as mitogenomes accumulate in public repositories.
README
This README file was generated on 2024-01-31 by Jessica Winn
GENERAL INFORMATION
Title of the journal article that uses this data set: Ion Torrent data for the genome assembly and phylogenomic placement of mitochondrial genomes with a focus on houndsharks (Chondrichthyes: Triakidae).
Author Information:
A. Principal Investigator contact information
Name: Jessica C. Winn
Institution: Stellenbosch University
Address: Molecular Breeding and Biodiversity Group, Department of Genetics, Stellenbosch University, Stellenbosch, Western Cape, 7602, South Africa.
Email: jessica.winn16@gmail.com
ORCiD: https://orcid.org/0000-0003-1070-1276
B. Co-investigator contact information
Name: Simo N. Maduna
Institution: Norwegian Institute of Bioeconomy Research
Address: Department of Ecosystems in the Barents Region, Svanhovd Research Station, Norwegian Institute of Bioeconomy Research, 9925 Svanvik, Norway.
Email: simo.maduna@nibio.no
ORCiD: https://orcid.org/0000-0002-9372-4360
C. Co-investigator contact information
Name: Aletta E. Bester-van der Merwe
Institution: Stellenbosch University
Address: Molecular Breeding and Biodiversity Group, Department of Genetics, Stellenbosch University, Stellenbosch, Western Cape, 7602, South Africa.
Email: aeb@sun.ac.za
ORCiD: https://orcid.org/0000-0002-0332-7864
Data collection:
Genomic DNA - Standard CTAB protocol or SDS-based lysis buffer (PL2) from the NucleoSpin Plant II mini kit (MACHEREY-NAGEL, Dueren, Germany).
DNA quality control - Qubit 4.0 fluorometer (ThermoFisher Scientific) and LabChip GXII Touch (PerkinElmer, Waltham, MA, USA).
Library preparation - Ion Plus Fragment Library Kit (ThermoFisher Scientific) according to the manufacturer's protocol, Ion Xpress Plus gDNA Fragment Library Preparation User Guide (MAN0009847 K.0).
NGS Sequencing - Ion GeneStudio S5 Prime System and postprocessing with Torrent Suite version 5.16 under default settings at the Central Analytical Facility (CAF) at Stellenbosch University.
Date of data collection (range): Fin clip sampling (2014-2019), Ion Torrent NGS sequencing (2022), Data analysis (2022-2023)
Geographic location of data collection: Fin clip tissue samples of Galeorhinus galeus, Mustelus palumbes, and Triakis megalopterus were collected along the coast of South Africa. Mustelus asterias and Mustelus mosis were sampled off the coasts of Wales and the Sultanate of Oman, respectively.
Information about funding sources that supported the collection of the data: MND210614611484, National Research Foundation, South Africa.
#########################################################################
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: CC0 1.0 Universal (CC0 1.0) Public Domain
Links to publications that cite or use the data:
Winn, J. C., Bester-van der Merwe, A. E., Maduna, S. N. (2024). Ion Torrent data for the genome assembly and phylogenomic placement of mitochondrial genomes with a focus on houndsharks (Chondrichthyes: Triakidae). Data in Brief. In Press.
Winn, J. C., Bester-van der Merwe, A. E., Maduna, S. N. (2024). Annotated bioinformatic pipelines for the genome assembly and phylogenomic placement of mitochondrial genomes with a focus on houndsharks (Chondrichthyes: Triakidae). Current Protocols. In Preparation.
Winn, J. C., Bester-van der Merwe, A. E., Maduna, S. N. (2024). A comprehensive phylogenomic study unveils evolutionary patterns and challenges in the mitochondrial genomes of Carcharhiniformes: A focus on Triakidae. Genomics. 11(1): 110771. doi:10.1016/j.ygeno.2023.110771.
- Links to other publicly accessible locations of the data:
Assembled mitochondrial genomes
Repository name: GenBank
Data identification number: ON075075, ON075076, ON075077, ON652873, and ON652874
Direct URL to data: https://www.ncbi.nlm.nih.gov/nuccore/ON075075; https://www.ncbi.nlm.nih.gov/nuccore/ON075076; https://www.ncbi.nlm.nih.gov/nuccore/ON075077; https://www.ncbi.nlm.nih.gov/nuccore/ON652873; https://www.ncbi.nlm.nih.gov/nuccore/ON652874.
Raw Ion Torrent® next-generation sequencing (NGS) data
Repository name: BioProject *
Data identification number: PRJNA997468
BioSample accessions: SAMN36680060, SAMN36680061, SAMN36680062, SAMN36680063, SAMN36680064
Direct URL to data: https://www.ncbi.nlm.nih.gov/bioproject/997468; https://www.ncbi.nlm.nih.gov/biosample/36680060; https://www.ncbi.nlm.nih.gov/biosample/36680061; https://www.ncbi.nlm.nih.gov/biosample/36680062; https://www.ncbi.nlm.nih.gov/biosample/36680063; https://www.ncbi.nlm.nih.gov/biosample/36680064.
* The data has been uploaded as a BioProject onto the SRA database, but it has been suppressed until the release of a related manuscript.
Links/relationships to ancillary data sets: None
Was data derived from another source? No
Recommended citation for this dataset:
Winn, Jessica; Bester-Van der Merwe, Aletta; Maduna, Simo (Forthcoming 2024). Ion Torrent data for the genome assembly and phylogenomic placement of mitochondrial genomes with a focus on houndsharks (Chondrichthyes: Triakidae) [Dataset]. Dryad. https://doi.org/10.5061/dryad.sj3tx969h
#########################################################################
DATA & FILE OVERVIEW
- File List:
A) 1_IonTorrent_NGS_Filtered_RawData.zip
Data_1_Galeorhinus_galeus_IonTorrent_Filtered_RawData.bam
Data_2_Mustelus_asterias_IonTorrent_Filtered_RawData.bam
Data_3_Mustelus_mosis_IonTorrent_Filtered_RawData.bam
Data_4_Mustelus_palumbes_IonTorrent_Filtered_RawData.bam
Data_5_Triakis_megalopterus_IonTorrent_Filtered_RawData.bam
Data_6_Galeorhinus_galeus_Sanger_Forward
Data_7_Galeorhinus_galeus_Sanger_Reverse
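After downloading, the archive contents can be compared against the file list above. The following is a minimal Python sketch, not part of the original data package. The helper name missing_files is hypothetical, and because the folder layout inside the archive and the extensions of the Sanger files are not stated in this README, the check matches on the file stem only.

```python
import zipfile
from pathlib import PurePosixPath

# File stems as listed in this README. Extensions and folder layout inside
# the archive are assumptions, so matching is done on the stem only.
EXPECTED_STEMS = [
    "Data_1_Galeorhinus_galeus_IonTorrent_Filtered_RawData",
    "Data_2_Mustelus_asterias_IonTorrent_Filtered_RawData",
    "Data_3_Mustelus_mosis_IonTorrent_Filtered_RawData",
    "Data_4_Mustelus_palumbes_IonTorrent_Filtered_RawData",
    "Data_5_Triakis_megalopterus_IonTorrent_Filtered_RawData",
    "Data_6_Galeorhinus_galeus_Sanger_Forward",
    "Data_7_Galeorhinus_galeus_Sanger_Reverse",
]


def missing_files(archive):
    """Return the expected stems that have no matching member in the zip.

    archive -- a path or a binary file object pointing at the zip archive
    """
    with zipfile.ZipFile(archive) as zf:
        # Keep only base names of regular members (skip directory entries).
        names = [PurePosixPath(n).name for n in zf.namelist()
                 if not n.endswith("/")]
    return [stem for stem in EXPECTED_STEMS
            if not any(name.startswith(stem) for name in names)]
```

Calling missing_files("1_IonTorrent_NGS_Filtered_RawData.zip") should return an empty list if every listed file is present.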
- Relationship between files, if important:
The GenBank files for the Triakidae species assembled in this study are the output files created during assembly of the raw Ion Torrent sequences (under 1_IonTorrent_NGS_Filtered_RawData).
Additional related data collected that was not included in the current data package: None
Are there multiple versions of the dataset? No
A. If yes, name of file(s) that was updated: NA
i. Why was the file updated? NA
ii. When was the file updated? NA
#########################################################################
DATA-SPECIFIC INFORMATION FOR: 1_IonTorrent_NGS_Filtered_RawData.zip
Data type: Raw mitogenomic Ion Torrent® NGS data files in BAM format for Galeorhinus galeus, Mustelus asterias, Mustelus mosis, Mustelus palumbes, and Triakis megalopterus.
Data processing:
Adaptors and poor-quality bases (Phred score below 20) have been trimmed and reads shorter than 25 base pairs (bp) removed in Torrent Suite Version 5.16.
Raw reads were aligned to the Mustelus mustelus mitogenome (NC_039629.1) using the Geneious read mapper with medium sensitivity settings and five iterations in Geneious Prime (version 2019.1.3).
The reads that mapped to the reference mitogenome were then saved in BAM format as filtered Ion Torrent reads.
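To illustrate the quality and length thresholds stated above (Phred 20, 25 bp), here is a simplified Python sketch. It is not the Torrent Suite implementation, whose trimming algorithm is not described in this README and may differ; the function name trim_read and the simple 3'-end clipping rule are assumptions made for illustration only.

```python
def trim_read(seq, quals, min_q=20, min_len=25):
    """Clip low-quality bases from the 3' end; drop the read if too short.

    seq   -- base calls as a string
    quals -- Phred scores, one integer per base
    Returns (seq, quals) after trimming, or None if the read is discarded.
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the threshold.
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    # Discard reads that fall below the minimum length after trimming.
    if end < min_len:
        return None
    return seq[:end], quals[:end]
```

A 30-base read whose last four bases score below 20 would be clipped to 26 bases and kept, whereas one clipped to 20 bases would be discarded.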
- Specialized formats or other abbreviations used: NA
#########################################################################
BIOINFORMATICS PIPELINE
The full bioinformatics pipeline and code describing how to process the data will be available as a Current Protocols manuscript: Winn, J. C., Bester-van der Merwe, A. E., Maduna, S. N. (2024). Annotated bioinformatic pipelines for the genome assembly and phylogenomic placement of mitochondrial genomes with a focus on houndsharks (Chondrichthyes: Triakidae).
Methods
Data collection
Genomic DNA extraction: Standard CTAB protocol or SDS-based lysis buffer (PL2) from the NucleoSpin Plant II mini kit (MACHEREY-NAGEL, Dueren, Germany); DNA quality control: Qubit 4.0 fluorometer (ThermoFisher Scientific) and LabChip® GXII Touch (PerkinElmer, Waltham, MA, USA); Library preparation: Ion Plus Fragment Library Kit (ThermoFisher Scientific) according to the manufacturer’s protocol, Ion Xpress™ Plus gDNA Fragment Library Preparation User Guide (MAN0009847 K.0); NGS Sequencing: Ion GeneStudio™ S5 Prime System and postprocessing with Torrent Suite version 5.16 under default settings at the Central Analytical Facility (CAF) at Stellenbosch University.
Data processing
For the five houndshark species for which sequencing data were generated, sequence quality was checked in FastQC; adaptors and poor-quality bases (Phred score below 20) were trimmed and reads shorter than 25 bp were removed in Torrent Suite version 5.16. Raw reads were aligned to the Mustelus mustelus mitogenome (NC_039629.1) using the Geneious read mapper with medium sensitivity settings and five iterations in Geneious Prime (version 2019.1.3). The reads that mapped to the reference mitogenome were then saved in BAM format as filtered Ion Torrent reads. These filtered reads were used for mitogenome assembly in SPAdes v.3.15, with the input set for unpaired Ion Torrent reads, 8 threads, k-mer sizes of 21, 33, 55, 77, 99, and 127, and the careful option enabled to reduce the number of mismatches and short indels; all other parameters were left at their defaults.
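The SPAdes settings described above could be expressed as a command line roughly as follows. This Python sketch only builds the argument list. The exact command used by the authors is not given here, so the mapping of settings to the flags --iontorrent, -s, -t, -k, and --careful is an assumption based on the SPAdes command-line options, and the output directory name is hypothetical. SPAdes documents BAM input for unmapped Ion Torrent reads; if these filtered BAM files are not accepted directly, conversion to FASTQ may be needed first.

```python
import shlex


def spades_command(reads_bam, out_dir, threads=8):
    """Build a SPAdes argument list matching the settings described above."""
    return [
        "spades.py",
        "--iontorrent",              # Ion Torrent read mode
        "-s", reads_bam,             # unpaired (single) reads
        "-t", str(threads),          # number of threads
        "-k", "21,33,55,77,99,127",  # k-mer sizes
        "--careful",                 # reduce mismatches and short indels
        "-o", out_dir,               # output directory (hypothetical name)
    ]


cmd = spades_command(
    "Data_1_Galeorhinus_galeus_IonTorrent_Filtered_RawData.bam",
    "spades_G_galeus",
)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where SPAdes is installed.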
Usage notes
BAM: Raw filtered Ion Torrent® NGS data files in BAM format can be viewed in sequence analysis software. Here we used Geneious Prime v.2019.1.3 and SPAdes v.3.15.
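For a quick sanity check without dedicated software, a BAM header can also be read with the Python standard library alone, since BAM files are stored as concatenated gzip-compatible (BGZF) blocks. The sketch below is illustrative only. The function name read_bam_header is hypothetical, and whether these particular files carry header text or reference entries is not stated here. For routine work, tools such as samtools (samtools view -H) are the usual choice.

```python
import gzip
import struct


def read_bam_header(source):
    """Return (header_text, [(ref_name, ref_length), ...]) from a BAM file.

    source -- a path or a binary file object
    """
    # BGZF blocks are valid gzip members, so gzip can decompress the stream.
    with gzip.open(source, "rb") as bam:
        if bam.read(4) != b"BAM\x01":
            raise ValueError("not a BAM file")
        (l_text,) = struct.unpack("<i", bam.read(4))
        text = bam.read(l_text).decode("utf-8", "replace").rstrip("\x00")
        (n_ref,) = struct.unpack("<i", bam.read(4))
        refs = []
        for _ in range(n_ref):
            # Each reference: name length, NUL-terminated name, sequence length.
            (l_name,) = struct.unpack("<i", bam.read(4))
            name = bam.read(l_name).rstrip(b"\x00").decode("ascii")
            (l_ref,) = struct.unpack("<i", bam.read(4))
            refs.append((name, l_ref))
    return text, refs
```

For example, read_bam_header("Data_2_Mustelus_asterias_IonTorrent_Filtered_RawData.bam") would return the SAM-style header text and any reference sequences recorded in the file.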