Genetic diversity of the zigzag ladybird beetle, Cheilomenes sexmaculata F. (Coleoptera: Coccinellidae) with its distribution in India and implications for biological control
Data files
Dec 18, 2024 version files 128.09 MB
- NGS_P6481_reference.fasta (19.30 MB)
- NGS_P6481.hmp.xls (108.78 MB)
- README.md (5.98 KB)
Abstract
The zigzag beetle, Cheilomenes sexmaculata, the most important and abundant of all ladybird beetles, feeds on a diverse range of prey. This study was conducted to understand the population structure of C. sexmaculata using individuals collected from five different zones, comprising 25 subpopulations that uniformly represent India. From the ddRAD sequence data, contigs, SNPs, and INDELs were identified for all 25 subpopulations collected across India.
README: Genetic diversity of the zigzag ladybird beetle, Cheilomenes sexmaculata F. (Coleoptera: Coccinellidae) with its distribution in India and implications for biological control
This readme file was generated on [2024-11-27] by [Narayana Bhat Devate]
GENERAL INFORMATION
Title of Dataset: Data from: Genetic diversity of the zigzag ladybird beetle, Cheilomenes sexmaculata F. (Coleoptera: Coccinellidae) with its distribution in India and implications for biological control
Author/Principal Investigator Information
Name: Dr H S Rakshith
ORCID:
Institution: ICAR-Indian Agricultural Research Institute, New Delhi, India
Address: ICAR-Indian Agricultural Research Institute, New Delhi, India-110012
Email: rakshithmails@gmail.com
Date of data collection: 2019-05-01 to 2019-06-30
Geographic location of data collection: India viz., Delhi (North), Nagpur (Central), Jorhat (East), Anand (West) and Bengaluru (South)
Information about funding sources that supported the collection of the data: Nil
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: Nil
Links to publications that cite or use the data: DOI:
Links to other publicly accessible locations of the data:
Links/relationships to ancillary data sets: Nil
Was data derived from another source?
If yes, list source(s): Nil
Recommended citation for this dataset: Rakshith et al., 2024, Data from: Genetic diversity of the zigzag ladybird beetle, Cheilomenes sexmaculata F. (Coleoptera: Coccinellidae) with its distribution in India and implications for biological control, Dryad, Dataset
DATA & FILE OVERVIEW
File List: NGS_P6481.hmp: HapMap file containing genetic information on SNPs and INDELs at the listed contigs for all 25 subpopulations of the zigzag ladybird beetle (Cheilomenes sexmaculata) collected across India during 2019; NGS_P6481_reference: sequence file in FASTA format for each contig used in the study
Relationship between files, if important: The SNPs and INDELs in the HapMap file were generated with reference to the contig sequences available in NGS_P6481_reference
Additional related data collected that was not included in the current data package: Nil
Are there multiple versions of the dataset?
If yes, name of file(s) that was updated: Nil
Why was the file updated?
When was the file updated?
Instrument- or software-specific information needed to interpret the data: NA
Standards and calibration information, if appropriate: NA
Environmental/experimental conditions: NA
Describe any quality-assurance procedures performed on the data: Nil
People involved with sample collection, processing, analysis and/or submission: H S Rakshith, Sachin S Suroshe, Amolkumar U Solankhe, Gopalakrishnan S, Keerthi MC, Narayana Bhat Devate, Sujatha G S, Hari Krishna and Ranjith Kumar Ellur
DATA-SPECIFIC INFORMATION FOR: [NGS_P6481_reference]
Number of variables: 69,365 Contigs
Number of cases/column: NA
Variable List: 69,365 contig sequences in FASTA format
Missing data codes: Missing entries are denoted as "NA"
Specialized formats or other abbreviations used: A - Adenine, G - Guanine, C - Cytosine, T - Thymine
Data format: FASTA
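As a quick sanity check on the reference file, the contig count stated above can be verified by reading the FASTA records. The following is a minimal sketch; the function name `read_fasta` and the toy contig names are illustrative assumptions, not part of the dataset.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Collect FASTA records into a {contig_id: sequence} dictionary."""
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence may span several lines
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Toy input standing in for NGS_P6481_reference.fasta (hypothetical contig names).
example = """>contig_1
ACGTAC
GTTA
>contig_2
GGCC
""".splitlines()

contigs = read_fasta(example)
lengths = {name: len(seq) for name, seq in contigs.items()}
print(len(contigs), lengths)
```

To apply this to the deposited file, pass an open file handle instead of the toy lines, for example `with open("NGS_P6481_reference.fasta") as fh: contigs = read_fasta(fh)`; if the file matches this README, `len(contigs)` should equal 69,365.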
DATA-SPECIFIC INFORMATION FOR: [NGS_P6481.hmp]
Number of variables: 523,331
Number of cases/column: 11 identifier columns + 1 reference allele column + 25 subpopulation allele columns
Variable List: 523,331 SNPs and INDELs
Missing data codes: Missing codes are written with the symbol "--"
Specialized formats or other abbreviations used: A - Adenine, G - Guanine, C - Cytosine, T - Thymine, '--' - Missing, '/' - Represents Alleles
SNP Data format: HapMap format - The HapMap file format is a table consisting of 11 columns plus one column for each sample genotyped. The first row contains the header labels of the samples, and each additional row contains all the information associated with a single SNP/INDEL from a contig.
The first 11 columns contain the following details -
1 : rs# : contains the identifier for contig alleles;
2 : alleles : contains alleles according to NCBI database dbSNP;
3 : chrom : contains the contig to which the SNP/INDEL was mapped;
4 : pos : contains the respective position of this SNP on the contig;
5 : strand : contains the orientation of the SNP in the DNA strand: forward (+), reverse (-) or unknown (.) relative to the reference contig;
6 : assembly# : contains the version of reference sequence assembly (from NCBI);
7 : center : contains the name of the genotyping center that produced the genotypes;
8 : protLSID : contains the identifier for the HapMap protocol;
9 : assayLSID : contains the identifier for the HapMap assay used for genotyping;
10 : panelLSID : contains the identifier for the panel of individuals genotyped;
11 : QCcode : contains the quality control code for all entries
From 12th Column onward SNP/INDEL data of zigzag ladybird beetle, Cheilomenes sexmaculata -
12 : Reference : Contig reference allele from the de novo assembly
13 : DL1 : Sample collected from IARI
14 : DL3 : Sample collected from Hissar
15 : DL4 : Sample collected from Karnal
16 : DL5 : Sample collected from Jhansi
17 : DL6 : Sample collected from Panipat
18 : GN2 : Sample collected from Anand
19 : J11 : Sample collected from Navsari
20 : J12 : Sample collected from Dandi
21 : J21 : Sample collected from Anand
22 : J31 : Sample collected from Junagadh
23 : J32 : Sample collected from Porbandar
24 : KA1 : Sample collected from Coimbatore
25 : KA2 : Sample collected from GKVK, Bengaluru
26 : KA4 : Sample collected from Mudigere
27 : KA5 : Sample collected from Tiptur
28 : KA6 : Sample collected from Coorg
29 : MA2 : Sample collected from Warud
30 : NA3 : Sample collected from Akola
31 : NA4 : Sample collected from Chandrapura
32 : NA5 : Sample collected from Akola
33 : NA6 : Sample collected from Amravati
34 : NB1 : Sample collected from Nagpur
35 : WB1 : Sample collected from Cooch Behar
36 : WB2 : Sample collected from Barapani
37 : WB3 : Sample collected from Jorhat
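The column layout listed above can be read programmatically. The sketch below assumes the HapMap file is tab-delimited text (despite the `.xls` extension on the deposited file) and that genotypes are written like `A/G`; both are assumptions to confirm against the actual file. The function names and the three-sample toy row are illustrative only.

```python
import csv
import io

N_META = 11      # rs#, alleles, chrom, pos, strand, ..., QCcode
MISSING = "--"   # missing-data code stated in this README


def read_hapmap(handle):
    """Yield (metadata, reference_allele, calls) for each SNP/INDEL row."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[N_META + 1:]  # column 12 is Reference; samples follow
    for row in reader:
        if not row:
            continue
        meta = dict(zip(header[:N_META], row[:N_META]))
        reference = row[N_META]
        calls = dict(zip(samples, row[N_META + 1:]))
        yield meta, reference, calls


def missing_rate(calls):
    """Fraction of samples whose genotype is the missing code."""
    if not calls:
        return 0.0
    return sum(1 for g in calls.values() if g == MISSING) / len(calls)


# Toy table with 3 of the 25 sample columns (genotype values are made up).
header = ["rs#", "alleles", "chrom", "pos", "strand", "assembly#", "center",
          "protLSID", "assayLSID", "panelLSID", "QCcode",
          "Reference", "DL1", "DL3", "DL4"]
row = ["snp1", "A/G", "contig_1", "42", "+", "NA", "NA",
       "NA", "NA", "NA", "NA", "A", "A/A", "A/G", "--"]
toy = "\t".join(header) + "\n" + "\t".join(row) + "\n"

rows = list(read_hapmap(io.StringIO(toy)))
meta, reference, calls = rows[0]
print(meta["chrom"], meta["pos"], reference, calls, missing_rate(calls))
```

For the full file, replace the `io.StringIO` object with an open file handle; streaming row by row keeps memory use low for the 523,331 variants.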
Methods
The adult C. sexmaculata were collected from five different localities belonging to various zones of India viz., Delhi (North), Nagpur (Central), Jorhat (East), Anand (West) and Bengaluru (South) from May to June 2019. The samples were stored in 100% ethanol and frozen at -70°C. Genomic DNA was isolated from each individual separately by grinding in liquid nitrogen, using the method described earlier (Kim et al., 2012). We employed restriction-site associated DNA sequencing (RADseq) coupled with Illumina sequencing, which produces high coverage of homologous SNP (Single Nucleotide Polymorphism) loci. Population genomic analyses from ddRADseq depend on de novo assembly of a set of reference contigs. Raw SNP/INDEL calls are represented in a single HapMap file.
Methods for processing the data: For samples 1-25, 100-1000 ng of high-quality genomic DNA was completely digested using SphI and MluCI in a 20-50 µl reaction volume (2.4 U per enzyme, New England Biolabs (NEB), Ipswich, MA, USA) and incubated at 37 °C for 90 min. Adapters A and B were ligated after their sticky ends were modified to match SphI and MluCI, respectively. Custom-designed dual-indexed primers were used for the PCR reactions. After the indexed primers were added by PCR, the resulting libraries were pooled based on concentration (Qubit 2.0 fluorometer analysis) and concentrated in a SpeedVac (Eppendorf, Hamburg, Germany). A manual size selection was applied (a range between 450 and 550 bp, which corresponds to a DNA fragment size of interest between 310 and 410 bp) by low-melting 1.5% agarose gel electrophoresis (Bio-Rad Laboratories, Hercules, CA, USA). The final libraries were quantified with a Qubit 2.0 fluorometer (dsDNA kit, Thermo Fisher Scientific) and their quality was checked on a Fragment Analyzer system (DNA High Sensitivity kit, Agilent). The sequencing quality of each sample was checked using FastQC. dDocent version 2.2.10 was used for this project. Once dDocent had confirmed that it recognized the proper number of samples in the current directory, the Trim Galore! program was used for quality trimming of the sequence data. Without reference material, population genomic analyses from ddRADseq depend on de novo assembly of a set of reference contigs. After all executions of FreeBayes were completed, the raw SNP/INDEL calls were concatenated into a single variant call format (VCF) file using VCFtools. The resulting VCF file contains all SNPs, INDELs, MNPs and complex events that were called in 90% of all individuals with a minimum quality score of 30. This VCF file was converted to HapMap format with TASSEL.
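The two filtering criteria described above (a call in at least 90% of individuals and a minimum quality score of 30) can be illustrated on a single VCF record. This is a minimal sketch of the criteria only, not the pipeline's own code, which applied its filters through VCFtools; the function names, the helper `make_line`, and the toy genotypes are hypothetical.

```python
def passes_filters(vcf_line, min_qual=30.0, min_call_rate=0.9):
    """Return True if a VCF data line meets the QUAL and call-rate thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns to evaluate
    qual = fields[5]  # VCF column 6 is QUAL
    if qual == "." or float(qual) < min_qual:
        return False
    format_keys = fields[8].split(":")
    if "GT" not in format_keys:
        return False
    gt_index = format_keys.index("GT")
    genotypes = []
    for sample in fields[9:]:
        parts = sample.split(":")
        genotypes.append(parts[gt_index] if gt_index < len(parts) else ".")
    # A genotype containing "." (e.g. "./.") is treated as not called.
    called = sum(1 for gt in genotypes if "." not in gt)
    return called / len(genotypes) >= min_call_rate


def make_line(qual, genotypes):
    """Build a toy VCF data line (hypothetical contig and position)."""
    fixed = ["contig_1", "42", ".", "A", "G", str(qual), ".", ".", "GT"]
    return "\t".join(fixed + genotypes)


nine_of_ten = make_line(45, ["0/1"] * 9 + ["./."])
eight_of_ten = make_line(45, ["0/1"] * 8 + ["./."] * 2)
low_quality = make_line(12, ["0/1"] * 10)

kept = [passes_filters(line) for line in (nine_of_ten, eight_of_ten, low_quality)]
print(kept)
```

With ten toy samples, a record called in nine individuals sits exactly at the 90% threshold and is retained, while records with eight calls or a quality score below 30 are dropped.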