We present the development of a genomic library using a RADseq (restriction site associated DNA sequencing) protocol for marker discovery that can be applied to evolutionary studies of the sugarcane borer Diatraea saccharalis, an important South American insect pest. A RADtag protocol combined with Illumina paired-end sequencing allowed de novo discovery of 12 811 SNPs and a high-quality assembly of 122.8M paired-end reads from six individuals, representing 40 Gb of sequencing data. Approximately 1.7 Mb of the sugarcane borer genome, distributed over 5289 minicontigs, was obtained by assembling the second reads paired with first-read RADtag loci in which at least one SNP was discovered and genotyped. Minicontigs ranged from 200 to 611 bp in length and were used for functional annotation and microsatellite discovery. These markers will be used in future studies to understand gene flow and adaptation to host plants and control tactics.
Final SNP set.
This file contains the final SNP set (after the filtering steps detailed in the manuscript) in .vcf and .fasta formats. The FASTA file contains the ~80 bp sequence of each RADtag locus. Aligning the sequences from the same locus reveals the SNP alleles carried by each sample.
final_SNP_set.zip
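As a rough illustration of how per-sample genotypes might be pulled from the .vcf file, the sketch below parses VCF text with the Python standard library only. It assumes the usual tab-separated VCF layout with a `GT` key in the FORMAT column. The sample names (`ind1`, `ind2`), the locus ID and the genotype values are invented for the example and do not come from this dataset.

```python
def read_genotypes(vcf_lines):
    """Return (samples, records) parsed from an iterable of VCF text lines."""
    samples = []
    records = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow the FORMAT column
            continue
        chrom, pos, _id, ref, alt = fields[:5]
        gt_index = fields[8].split(":").index("GT")  # position of GT in FORMAT
        calls = {}
        for sample, entry in zip(samples, fields[9:]):
            calls[sample] = entry.split(":")[gt_index]
        records.append(
            {"locus": chrom, "pos": int(pos), "ref": ref, "alt": alt, "calls": calls}
        )
    return samples, records


# Hypothetical two-sample example; real input would be the lines of the .vcf file.
example = [
    "##fileformat=VCFv4.1",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "ind1", "ind2"]),
    "\t".join(["locus_1", "37", ".", "A", "G", ".", "PASS", ".",
               "GT:DP", "0/1:14", "1/1:9"]),
]
samples, records = read_genotypes(example)
for rec in records:
    print(rec["locus"], rec["pos"], rec["ref"], rec["alt"], rec["calls"])
```

For the real file, the same function could be fed an open file handle, since it only iterates over lines.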
Minicontigs associated with RADtag loci.
Minicontigs associated with RADseq loci in .fasta format.
minicontigs_RADtag_loci.fa
Microsatellites found in minicontigs.
This archive contains a table of the identified microsatellites, giving the contig ID, motif and coordinates of each, and a file with the parameters used in MiSA for the SSR search.
microsatellites.zip
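To make the idea of an SSR search concrete, here is a minimal regular-expression sketch that reports perfect tandem repeats with their motif, repeat count and 1-based inclusive coordinates. It is not MiSA and does not reproduce its output format. The minimum-repeat thresholds in `MIN_REPEATS` are hypothetical placeholders; the parameters actually used are in the file included in the archive above.

```python
import re

# Hypothetical thresholds: unit length -> minimum number of tandem repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def is_repeat_of_shorter_unit(motif):
    """True if the motif is itself a repetition of a shorter unit (e.g. 'AA')."""
    for k in range(1, len(motif)):
        if len(motif) % k == 0 and motif[:k] * (len(motif) // k) == motif:
            return True
    return False


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start, end) tuples, 1-based inclusive."""
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured unit followed by at least n-1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_repeat_of_shorter_unit(motif):
                continue  # e.g. do not report a poly-A run as an (AA)n repeat
            count = len(m.group(0)) // unit_len
            hits.append((motif, count, m.start() + 1, m.end()))
    return sorted(hits, key=lambda h: h[2])


# Toy sequence containing an (AT)7 repeat and an (A)12 run.
toy = "GGC" + "AT" * 7 + "CCGT" + "A" * 12 + "TTG"
for hit in find_ssrs(toy):
    print(hit)
```

Applied to each minicontig in turn, a loop of this kind would yield a table with the same kind of columns described above (contig ID, motif, coordinates), although the exact hits depend on the thresholds chosen.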
Repetitive elements identified with RepeatMasker.
This file contains a table of the identified repetitive elements (.tbl), a file with masked contigs and repetitive elements (.fasta format) and a file with contig ID, repetitive element ID and coordinates.
RepeatMasker_minicontigs_RADtag_loci.zip
Blast2GO annotation file.
This .gff file contains the final annotation of the minicontigs, obtained using the Blast2GO pipeline.
minicontigs_final_annotation_Blast2go.gff
Scripts and command lines for each program used in this pipeline.
This file contains all the commands and scripts needed to reproduce the outputs uploaded here (SNPs, minicontigs, microsatellites, repetitive element discovery and functional annotation).
All kmer=27 minicontigs
This file contains minicontigs assembled from all of the second reads we generated. Second reads are generated from the randomly sheared ends of genomic fragments. Because reads from the same genomic region have different start and end positions, they can be assembled into minicontigs longer than the typical read length (~80 bp).
all_minicontigs_random_sheared_ends_kmer27.fa
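The reasoning above, that staggered reads from one region can be joined into a contig longer than any single read, can be sketched with a toy k-mer (de Bruijn style) walk. This is only an illustration under simplifying assumptions: error-free reads, a single unbranched path, and a small k chosen for the toy data instead of 27. It is not the assembler used to produce this file, and the sequence below is invented.

```python
from collections import defaultdict


def assemble_linear(reads, k):
    """Join reads into one contig by walking an unbranched k-mer graph."""
    succ = defaultdict(set)  # (k-1)-mer -> set of following (k-1)-mers
    pred = defaultdict(set)  # (k-1)-mer -> set of preceding (k-1)-mers
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            left, right = kmer[:-1], kmer[1:]
            succ[left].add(right)
            pred[right].add(left)
    # The contig starts at the only node that nothing points to.
    starts = [node for node in list(succ) if not pred.get(node)]
    if len(starts) != 1:
        raise ValueError("toy assembler expects a single unbranched path")
    node = starts[0]
    contig = node
    seen = {node}
    while len(succ.get(node, ())) == 1:
        (nxt,) = succ[node]
        if nxt in seen:
            break  # stop on a cycle instead of looping forever
        contig += nxt[-1]  # each step extends the contig by one base
        seen.add(nxt)
        node = nxt
    return contig


# Invented 24 bp region and five 10 bp reads with staggered start positions.
region = "ATGGCGTACCTGAATCGTTAGCCA"
reads = [region[s:s + 10] for s in (0, 4, 8, 12, 14)]
contig = assemble_linear(reads, k=5)
print(len(reads[0]), len(contig))
```

In this toy case every read is 10 bp, yet the overlaps between staggered reads let the walk recover the full 24 bp region, which is the same effect that lets ~80 bp second reads yield minicontigs of several hundred bp.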