Heliconius melpomene genome assembly version 2 Dryad repository
===============================================================

This repository is an umbrella repository intended to provide a snapshot of all repositories relevant to Hmel2, the second version of the Heliconius melpomene genome assemblies, at time of publication.

The following repositories are included:

Genome.tar.gz - a snapshot of the Hmel2 distribution available from butterflygenome.org
GitHub.tar.gz - a snapshot of the bespoke code written to reassemble the genome, available from https://github.com/johnomics/Heliconius_melpomene_version_2, commit f826fd1 from Dec 10, 2015

The following additional files are included only in this Dryad repository:

Hmel_cross.Hmel1-1_primaryScaffolds.vcf.gz - variant calls for Heliconius melpomene mapping cross aligned to Hmel1-1 primary scaffolds (Hmel1-1_primaryScaffolds.fa)
Hmel_cross.Hmel_haplotype_scaffolds.vcf.gz - variant calls for Heliconius melpomene mapping cross aligned to Hmel1-1 haplotype scaffolds (Hmel_haplotype_scaffolds.fas)
Hmel_cross.linkage_map.db.gz - SQLite3 database containing marker information for linkage map (see below for table descriptions)
Hmel_cross.linkage_map.clean.db.gz - SQLite3 database containing scaffold_map and chromosome_map tables used to reassemble genome, fixing all identified linkage errors, recalculating cM values and reversing chromosomes where necessary
readdepth_gc_1kb.windows.Hmel2.adjusted.tsv - GC content and median read depths unadjusted and adjusted for GC content in 1kb windows for F1 father
Hmel1-2.fa.gz - Hmel1-1 genome updated to correct all identified misassemblies
Hmel1-2_haploid.fa.gz - Hmel1-2 genome collapsed by HaploMerger
pacbio_falcon_assembly.fa.gz - initial assembly of PacBio reads with FALCON
pacbio_haploid.fa.gz - FALCON PacBio assembly collapsed by HaploMerger
Hmel1-2_pacbio_merge.fa.gz - Hmel1-2_haploid and pacbio_haploid merged by HaploMerger
Eueides.Hmel2.vcf.gz - variant calls for Eueides cross aligned to Hmel2.fa

DATABASE TABLES
---------------

All tables present in Hmel_cross.linkage_map.db. Updated and corrected chromosome_map and scaffold_map tables used to reassemble genome available in Hmel_cross.linkage_map.clean.db.

markers - VCF SNPs converted to marker format with summary statistics, generated by scaffoldgenome.pl.
blocks - initial marker regions, generated by scaffoldgenome.pl.
cleanblocks - refined marker regions, generated by clean_blocks.pl.
mapblocks - final marker regions collapsed to Maternal and Paternal markers only, generated by build_linkage_maps.pl.
chromosome_map - final corrected markers per chromosome, generated by build_linkage_maps.pl.
scaffold_map - scaffold parts assigned to chromosomes, generated by build_linkage_maps.pl.

markers:
scaffold - name of Hmel1-1 scaffold
position - SNP position on Hmel1-1 scaffold
marker_type - assigned marker type taken from C115_marker_types.txt, or Reject
parent_gt - called parental genotypes, following order in C115_marker_types.txt. A or B: homozygous for allele A or B. H: heterozygous for alleles A and B. '.': missing genotype
parent_gqs - phred-scaled likelihoods for parents separated by colons, following order in C115_marker_types.txt
parent_dps - read depths for parents separated by colons, following order in C115_marker_types.txt
mq - mapping quality for SNP
fs - Fisher strand bias for SNP
p - p-value for root mean square test
obs - observed value for root mean square test
rms_pattern - expected value for root mean square test
phase - phase of marker (1 means genotypes have been reversed to match neighbouring markers)
pattern - raw segregation pattern for cross progeny
consensus - cleaned segregation pattern for cross progeny
error - for Rejects, reason for rejection

blocks, cleanblocks, mapblocks:
scaffold, start, end, length - Hmel1-1 scaffold block assigned to set of markers
Intercross, Maternal, Paternal - overlapping segregation patterns assigned to block

chromosome_map:
chromosome, cm - marker location on map
print - maternal marker for chromosome; as females do not recombine, this is a chromosome print
original - raw paternal marker. There may be multiple raw markers for one cm value
clean - corrected paternal marker, coloured red (A) and blue (B) to make recombinations more visible. There is one clean marker per cm value
length - base pairs assigned to this original marker

scaffold_map:
chromosome, cm - marker location on map
scaffold, start, end, length - Hmel1-1 scaffold block assigned to marker location

John Davey
johnomics@gmail.com
13 January 2016
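
As an illustration of the table descriptions above, the following is a minimal sketch of how the scaffold_map table might be queried with Python's built-in sqlite3 module. It is not part of the original pipeline. The helper names (decompress, scaffolds_on_chromosome), the column types and every value in the toy table are assumptions made for this example; only the table and column names are taken from the descriptions above. Because "end" is an SQL keyword, it is quoted in the query.

```python
import gzip
import shutil
import sqlite3


def decompress(gz_path, db_path):
    """Unpack a gzipped database file so that sqlite3 can open it."""
    with gzip.open(gz_path, "rb") as src, open(db_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def scaffolds_on_chromosome(conn, chromosome):
    """Return scaffold_map rows for one chromosome, ordered along the map."""
    # "end" is an SQL keyword, so it is quoted as an identifier.
    query = (
        'SELECT cm, scaffold, start, "end", length '
        "FROM scaffold_map WHERE chromosome = ? "
        "ORDER BY cm, scaffold, start"
    )
    return conn.execute(query, (chromosome,)).fetchall()


# Toy stand-in for the real database: column types and all values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scaffold_map (chromosome INTEGER, cm REAL, scaffold TEXT, "
    'start INTEGER, "end" INTEGER, length INTEGER)'
)
conn.executemany(
    "INSERT INTO scaffold_map VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 2.5, "toy_scaffold_B", 1, 500, 500),
        (1, 0.0, "toy_scaffold_A", 1, 1000, 1000),
        (2, 0.0, "toy_scaffold_C", 1, 200, 200),
    ],
)

rows = scaffolds_on_chromosome(conn, 1)
for row in rows:
    print(row)
```

To try this against the real data, one could first call decompress("Hmel_cross.linkage_map.clean.db.gz", "Hmel_cross.linkage_map.clean.db") and then pass sqlite3.connect("Hmel_cross.linkage_map.clean.db") to scaffolds_on_chromosome. The chromosome value may need to be a string rather than an integer, depending on how the column is actually stored.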