Data from: Museum skins enable identification of introgression associated with cytonuclear discordance
Data files
Jan 09, 2026 version files 329.51 MB
- Dryad_brachyotis_group_cytonuclear_discordance_2.zip (329.45 MB)
- README_brachyotis_group_mito-nuclear_discordance.rtf (31.06 KB)
- README.md (26.92 KB)
Abstract
Increased sampling of genomes and populations across closely related species has revealed that levels of genetic exchange during and after speciation are higher than previously thought. One obvious manifestation of such exchange is strong cytonuclear discordance, where the divergence in mitochondrial DNA (mtDNA) differs from that for nuclear genes more (or less) than expected from differences between mtDNA and nuclear DNA (nDNA) in population size and mutation rate. Given genome-scale datasets and coalescent modelling, we can now confidently identify cases of strong discordance and test specifically for historical or recent introgression as the cause. Using population sampling, combining exon capture data from historical museum specimens and recently collected tissues, we showcase how genomic tools can resolve complex evolutionary histories in the brachyotis group of rock-wallabies (Petrogale). In particular, applying population and phylogenomic approaches, we can assess the role of demographic processes in driving complex evolutionary patterns and assess the role of ancient introgression and hybridisation. We find that described species are well supported as monophyletic taxa for nDNA genes, but not for mtDNA, with cytonuclear discordance involving at least four operational taxonomic units (OTUs) across four species which diverged 183–278 kya. ABC modelling of nDNA gene trees supports introgression during or after speciation for some taxon pairs with cytonuclear discordance. Given substantial differences in body size between the species involved, this evidence for gene flow is surprising. Heterogenous patterns of introgression were identified but do not appear to be associated with chromosome differences between species. These and previous results suggest that dynamic past climates across the monsoonal tropics could have promoted reticulation among related species.
28 June 2021
GENERAL INFORMATION
Title: Museum skins enable identification of introgression associated with cytonuclear discordance
Principal Investigator: Dr Sally Potter, School of Natural Sciences, Macquarie University, Sydney, NSW, Australia
Date of data collection: 2013-2023
Geographic location of data collection: northern Australia, Kimberley region of Western Australia, Top End region of Northern Territory to just across the border into Queensland, Australia
Keywords: mito-nuclear discordance, introgression, rock-wallabies, museum skins, exon capture
Citation: Potter S, Moritz C, Piggott MP, Bragg JG, Afonso Silva AC, Bi K, McDonald-Spicer C, Turakulov R, Eldridge MDB (2023) Museum skins enable identification of introgression associated with cytonuclear discordance. Systematic Biology submitted.
DATA & FILE OVERVIEW
These data were generated to investigate the evolutionary relationships and divergence history of the brachyotis group of rock-wallabies from the Australian Monsoonal Tropics. Exon capture sequence data were generated for both modern tissues and historical museum samples and both population genomic and phylogenomic approaches used to evaluate introgression between species where mito-nuclear discordance was detected. Here we have four directories, containing 1) data used in the various analyses, 2) design of the exon capture experiment, 3) workflows used to assemble and clean the data, and 4) all supplementary material pertaining to this paper.
1. data -------------------------
File 1 Name: Metadata_sample_Petrogale_brachyotis
File 1 Description: A comma-separated (CSV) file of sample information including sample ID, Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra/) information - BioProject, BioSample and SRA number, geographical location.
Directory 1.1 Name: mitochondrial data/
Directory 1.1 Description: mitochondrial data files and summary of missing data for mitochondrial coding genes for brachyotis group
File 1.1.1 Name: Concat_mito_coding_genes_brachyotis_ALL_data.phy
File 1.1.1 Description: phylip file of mitochondrial protein coding genes for brachyotis samples
File 1.1.2 Name: Concat_mito_coding_genes_brachyotis_50%miss.phy
File 1.1.2 Description: phylip file of mitochondrial protein coding genes where samples have less than 50% missing data
File 1.1.3 Name: Missing_data_brachyotis_mtDNA.xlsx
File 1.1.3 Description: excel file summarising the number of missing alleles (N) and the percent of missing data for each mitochondrial protein coding gene for each individual
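The missing-data summary and the 50% threshold described for Files 1.1.2 and 1.1.3 can be illustrated with a minimal sketch. It assumes a simple sequential PHYLIP layout (one `name sequence` pair per line) and treats `N`, `?` and `-` as missing; both are assumptions for illustration, not a description of how the authors produced these files. The sample names and sequences are made up.

```python
# Characters counted as missing data (assumption: N, ? and - all count).
MISSING = set("N?-")


def read_phylip(text):
    """Parse a simple sequential PHYLIP alignment into {name: sequence}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_taxa = int(lines[0].split()[0])  # header line: "<n_taxa> <n_sites>"
    seqs = {}
    for ln in lines[1:1 + n_taxa]:
        name, seq = ln.split(None, 1)
        seqs[name] = seq.replace(" ", "").upper()
    return seqs


def percent_missing(seq):
    """Percentage of alignment positions that are missing in one sequence."""
    if not seq:
        return 100.0
    return 100.0 * sum(ch in MISSING for ch in seq) / len(seq)


def filter_by_missing(seqs, max_percent=50.0):
    """Keep only samples with strictly less than max_percent missing data."""
    return {n: s for n, s in seqs.items() if percent_missing(s) < max_percent}


# Hypothetical toy alignment, not taken from the dataset.
example = """3 8
sampleA ACGTACGT
sampleB ACNNNNNT
sampleC NNNNACGT
"""
seqs = read_phylip(example)
kept = filter_by_missing(seqs)
```

In this toy input only `sampleA` is retained, because `sampleC` has exactly 50% missing data and the filter keeps samples with less than 50%.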
Directory 1.2 Name: nuclear exon datasets/
Directory 1.2 Description: directory of nuclear sequence files
Directory 1.2.1 Name: phased_h0_nuclear_alignments/
Directory 1.2.1 Description: directory of phylip sequence files for each phased h0 nuclear exon used in the study, including a concatenated alignment of all exons, individuals are labelled with their index ID (refer to lib_sample_Petrogale_brachyotis)
Directory 1.2.2 Name: unphased_ambig_nuclear_alignments/
Directory 1.2.2 Description: directory of phylip sequence files for each unphased nuclear exon used in the study (ambiguity codes used), including a concatenated alignment of all exons, individuals are labelled with their index ID (refer to lib_sample_Petrogale_brachyotis)
Directory 1.3 Name: PopGenome analyses/
Directory 1.3 Description: directory of all the Tajima D coalescent simulations and observed estimates from concatenated alignments for each OTU in the brachyotis group implemented in PopGenome
Directory 1.3.1 Name: brachyotis brachyotis EK
Directory 1.3.1 Description: directory of input and output files from PopGenome for the brachyotis brachyotis EK OTU
File 1.3.1.1 Name: concat_brachEK_h0_TajimaD.phy
File 1.3.1.1 Description: nuclear phased haplotype (h0) phylip sequence file for brachyotis brachyotis EK OTU used as input for PopGenome analysis
File 1.3.1.2 Name: R Console_brachEK_tajimad.txt
File 1.3.1.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.2 Name: brachyotis brachyotis WK/
Directory 1.3.2 Description: directory of input and output files from PopGenome for the brachyotis brachyotis WK OTU
File 1.3.2.1 Name: concat_brachWK_h0_TajimaD.phy
File 1.3.2.1 Description: nuclear phased haplotype (h0) phylip sequence file for brachyotis brachyotis WK OTU used as input for PopGenome analysis
File 1.3.2.2 Name: R Console_brachWK_tajimad.txt
File 1.3.2.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.3 Name: brachyotis victoriae/
Directory 1.3.3 Description: directory of input and output files from PopGenome for the brachyotis victoriae OTU
File 1.3.3.1 Name: concat_bvic_h0_TajimaD.phy
File 1.3.3.1 Description: nuclear phased haplotype (h0) phylip sequence file for brachyotis victoriae OTU used as input for PopGenome analysis
File 1.3.3.2 Name: R Console_bvic_tajimad.txt
File 1.3.3.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.4 Name: burbidgei N/
Directory 1.3.4 Description: directory of input and output files from PopGenome for the burbidgei N OTU
File 1.3.4.1 Name: concat_burbN_h0_TajimaD.phy
File 1.3.4.1 Description: nuclear phased haplotype (h0) phylip sequence file for burbidgei N OTU used as input for PopGenome analysis
File 1.3.4.2 Name: R Console_burbN_tajimad.txt
File 1.3.4.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.5 Name: concinna canescens/
Directory 1.3.5 Description: directory of input and output files from PopGenome for the concinna canescens OTU
File 1.3.5.1 Name: concat_concan_h0_TajimaD.phy
File 1.3.5.1 Description: nuclear phased haplotype (h0) phylip sequence file for concinna canescens OTU used as input for PopGenome analysis
File 1.3.5.2 Name: R Console_concan_tajimad.txt
File 1.3.5.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.6 Name: concinna monastria
Directory 1.3.6 Description: directory of input and output files from PopGenome for the concinna monastria OTU
File 1.3.6.1 Name: concat_conmon_h0_TajimaD.phy
File 1.3.6.1 Description: nuclear phased haplotype (h0) phylip sequence file for concinna monastria OTU used as input for PopGenome analysis
File 1.3.6.2 Name: R Console_conmon_tajimad.txt
File 1.3.6.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.7 Name: wilkinsi
Directory 1.3.7 Description: directory of input and output files from PopGenome for the wilkinsi OTU
File 1.3.7.1 Name: concat_wilk_h0_TajimaD.phy
File 1.3.7.1 Description: nuclear phased haplotype (h0) phylip sequence file for wilkinsi OTU used as input for PopGenome analysis
File 1.3.7.2 Name: R Console_wilk_tajimad.txt
File 1.3.7.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
Directory 1.3.8 Name: wilkinsi GULF
Directory 1.3.8 Description: directory of input and output files from PopGenome for the wilkinsi GULF OTU
File 1.3.8.1 Name: concat_wilkGU_h0_TajimaD.phy
File 1.3.8.1 Description: nuclear phased haplotype (h0) phylip sequence file for wilkinsi GULF OTU used as input for PopGenome analysis
File 1.3.8.2 Name: R Console_wilkGU_tajimad.txt
File 1.3.8.2 Description: R output from coalescent simulations of Tajima’s D in PopGenome
File 1.3.9 Name: Coalescent_simulations_TajimasD.xlsx
File 1.3.9 Description: excel file of coalescent simulation data for Tajima’s D values sorted to evaluate 0.05 significance, comparing the observed Tajima D to the 95%. The highlighted section represents the 0.05 cutoff to test for significance. Each tab represents a different OTU analysis as outlined in Files 1.3.1-1.3.8
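The comparison of an observed Tajima's D against sorted coalescent simulations can be sketched as an empirical tail proportion. This is a minimal illustration in Python, not the PopGenome workflow recorded in the R console files; the simulated values and the use of the lower tail are assumptions made for the example.

```python
def empirical_tail_proportions(observed, simulated):
    """Return the fractions of simulated values <= and >= the observed value."""
    n = len(simulated)
    lower = sum(d <= observed for d in simulated) / n
    upper = sum(d >= observed for d in simulated) / n
    return lower, upper


def is_significantly_low(observed, simulated, alpha=0.05):
    """One-tailed check: is the observed D within the lowest alpha of simulations?"""
    lower, _ = empirical_tail_proportions(observed, simulated)
    return lower <= alpha


# Hypothetical simulated Tajima's D values (20 evenly spaced values, -1.0 to 0.9).
simulated = [i / 10 for i in range(-10, 10)]
lower, upper = empirical_tail_proportions(-1.0, simulated)
```

Sorting the simulated values and marking the 0.05 cutoff, as done in the spreadsheet, is equivalent to checking whether this tail proportion is at or below 0.05.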
Directory 1.3.10 Name: nDNA
Directory 1.3.10 Description: directory of input and output files from PopGenome for the entire nuclear haplotype dataset calculating Dxy
File 1.3.10.1 Name: Concat_brach_h0.phy
File 1.3.10.1 Description: Phylip alignment file of all brachyotis individuals for nuclear haplotypes
File 1.3.10.2 Name: R Console_Dxy_brach_nDNA.txt
File 1.3.10.2 Description: R output from Dxy analyses of brachyotis group nDNA in PopGenome
Directory 1.3.11 Name: mtDNA
Directory 1.3.11 Description: directory of input and output files from PopGenome for the entire mitochondrial dataset calculating Dxy
File 1.3.11.1 Name: 1.1.2 Concat_mito_coding_genes_brachyotis_50%miss.phy
File 1.3.11.1 Description: Phylip alignment file of all brachyotis individuals for mitochondrial data
File 1.3.11.2 Name: R Console_mtDNA_Dxy_brach.txt
File 1.3.11.2 Description: R output from Dxy analyses of brachyotis group mtDNA in PopGenome
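For readers unfamiliar with Dxy, the statistic is the average proportion of differing sites across all pairs of sequences drawn one from each of two groups. The sketch below shows that definition in plain Python. It is not the PopGenome implementation used for the files above; skipping positions with `N`, `?` or `-` in either sequence is an assumption, and the sequences are invented.

```python
from itertools import product


def pairwise_difference(a, b, missing="N?-"):
    """Count differing and comparable sites between two aligned sequences."""
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in missing or y in missing:
            continue  # assumption: ignore sites missing in either sequence
        compared += 1
        diffs += x != y
    return diffs, compared


def dxy(pop_x, pop_y):
    """Mean per-site difference over all between-population sequence pairs."""
    per_pair = []
    for a, b in product(pop_x, pop_y):
        diffs, compared = pairwise_difference(a, b)
        if compared:
            per_pair.append(diffs / compared)
    return sum(per_pair) / len(per_pair) if per_pair else float("nan")


# Hypothetical aligned haplotypes for two OTUs.
otu_1 = ["ACGT", "ACGA"]
otu_2 = ["ACCT", "ACGT"]
value = dxy(otu_1, otu_2)
```

Here the four between-group comparisons differ at 1, 0, 2 and 1 of 4 sites, giving a Dxy of 0.25.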
Directory 1.4 Name: rangeExpansion analyses/
Directory 1.4 Description: directory containing folders for each OTU with input and output files from rangeExpansion analyses
Directory 1.4.1 Name: brachbrachEK/
Directory 1.4.1 Description: directory containing input and output files for rangeExpansion analysis for brachyotis brachyotis EK OTU
File 1.4.1.1 Name: bbrachEK_onesnp.snapp
File 1.4.1.1 Description: input file for rangeExpansion analysis including the single nucleotide polymorphisms coded as homozygous 0, heterozygous 1, homozygous 2 for alternate allele (missing ?) for each individual. Sample ID matches individuals in .csv file which has locality information. Outgroups used to call ancestral versus derived allele states are labelled as OOG numbers.
File 1.4.1.2 Name: bEK_coords.csv
File 1.4.1.2 Description: a csv file used in analysis for rangeExpansion which includes sample ID, latitude, longitude, the region name and column for outgroup (0=no, 1=yes)
File 1.4.1.3 Name: sample_key_bbrachEK_snapp.txt
File 1.4.1.3 Description: a text file which is a key matching the sample ID of individuals in this study to the sample code used in rangeExpansion analysis (refer to lib_sample_Petrogale_brachyotis for sample information)
File 1.4.1.4 Name: R Console bEK_PS.txt
File 1.4.1.4 Description: a text file of the R code and analysis run using rangeExpansion for the brachyotis brachyotis EK OTU
File 1.4.1.5 Name: Rplot_brachEK_expansion.pdf
File 1.4.1.5 Description: a pdf file of the rangeExpansion output plot showing a map of the samples and the location of the source and metrics of the range expansion
Directory 1.4.2 Name: brachbrachWK/
Directory 1.4.2 Description: directory containing input and output files for rangeExpansion analysis for brachyotis brachyotis WK OTU
File 1.4.2.1 Name: bbrachWK_onesnp.snapp
File 1.4.2.1 Description: input file for rangeExpansion analysis including the single nucleotide polymorphisms coded as homozygous 0, heterozygous 1, homozygous 2 for alternate allele (missing ?) for each individual. Sample ID matches individuals in .csv file which has locality information. Outgroups used to call ancestral versus derived allele states are labelled as OOG numbers.
File 1.4.2.2 Name: bbWK_coords.csv
File 1.4.2.2 Description: a csv file used in analysis for rangeExpansion which includes sample ID, latitude, longitude, the region name and column for outgroup (0=no, 1=yes)
File 1.4.2.3 Name: sample_key_bbrachWK_snapp.txt
File 1.4.2.3 Description: a text file which is a key matching the sample ID of individuals in this study to the sample code used in rangeExpansion analysis (refer to lib_sample_Petrogale_brachyotis for sample information)
File 1.4.2.4 Name: R Console bbrachWK_PS.txt
File 1.4.2.4 Description: a text file of the R code and analysis run using rangeExpansion for the brachyotis brachyotis WK OTU
Directory 1.4.3 Name: wilkinsi/
Directory 1.4.3 Description: directory containing input and output files for rangeExpansion analysis for wilkinsi OTU
File 1.4.3.1 Name: wilk_onesnp.snapp
File 1.4.3.1 Description: input file for rangeExpansion analysis including the single nucleotide polymorphisms coded as homozygous 0, heterozygous 1, homozygous 2 for alternate allele (missing ?) for each individual. Sample ID matches individuals in .csv file which has locality information. Outgroups used to call ancestral versus derived allele states are labelled as OOG numbers.
File 1.4.3.2 Name: wilk_coords.csv
File 1.4.3.2 Description: a csv file used in analysis for rangeExpansion which includes sample ID, latitude, longitude, the region name and column for outgroup (0=no, 1=yes)
File 1.4.3.3 Name: sample_key_wilk_snapp.txt
File 1.4.3.3 Description: a text file which is a key matching the sample ID of individuals in this study to the sample code used in rangeExpansion analysis (refer to lib_sample_Petrogale_brachyotis for sample information)
File 1.4.3.4 Name: R Console_wilk_PS.txt
File 1.4.3.4 Description: a text file of the R code and analysis run using rangeExpansion for the wilkinsi OTU
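The 0/1/2/? genotype coding described for the .snapp input files can be read as the number of alternate-allele copies per individual. As a minimal sketch, assuming one code per individual at a given SNP, the alternate-allele frequency at that SNP could be computed as follows. The function name and the example genotypes are hypothetical, and this is not part of the rangeExpansion package.

```python
def alternate_allele_frequency(genotypes):
    """Frequency of the alternate allele at one SNP.

    genotypes: iterable of codes, one per individual:
      "0" homozygous reference, "1" heterozygous,
      "2" homozygous alternate, "?" missing.
    Returns None when every genotype is missing.
    """
    counts = []
    for g in genotypes:
        if g == "?":
            continue  # missing genotypes are excluded from the denominator
        if g not in ("0", "1", "2"):
            raise ValueError(f"unexpected genotype code: {g!r}")
        counts.append(int(g))
    if not counts:
        return None
    # Each diploid individual carries two allele copies.
    return sum(counts) / (2 * len(counts))


# Hypothetical genotypes for five individuals at one SNP.
freq = alternate_allele_frequency("0121?")
```

In this example four individuals are called and carry four alternate copies out of eight, so the frequency is 0.5.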
Directory 1.5 Name: demographic inference using linked selection/
Directory 1.5 Description:
File 1.5.1 Name: prepareFastaDILS.R
File 1.5.1 Description: an R script which takes fasta alignments and prepares them into input files for DILS analysis and takes a yaml template and information about the pairwise comparisons and creates yaml input files for DILS analysis
File 1.5.2 Name: template.yaml
File 1.5.2 Description: the yaml template file used in prepareFastaDILS.R to generate yaml input files for DILS analysis
Directory 1.5.3 Name: fasta files
Directory 1.5.3 Description: fasta alignment files for each exon of phased h0 haplotype data for all individuals in brachyotis group used to generate population input files for DILS analysis (used in prepareFastaDILS.R)
File 1.5.4 Name: samples_lineage.txt
File 1.5.4 Description: a text file assigning individuals to OTUs for the DILS analysis
File 1.5.5 Name: ComparisonsOfInterest.txt
File 1.5.5 Description: a text file outlining the two OTUs and the analysis type used in the DILS analysis, this is incorporated into prepareFastaDILS.R and the output files
Directory 1.5.6 Name: dils input fasta files/
Directory 1.5.6 Description: directory of fasta files used as input for dils analysis for each pairwise comparison of OTUs analysed in dils. The fasta alignment headers include the exon name | OTU name | sample ID | and haplotype information
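When working with these input files, the pipe-delimited header can be split into its four parts. The sketch below assumes headers of the form `>exon|OTU|sampleID|haplotype` with exactly four fields; the exact spacing and field values in the real files may differ, and the header used here is invented.

```python
from typing import NamedTuple


class DilsHeader(NamedTuple):
    exon: str
    otu: str
    sample_id: str
    haplotype: str


def parse_header(line):
    """Split a FASTA header into exon, OTU, sample ID and haplotype fields."""
    fields = [f.strip() for f in line.strip().lstrip(">").split("|")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(fields)}")
    return DilsHeader(*fields)


# Hypothetical header, not copied from the dataset.
header = parse_header(">exon0001 | wilkinsi | sample01 | h0")
```

Grouping records by `header.otu` then gives the per-OTU sequence sets for a pairwise comparison.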
Directory 1.5.7 Name: dils input yaml files/
Directory 1.5.7 Description: a directory of yaml input files for replicate runs for each pairwise comparison of OTUs analysed in dils. The yaml files indicate the parameter values and priors used for the analysis.
File 1.5.8 Name: runDILS_SLURM.sh
File 1.5.8 Description: a shell script used to run dils analysis on a cluster
Directory 1.5.9 Name: output/
Directory 1.5.9 Description: directory of .tar.gz output files for each replicate of the dils analysis for each pairwise comparison of OTUs. These are the demographic modelling DILS results from the paper.
File 1.5.10 Name: DILS_plottingResults.Rmd
File 1.5.10 Description: an R Markdown file which processes the output files and makes plots of the results in R
File 1.5.11 Name: DILS_plottingResults.html
File 1.5.11 Description: an html file with the actual outputs from the analysis in R for plotting of dils results (produced from DILS_plottingResults.Rmd)
Directory 1.6 Name: MIGRATE analyses/
Directory 1.6 Description: directory containing input and output files for analyses in MIGRATE used to estimate migration and theta values
File 1.6.1 Name: brachyotis_migrate_input.phy
File 1.6.1 Description: input file used for MIGRATE analyses of Petrogale brachyotis group (excluding Petrogale wilkinsi OTUs) with sequences in phylip format and in format required by MIGRATE
File 1.6.2 Name: brachyotis_migrate_output.pdf
File 1.6.2 Description: output file generated from MIGRATE analysis with details of input parameters and output values for Petrogale brachyotis group excluding Petrogale wilkinsi OTUs
File 1.6.3 Name: wilkinsi_migrate_input.phy
File 1.6.3 Description: input file used for MIGRATE analyses of Petrogale wilkinsi OTUs with sequences in phylip format and in format required by MIGRATE
File 1.6.4 Name: wilkinsi_migrate_output.pdf
File 1.6.4 Description: output file generated from MIGRATE analysis with details of input parameters and output values for Petrogale wilkinsi OTUs
File 1.6.5 Name: bv_w_migrate_input.phy
File 1.6.5 Description: input file used for MIGRATE analyses of Petrogale brachyotis victoriae and P. wilkinsi Top End OTUs with sequences in phylip format and in format required by MIGRATE
File 1.6.6 Name: bv_w_out1.pdf
File 1.6.6 Description: output file generated from MIGRATE analysis with details of input parameters and output values for Petrogale brachyotis victoriae and P. wilkinsi Top End OTUs
Directory 1.7 Name: Chromosomes/
Directory 1.7 Description: directory containing directories of nuclear loci separated by known chromosome location, with phylip sequences for each locus mapped to that chromosome. Also includes directories for rearranged (R) and non-rearranged (NR) loci based on known chromosomal variation from karyotypes. Outputs of analyses are also stored in a directory.
Directory 1.7.1 Name: Chr1/
Directory 1.7.1 Description: directory of sequence alignments in phylip format which were mapped to chromosome 1 of the Petrogale penicillata genome
Directory 1.7.2 Name: Chr2/
Directory 1.7.2 Description: directory of sequence alignments in phylip format which were mapped to chromosome 2 of the Petrogale penicillata genome
Directory 1.7.3 Name: Chr3/
Directory 1.7.3 Description: directory of sequence alignments in phylip format which were mapped to chromosome 3 of the Petrogale penicillata genome
Directory 1.7.4 Name: Chr4/
Directory 1.7.4 Description: directory of sequence alignments in phylip format which were mapped to chromosome 4 of the Petrogale penicillata genome
Directory 1.7.5 Name: Chr5/
Directory 1.7.5 Description: directory of sequence alignments in phylip format which were mapped to chromosome 5 of the Petrogale penicillata genome
Directory 1.7.6 Name: Chr6/
Directory 1.7.6 Description: directory of sequence alignments in phylip format which were mapped to chromosome 6 of the Petrogale penicillata genome
Directory 1.7.7 Name: Chr7/
Directory 1.7.7 Description: directory of sequence alignments in phylip format which were mapped to chromosome 7 of the Petrogale penicillata genome
Directory 1.7.8 Name: Chr8/
Directory 1.7.8 Description: directory of sequence alignments in phylip format which were mapped to chromosome 8 of the Petrogale penicillata genome
Directory 1.7.9 Name: Chr9/
Directory 1.7.9 Description: directory of sequence alignments in phylip format which were mapped to chromosome 9 of the Petrogale penicillata genome
Directory 1.7.10 Name: Chr10/
Directory 1.7.10 Description: directory of sequence alignments in phylip format which were mapped to chromosome 10 of the Petrogale penicillata genome
Directory 1.7.11 Name: ChrX/
Directory 1.7.11 Description: directory of sequence alignments in phylip format which were mapped to the X chromosome of the Petrogale penicillata genome
Directory 1.7.12 Name: NR_Chr/
Directory 1.7.12 Description: a concatenated sequence alignment in phylip format of exons on non-rearranged chromosomes
Directory 1.7.13 Name: R_Chr/
Directory 1.7.13 Description: a concatenated sequence alignment in phylip format of exons on rearranged chromosomes
Directory 1.7.14 Name: Summary_outputs/
Directory 1.7.14 Description: directory of R console outputs from PopGenome analysis of Dxy in R for each chromosome as well as an excel file summarising the average Dxy between OTUs
File 1.7.14.1 Name: OTU_Dxy_RvNR.xlsx
File 1.7.14.1 Description: an Excel file summarising the average Dxy between OTUs for rearranged (R) and non-rearranged (NR) loci. The OTUs are abbreviated as follows: wGR (wilkinsi GROOTE), w (wilkinsi Top End), wGU (wilkinsi GULF), bS (burbidgei South), bN (burbidgei North), cc (concinna canescens), cm (concinna monastria), bbWK (brachyotis brachyotis West Kimberley), bbEK (brachyotis brachyotis East Kimberley), and bv (brachyotis victoriae).
Directory 1.7.14.2 Name: Outputs_PopGenome/
Directory 1.7.14.2 Description: directory of R console outputs from PopGenome analysis of Dxy for each chromosome between OTUs
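When reading the Dxy summaries, the OTU abbreviations listed for OTU_Dxy_RvNR.xlsx can be kept in a small lookup table. The mapping below is copied from that description; the helper function is a hypothetical convenience, not part of the archived scripts.

```python
# Abbreviation -> full OTU name, as listed for OTU_Dxy_RvNR.xlsx.
OTU_ABBREVIATIONS = {
    "wGR": "wilkinsi GROOTE",
    "w": "wilkinsi Top End",
    "wGU": "wilkinsi GULF",
    "bS": "burbidgei South",
    "bN": "burbidgei North",
    "cc": "concinna canescens",
    "cm": "concinna monastria",
    "bbWK": "brachyotis brachyotis West Kimberley",
    "bbEK": "brachyotis brachyotis East Kimberley",
    "bv": "brachyotis victoriae",
}


def expand_otu(abbreviation):
    """Return the full OTU name, or raise KeyError for an unknown code."""
    return OTU_ABBREVIATIONS[abbreviation]


full_name = expand_otu("bbEK")
```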
Directory 1.8 Name: SVDquartets/
Directory 1.8 Description: directory containing data files and code for SVDquartet analyses to estimate a species tree and divergence times in PAUP
File 1.8.1 Name: brach_wOG_concat_ambig_SVD.nex
File 1.8.1 Description: input file used for SVDquartets analyses of Petrogale brachyotis group with sequences in nexus format
File 1.8.2 Name: brach_SVD_OTUs.tre
File 1.8.2 Description: output file from SVDquartets species tree analysis of Petrogale brachyotis group, also used as input for qage divergence analyses in PAUP
File 1.8.3 Name: code_svdquartets_qage.txt
File 1.8.3 Description: a text file showing the results from SVDquartets species tree analysis in coalescent units and some equations used to convert this to real divergence dates in years for Petrogale brachyotis group
File 1.8.4 Name: divergence_dating_qage_brach.xlsx
File 1.8.4 Description: an excel file showing the divergence outputs from SVDquartets qage analysis and the conversion to time in years based on a mutation rate and generation time
File 1.9 Name: brachyotis_individual_He.csv
File 1.9 Description: CSV file containing the individual heterozygosity estimates from nuclear exon data for all individuals in brachyotis study
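One common way to estimate individual heterozygosity from unphased alignments is to count two-base IUPAC ambiguity codes as heterozygous sites and divide by the number of called sites. The sketch below illustrates that idea only; whether the values in brachyotis_individual_He.csv were calculated this way is an assumption, and the example sequence is invented.

```python
# Two-base IUPAC ambiguity codes, each representing a heterozygous site.
HET_CODES = set("RYSWKM")
# Called sites: unambiguous bases plus heterozygous codes (N, ?, - are excluded).
CALLED = set("ACGT") | HET_CODES


def individual_heterozygosity(seq):
    """Proportion of called sites that carry a heterozygous ambiguity code."""
    seq = seq.upper()
    called = sum(ch in CALLED for ch in seq)
    het = sum(ch in HET_CODES for ch in seq)
    return het / called if called else float("nan")


# Hypothetical unphased sequence: one heterozygous site (R) among four called sites.
he = individual_heterozygosity("ACRTNN--")
```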
-------------------------------------
2. design -----------------------
Information on the sequence capture kit
# Transcriptome sequence of Petrogale xanthopus used for target identification is available in Dryad (doi: 10.5061/dryad.5606t)
File 1 Name: targetExons.fa
File 1 Description: the list of target exons used for probe design
-------------------------------------
3. workflow --------------------
# folder of perl scripts used to clean and assemble the raw sequence data
# External dependencies include bowtie2, samtools and GATK
Directory 3.1 Name: clean/
Directory 3.1 Description: script used to clean reads (adaptor/duplicate/contamination removal) and trim them
File 1 Name: scrubReads.pl
File 1 Description: copy of script used for cleaning reads, see also https://github.com/MVZSEQ, SCPP directory
Directory 3.2 Name: assembly/
Directory 3.2 Description: scripts used to assemble exon capture data using a series of Perl scripts that are linked with shell wrapper scripts. It accepts three main datasets: target exon sequences, the reference proteome used during annotation, and captured sequence reads from each sample. Pipeline details can be found at https://github.com/jasongbragg/exon-capture-phylo/
Directory 3.2.1 Name: pl/
Directory 3.2.1 Description: perl scripts for assembly workflow
Directory 3.2.2 Name: sh/
Directory 3.2.2 Description: shell scripts for assembly workflow
File 1 Name: exon_capture_phylo.sh
File 1 Description: main script for assembly workflow, run the exon_capture_phylo.sh script to run the assembly and bash scripts in the sh/ and pl/ directory (3.2.1 and 3.2.2)
File 2 Name: rw.all.config
File 2 Description: example configuration script
Directory 3.3 Name: GATK_docker_info/
Directory 3.3 Description: folder containing information about the GATK docker # this docker is available at https://hub.docker.com/r/trust1/gatk
File 1 Name: Overview_GATK_docker.txt
File 1 Description: text file outlining the GATK docker, instructions and background
File 2 Name: callingpipe_GATK.pl
File 2 Description: perl script used to run GATK from the docker
Directory 3.4 Name: SNP_filtering/
Directory 3.4 Description: folder containing code and readme files for filtering SNP dataset
File 1 Name: process.vcf.1.sh
File 1 Description: shell script used to filter SNPs by calling vcftools. It also contains calls to scripts for comparing genotypes of technical replicates, largely using R code from the r/ directory (3.4.1)
Directory 3.4.1 Name: r/
Directory 3.4.1 Description: contains r code for opening vcf files, and comparing technical replicates
#Scripts in the workflow directory are provided here primarily for archival purposes. For updated versions, see:
https://github.com/jasongbragg/exon-capture-phylo
https://github.com/CGRL-QB3-UCBerkeley/denovoTargetCapturePhylogenomics
https://github.com/MozesBlom/EAPhy
4. Supplementary Materials --------------------
# folder of supplementary figures, tables and materials and methods used in this paper
Directory 4.1 Name: Supplementary materials and methods/
Directory 4.1 Description: Directory for word file of supplementary materials and methods
4.1.1 Name: Supplementary Materials and Methods
4.1.1 Description: word file outlining the detailed materials and methods to go together with the main text of the paper, including the figure and table legends of supplementary material.
Directory 4.2 Name: SuppFigs/
Directory 4.2 Description: Directory for pdf of supplementary figures outlined in the paper
4.2.1 Name: Supp_Figs_1-7
4.2.1 Description: pdf file of supplementary figures of the paper in order from one to seven
Directory 4.3 Name: SuppTables/
Directory 4.3 Description: Directory for supplementary tables outlined in the paper in comma-separated (CSV) format
4.3.1 Name: Supp_Table1
4.3.1 Description: CSV file of supplementary table 1 of the paper outlining sample information including sample ID, OTU, location, latitude and longitude, and the year the sample was collected. The Sequence Read Archive (SRA) number is also given to obtain raw sequencing data from NCBI.
4.3.2 Name: Supp_Table2
4.3.2 Description: CSV file of supplementary table 2 of the paper outlining the missing data for each mitochondrial gene and the total across the 13 loci analysed
4.3.3 Name: Supp_Table3
4.3.3 Description: CSV file of supplementary table 3 of the paper outlining the divergence dates between lineages
4.3.4 Name: Supp_Table4
4.3.4 Description: CSV file of supplementary table 4 of the paper outlining the statistics of the mitochondrial tree comparisons
4.3.5 Name: Supp_Table5
4.3.5 Description: CSV file of supplementary table 5 of the paper outlining the theta and migration values from the MIGRATE analysis
4.3.6 Name: Supp_Table6
4.3.6 Description: CSV file of supplementary table 6 of the paper outlining the Tajima D statistics for each lineage
This dataset includes ~1000 nuclear loci from a custom-designed exon capture approach (SeqCap EZ Developer Library; Roche NimbleGen), designed from a yellow-footed rock-wallaby (Petrogale xanthopus) transcriptome. The raw sequencing data were processed following the workflow of Singhal (2013), and scripts for this pipeline are available from 10.5061/dryad.7c99f (see Supplementary Materials and Methods here for more detail). The dataset includes samples from both modern and historical specimens, which were processed in different pipelines. Modern samples were assembled de novo from the cleaned sequencing reads following the workflow described in Bragg et al. (2016). The historical samples were assembled using custom scripts (https://github.com/CGRL-QB3-UCBerkeley/denovoTargetCapturePhylogenomics) and published methods (Bi et al. 2012; Portik et al. 2016; refer to Supplementary Materials and Methods here for more detail). Data were finally aligned and filtered using the EAPhy (v1.2; Blom 2015) pipeline.
Mitogenomes were assembled following the methods of Hahn et al. (2013) using a docker version of MITObim (https://github.com/chrishah/MITObim). Data were generated using one of two approaches: (1) mapped to the Osphranter robustus mitochondrial genome (Janke et al. 1997), or (2) reconstructed from mitochondrial ND2, COI and Cytb seeds from previously sequenced individuals (Potter et al. 2012, 2014). Final mitochondrial genome assemblies were then aligned using Geneious Prime as well as MAFFT (Katoh et al. 2002) and edited to protein-coding regions of the genome (~11,500 bp).
The files in this repository are text, Word, Excel or PDF files, which most common software can open or view. The sequence alignments (e.g., .phy or .fasta) are plain text and can be opened in any text editor.
- Potter, Sally; Moritz, Craig; Piggott, Maxine P et al. (2024). Museum Skins Enable Identification of Introgression Associated with Cytonuclear Discordance. Systematic Biology. https://doi.org/10.1093/sysbio/syae016
