Genetic legacies of mega-landslides: Cycles of isolation and contact across flank collapses in an oceanic island

Cite this dataset

Noguerales, Víctor et al. (2024). Genetic legacies of mega-landslides: Cycles of isolation and contact across flank collapses in an oceanic island [Dataset]. Dryad. https://doi.org/10.5061/dryad.0cfxpnw90

Abstract

Catastrophic flank collapses are recognised as important drivers of insular biodiversity dynamics, through the disruption of species ranges and subsequent allopatric divergence. However, little empirical data supports this conjecture, with their evolutionary consequences remaining poorly understood. Using genome-wide data within a population genomics and phylogenomics framework, we evaluate how mega-landslides have impacted evolutionary and demographic history within a species complex of weevils (Curculionidae) within the Canary Island of Tenerife. We reveal a complex genomic landscape, within which individuals of single ancestry were sampled in areas characterised by long-term geological stability, relative to the timing of flank collapses. By contrast, individuals of admixed ancestry were almost exclusively sampled within the boundaries of flank collapses. Estimated divergence times among ancestral populations aligned with the timings of mega-landslide events. Our results provide the first evidence for a cyclical dynamic of range fragmentation and secondary contact across flank collapse landscapes, with support for a model where this dynamic is mediated by Quaternary climate oscillations. The context within which we reveal climate and topography to interact cyclically through time to shape the geographic structure of genetic variation, together with related recent work, highlights the importance of topoclimatic phenomena as an agent of diversification within insular invertebrates.

README

This README file was generated on 2024-03-06 by Víctor Noguerales.

GENERAL INFORMATION

1. Title of Dataset: Genetic legacies of mega-landslides: Cycles of isolation and contact across flank collapses in an oceanic island

2. Author Information

Authors and institution for correspondence:

Instituto de Productos Naturales y Agrobiología (IPNA-CSIC), San Cristóbal de La Laguna, Canary Islands, Spain

Víctor Noguerales, email: victor.noguerales@csic.es, https://orcid.org/0000-0003-3185-778X

Brent C. Emerson, email: bemerson@ipna.csic.es, https://orcid.org/0000-0003-4067-9858

3. Date of data collection (single date, range, approximate date): 2014-2017

4. Geographic location of data collection: Tenerife and Gran Canaria, Canary Islands, Spain

5. Information about funding sources that supported the collection of the data:  This work was supported by the Ministry of Economy and Competitiveness (MINECO) through grants CGL2013‐42589‐P, CGL2017‐85718‐P and PID2020-116788GB-I00, co‐financed by FEDER.

 

SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: CC0 1.0 Universal (CC0 1.0) Public Domain

2. Links to publications that cite or use the data: Noguerales, V., Arjona, Y., García-Olivares, V., Machado, A., López, H., Patiño, J.  & B.C. Emerson (2024). Genetic legacies of mega-landslides: Cycles of isolation and contact across flank collapses in an oceanic island. Molecular Ecology.

3. Links to other publicly accessible locations of the data: None

4. Links/relationships to ancillary data sets: None

5. Was data derived from another source? No

                  A. If yes, list source(s): NA

6. Recommended citation for this dataset:  Noguerales, V., Arjona, Y., García-Olivares, V., Machado, A., López, H., Patiño, J.  & B.C. Emerson (2024). Data from: Genetic legacies of mega-landslides: Cycles of isolation and contact across flank collapses in an oceanic island. Dryad Digital Repository. https://doi.org/10.5061/dryad.0cfxpnw90

 

DATA & FILE OVERVIEW

1. File List:

1) 01_Structure_PCA.zip

2) 02_Nei_Ho.zip

3) 03_RAXML.zip

4) 04_SVDquartets.zip

5) 05_SNAPP.zip

6) 06_BPP.zip

7) 07_Stairwayplot.zip

8) 08_Fastsimcoal2.zip

 

2. Relationship between files, if important: None

3. Additional related data collected that was not included in the current data package: None

4. Are there multiple versions of the dataset? No

                  A. If yes, name of file(s) that was updated: NA

                                   i. Why was the file updated? NA

                                   ii. When was the file updated? NA

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 01_Structure_PCA.zip

 The ZIP folder "01_Structure_PCA.zip" contains two files in STRUCTURE format (.str) used for running both Structure and PCA.

 File “01a_Structure_PCA_allindividuals.str” contains the full dataset of 126 individuals across Tenerife, including 6018 unlinked SNPs. Individuals are coded in rows (2 rows per individual), with each column representing an unlinked SNP. Missing data is coded as “-9”.

 File “01b_Structure_onlyAnaga.str” contains those 11 individuals sampled in the Anaga peninsula, including 8758 unlinked SNPs. Individuals are coded in rows (2 rows per individual), with each column representing an unlinked SNP. Missing data is coded as “-9”.
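The layout described above (two rows per individual, one column per SNP, missing data as “-9”) can be read with a few lines of Python. This is only a minimal sketch: it assumes whitespace-separated fields, no header row of marker names, and a single leading label column (the hypothetical `n_meta_cols` parameter should be adjusted if the files carry extra columns such as a population code). The function names are illustrative and not part of the dataset.

```python
MISSING = "-9"  # missing-data code stated in the README


def read_structure(lines, n_meta_cols=1):
    """Group STRUCTURE-format rows by individual label.

    n_meta_cols is an assumption: the number of leading non-genotype
    columns (individual label first, then any optional extra columns).
    """
    individuals = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        individuals.setdefault(fields[0], []).append(fields[n_meta_cols:])
    for name, rows in individuals.items():
        if len(rows) != 2:
            raise ValueError(f"{name}: expected 2 rows, found {len(rows)}")
    return individuals


def missing_fraction(rows):
    """Fraction of SNPs at which either allele row holds the missing code."""
    first, second = rows
    n_missing = sum(a == MISSING or b == MISSING for a, b in zip(first, second))
    return n_missing / len(first)


# Toy input, not taken from the dataset.
toy = [
    "ind1 1 2 -9 4",
    "ind1 1 3 -9 4",
    "ind2 2 2 1 -9",
    "ind2 2 2 1 4",
]
data = read_structure(toy)
for name, rows in data.items():
    print(name, missing_fraction(rows))
```

With a real file, the same call would be `read_structure(open(path))`. A header row of marker names, if present, would trigger the two-rows check and would need to be skipped first.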

 

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 02_Nei_Ho.zip

 The ZIP folder "02_Nei_Ho.zip" contains the input file in VCF format for the full dataset of 126 individuals across Tenerife. This input file is used to estimate relatedness and observed heterozygosity in VCFTOOLS, and Nei’s genetic distances as implemented in the R package StAMPP, as detailed in Material and Methods in the manuscript.
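To illustrate what per-individual observed heterozygosity means for a VCF file like this one, the sketch below counts heterozygous calls among non-missing diploid genotypes. It is not the VCFTOOLS implementation and is not taken from the manuscript. It assumes tab-separated VCF text in which GT is the first FORMAT subfield, and the function name is hypothetical.

```python
import re


def observed_heterozygosity(vcf_lines):
    """Return {sample: proportion of heterozygous calls among called genotypes}."""
    samples, het, called = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            het = [0] * len(samples)
            called = [0] * len(samples)
            continue
        for i, entry in enumerate(fields[9:]):
            genotype = entry.split(":")[0]  # assumes GT is the first subfield
            alleles = re.split(r"[/|]", genotype)
            if len(alleles) != 2 or "." in alleles:
                continue  # missing or non-diploid call
            called[i] += 1
            if alleles[0] != alleles[1]:
                het[i] += 1
    return {
        s: (h / c if c else float("nan"))
        for s, h, c in zip(samples, het, called)
    }


# Toy input, not taken from the dataset.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tind1\tind2",
    "loc1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0",
    "loc2\t5\t.\tC\tT\t.\tPASS\t.\tGT\t1/1\t./.",
    "loc3\t7\t.\tG\tA\t.\tPASS\t.\tGT\t0/1\t0/1",
]
print(observed_heterozygosity(toy_vcf))
```

Sites with a missing genotype are excluded from that individual's denominator, so individuals with different amounts of missing data remain comparable.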

 

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 03_RAXML.zip

 The ZIP folder "03_RAXML.zip" contains a file in PHYLIP format including the full dataset of 126 individuals across Tenerife and the 5 individuals from Gran Canaria considered as outgroups. This input file is used to reconstruct phylogenetic relationships among individuals in RAXML. Individuals are coded in rows (1 row per individual), with each subsequent column representing a SNP. Missing data is coded as “N”.

 

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 04_SVDquartets.zip

The ZIP folder "04_SVDquartets.zip" contains a file in NEXUS format including the subset of 36 individuals from Tenerife, and the 5 individuals from Gran Canaria considered as outgroups, for a total of 10986 unlinked SNPs. This input file is used to reconstruct phylogenetic relationships among genetic groups in SVDQUARTETS. Individuals are coded in rows (1 row per individual), with each subsequent column representing an unlinked SNP. Missing data is coded as “N”.

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 05_SNAPP.zip

The ZIP folder "05_SNAPP.zip" contains a file in NEXUS format including the subset of 36 individuals from Tenerife, for a total of 6018 unlinked SNPs. This input file is used to reconstruct phylogenetic relationships among genetic groups in SNAPP. Individuals are coded in rows (1 row per individual), with each subsequent column representing an unlinked SNP. Missing data is coded as “?”.

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 06_BPP.zip

 The ZIP folder "06_BPP.zip" contains four sequence data files in BPP format including the subset of 36 individuals from Tenerife to estimate the timing of divergence among genetic groups in BPP. Each sequence file is composed of 5000 randomly chosen sequences. Sample codes are detailed in the file “IMap_file.txt”. Parameters used in BPP are described in the file “A00.ctl”.

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 07_Stairwayplot.zip

The ZIP folder "07_Stairwayplot.zip" contains blueprint files, one per genetic group, to estimate changes in effective population size (Ne) over time in STAIRWAYPLOT2. These blueprint files contain all the information required to run STAIRWAYPLOT2, including parameters used and SFS information for each genetic group.

#############################################################################

 

DATA-SPECIFIC INFORMATION FOR: 08_Fastsimcoal2.zip

 The ZIP folder "08_Fastsimcoal2.zip" contains the input files for demographic analyses in FASTSIMCOAL2. For each of the alternative models, three different files are provided. The .tpl files contain the information for model specification in terms of migration matrices and historical events. The .est files contain priors and rules information for each of the parameters specified in the respective .tpl file. Finally, the SFS is contained in the respective “MSFS.obs” file.

#############################################################################

 

Methods

Representative geographical sampling from the Tenerife species of the L. tessellatus complex was achieved by complementing previous sampling from Faria et al. (2016) and García-Olivares et al. (2017) with 74 specimens from 61 new localities. This additional sampling effort gave rise to a total of 126 individuals from 102 sites in Tenerife (Table S1). We also included 5 individuals from the L. tessellatus complex, belonging to a sister clade from the nearby island of Gran Canaria (García-Olivares et al., 2019), as an outgroup.

We extracted DNA using the Qiagen DNeasy Blood & Tissue kit following the manufacturer’s instructions. DNA was processed using the double-digestion restriction-site associated DNA sequencing protocol (ddRADseq, Peterson et al., 2012) as described in Mastretta-Yanes et al. (2015) and García-Olivares et al. (2019). In brief, DNA was digested with the restriction enzymes MseI and EcoRI (New England Biolabs, Ipswich, MA, USA). Genomic libraries were pooled at equimolar ratios, size-selected for fragments of 200-250 base pairs (bp), and then sequenced in a single-end 100-bp lane on an Illumina HiSeq2500 platform (Lausanne Genomic Technologies Facility, University of Lausanne, Switzerland).

We first used FASTQC version 0.11.7 (Andrews, 2010) to quality check raw reads. Then, raw sequences were demultiplexed, quality filtered, and de novo assembled using IPYRAD version 0.9.81 (Eaton & Overcast, 2020). Only reads with unambiguous barcodes were retained (max_barcode_mismatch) and a stricter filter was applied to remove Illumina adapter contamination (filter_adapters). We converted base calls with a Phred score <20 into Ns and discarded reads with >5 Ns (max_low_qual_bases). Afterward, we clustered the retained reads within and across samples using a sequence similarity threshold of 85% (clust_threshold) and discarded clusters with a coverage depth of less than 5 (mindepth_majrule). Resulting loci that were shorter than 35 bp (filter_min_trim_len), contained one or more heterozygous sites shared across more than 50% of individuals (max_shared_Hs_locus), or showed more than 20% polymorphic sites (max_SNPs_locus) were discarded. In a final filtering step, we only retained loci that were present in at least 80% of the samples (min_samples_locus), which yielded a total of 4987 and 6018 unlinked SNPs when including and excluding the outgroup, respectively. Parameter values in IPYRAD were chosen following the results of the sensitivity analyses conducted by García-Olivares et al. (2019) for the Laparocerus tessellatus species complex.
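The read-level quality filter described above (base calls with Phred <20 converted to N, reads with >5 Ns discarded) can be sketched as follows. IPYRAD performs this step internally, so this is only an illustration of the rule, not IPYRAD code. It assumes FASTQ quality strings in Phred+33 encoding, and the function names are hypothetical.

```python
def mask_low_quality(seq, qual, min_phred=20, offset=33):
    """Replace base calls whose Phred score is below min_phred with 'N'.

    offset=33 assumes Phred+33 quality encoding.
    """
    return "".join(
        base if ord(q) - offset >= min_phred else "N"
        for base, q in zip(seq, qual)
    )


def keep_read(masked_seq, max_n=5):
    """Keep a read only if it holds at most max_n Ns after masking."""
    return masked_seq.count("N") <= max_n


# Toy read, not taken from the dataset: 'I' encodes Phred 40,
# '5' encodes Phred 20 and '#' encodes Phred 2 under Phred+33.
seq = "ACGTAC"
qual = "II#I5I"
masked = mask_low_quality(seq, qual)
print(masked, keep_read(masked))
```

A base with a score of exactly 20 is kept, since only scores below the threshold are masked. Ns already present in the raw read count toward the limit of 5.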

Funding

Ministry of Economy, Industry and Competitiveness, Award: CGL2013‐42589‐P

Ministry of Economy, Industry and Competitiveness, Award: CGL2017‐85718‐P

Ministry of Economy, Industry and Competitiveness, Award: PID2020-116788GB-I00

FEDER