Dryad

Data from: Past volcanic activity predisposes an endemic threatened seabird to negative anthropogenic impacts

Cite this dataset

Teixeira, Helena et al. (2024). Data from: Past volcanic activity predisposes an endemic threatened seabird to negative anthropogenic impacts [Dataset]. Dryad. https://doi.org/10.5061/dryad.s1rn8pkfb

Abstract

Humans are regularly cited as the main driver of current biodiversity extinction, but the impact of historic volcanic activity is often overlooked. Pre-human evidence of wildlife abundance and diversity is essential for disentangling anthropogenic impacts from natural events. Réunion Island, with its intense and well-documented volcanic activity, endemic biodiversity, long history of isolation and recent human colonization, provides an opportunity to disentangle these processes. We track past demographic changes of a critically endangered seabird, the Mascarene petrel Pseudobulweria aterrima, using genome-wide SNPs. Coalescent modeling suggested that a large ancestral population underwent a substantial population decline in two distinct phases, ca. 125,000 and 37,000 years ago, coinciding with periods of major eruptions of Piton des Neiges. Subsequently, the ancestral population was fragmented into the two known colonies, ca. 1,500 years ago, following eruptions of Piton de la Fournaise. In the last century, both colonies declined significantly due to anthropogenic activities, and although the species was initially considered extinct, it was rediscovered in the 1970s. Our findings suggest that the current conservation status of wildlife on volcanic islands should first be assessed as a legacy of historic volcanic activity, and thereafter in light of the increasing anthropogenic impacts, which may ultimately drive species towards extinction.

README

This README file was generated by Helena Teixeira.

GENERAL INFORMATION

  1. Title of Dataset: Past volcanic activity predisposes an endemic threatened seabird to negative anthropogenic impacts

  2. Corresponding author information:

Helena Teixeira
UMR ENTROPIE - Écologie marine Tropicale des Océans Pacifique et Indien
Faculté des Sciences et Technologies
Université de la Réunion
15 Av. René Cassin CS 92003
97744 Saint Denis Cédex 9
La Réunion, France
Email: helena-marisa.osorio-teixeira@univ-Reunion.fr

  3. Date of data collection: 2008 - 2018

  4. Geographic location of data collection: Réunion Island, France

  5. Information about funding sources that supported the collection of the data, data analyses, and computing and storage resources: Toulouse Occitanie (Bioinfo Genotoul, doi: 10.15454/1.5572369328961167E12), FEDER Smac (2020-2022, N° RE0022954), LIFE + Petrels (grant number: LIFE13 BIO/FR/000075), Bertarelli Foundation as part of the Bertarelli Programme in Marine Science (Project 822916)

  6. Recommended citation for this dataset:
    Teixeira, H. et al. Past volcanic activity predisposes an endemic threatened seabird to negative anthropogenic impacts. Dryad Digital Repository. https://doi.org/10.5061/dryad.s1rn8pkfb

DATA & FILE OVERVIEW

Pseudobulweria genotypes

Pseudobulweria aterrima raw genotype file generated by Diversity Arrays Technology (DArT Pty Ltd, Canberra) using the DArTseq protocol.
DNA sequences were aligned to the Calonectris_borealis_v1 reference genome.

Report_DPsbu20-5021_SNP_mapping_1.csv

Pseudobulweria metadata

Specimen metadata to accompany the raw genotype data file when it is read into a genlight object using the function gl.read.dart in the R package dartR.
Pseudobulweria_metadata.csv

Abbreviations used:
pop = population; RIR = Rivière des Remparts; RDC = Rond Des Chevrons; GB = light-grounded birds; CA = artificial colony; repro_season = year of reproductive season; M = male; F = female; lat = latitude; lon = longitude.
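
For readers working outside R, the abbreviations above can double as a quick sanity check on the metadata file. The following is a minimal Python sketch, not part of the authors' workflow. It assumes the CSV has columns named `id`, `pop` and `sex`; only `pop` appears in the abbreviation list, so `id` and `sex` are assumptions to adjust against the real header. The inline rows are invented for illustration.

```python
import csv
import io

# Codes taken from the abbreviation list in this README.
POP_CODES = {
    "RIR": "Rivière des Remparts",
    "RDC": "Rond Des Chevrons",
    "GB": "light-grounded birds",
    "CA": "artificial colony",
}
SEX_CODES = {"M": "male", "F": "female"}


def check_metadata(handle, pop_col="pop", sex_col="sex", id_col="id"):
    """Return a list of (row_id, problem) for values outside the documented codes.

    Column names are assumptions; adjust them to match the real file.
    Empty cells are treated as missing data and are not reported.
    """
    problems = []
    for row in csv.DictReader(handle):
        row_id = row.get(id_col, "?")
        pop = (row.get(pop_col) or "").strip()
        sex = (row.get(sex_col) or "").strip()
        if pop and pop not in POP_CODES:
            problems.append((row_id, f"unknown pop code: {pop}"))
        if sex and sex not in SEX_CODES:
            problems.append((row_id, f"unknown sex code: {sex}"))
    return problems


# Invented example rows (not real specimens), standing in for
# open("Pseudobulweria_metadata.csv", newline="", encoding="utf-8").
sample = io.StringIO(
    "id,pop,sex\n"
    "bird_01,RIR,M\n"
    "bird_02,RDC,F\n"
    "bird_03,XYZ,\n"
)
issues = check_metadata(sample)
print(issues)
```

To run the check on the real file, pass an open file handle in place of `sample` and set the column-name arguments to match the actual header.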

Genomic datasets

Genomic datasets used for all the downstream analyses (dataset 1 – 4) in the genlight format. For details about each dataset see supplementary Text S1 and Fig. S8 (Teixeira et al. 2024; Scientific Reports).
Pseudobulweria_dataset1.Rdata
Pseudobulweria_dataset2.Rdata
Pseudobulweria_dataset3.Rdata
Pseudobulweria_dataset4.Rdata

fastsimcoal2 files

Template file (.tpl) for fastsimcoal2 describing the best demographic model for Pseudobulweria aterrima (M7; “ancient & recent bottlenecks”), and the estimation file (.est) with the respective parameter distributions.
Pseudobulweria_M7.tpl
Pseudobulweria_M7.est

README

Description of files.

#########################################################################

DATA-SPECIFIC INFORMATION FOR: Report_DPsbu20-5021_SNP_mapping_1.csv

  1. File description: SNP 1 Row Mapping Format: "0" = Reference allele homozygote, "1" = SNP allele homozygote, "2" = heterozygote, and "-" = double null/null allele homozygote (absence of fragment with SNP in genomic representation)

  2. Variable List:

  • AlleleID: Unique identifier for the sequence in which the SNP marker occurs

  • CloneID: Unique identifier of the sequence tag

  • AlleleSequenceRef / AlleleSequenceSnp: The sequence of the Reference allele is in the Ref row, the sequence of the SNP allele in the SNP row

  • TrimmedSequenceRef / TrimmedSequenceSnp: Same as the full sequence, but with adapters removed in short marker tags

  • Chrom_Calonectris_borealis_v1: Contig with the best alignment of marker/tag to the Calonectris_borealis_v1 reference genome.
    Missing data = contigs that were not aligned to the Calonectris_borealis_v1 reference genome (% identity < 70).

  • ChromPosTag_Calonectris_borealis_v1: Position on contig with the best alignment of marker/tag to the Calonectris_borealis_v1 reference genome

  • ChromPosSnp_Calonectris_borealis_v1: Calculated position of the SNP for best aligned marker on a contig to the Calonectris_borealis_v1 reference genome

  • AlnCnt_Calonectris_borealis_v1: Total count of aligning markers/tags with selection criteria described below

  • AlnEvalue_Calonectris_borealis_v1: E value of the best alignment to the Calonectris_borealis_v1 reference genome

  • Strand_Calonectris_borealis_v1: Strand of the marker alignment - Plus for forward and Minus for reverse.
    Missing data = not available, because the contig was not aligned to the Calonectris_borealis_v1 reference genome.

  • SNP: base position and base variant details

  • SnpPosition: The position in the sequence tag at which the defined SNP variant base occurs

  • CallRate: The proportion of samples for which the genotype call is either "1" or "0"

  • OneRatioRef: The proportion of samples for which the genotype score is "1", in the Reference allele row

  • OneRatioSnp: The proportion of samples for which the genotype score is "1", in the SNP allele row

  • FreqHomRef: The proportion of samples which score as homozygous for the Reference allele

  • FreqHomSnp: The proportion of samples which score as homozygous for the SNP allele

  • FreqHets: The proportion of samples which score as heterozygous

  • PICRef: The polymorphism information content (PIC) for the Reference allele row

  • PICSnp: The polymorphism information content (PIC) for the SNP allele row

  • AvgPIC: The average of the polymorphism information content (PIC) of the Reference and SNP allele rows

  • AvgCountRef: The sum of the tag read counts for all samples, divided by the number of samples with non-zero tag read counts, for the Reference allele row

  • AvgCountSnp: The sum of the tag read counts for all samples, divided by the number of samples with non-zero tag read counts, for the SNP allele row

  • RepAvg: The proportion of technical replicate assay pairs for which the marker score is consistent
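
The 1-row genotype codes given under "File description" are easy to misread, because "1" means SNP-allele homozygote and "2" means heterozygote. The Python sketch below recodes them to the number of SNP alleles carried by each sample (0, 1 or 2), a convention used by many downstream tools, and tallies the genotype classes for one marker. It is an illustration only: the function names are hypothetical, the example calls are invented, and the call rate here is simply the non-missing fraction, which may not match how DArT computes the CallRate column.

```python
# DArT 1-row codes (from this README) -> number of SNP alleles carried.
# "-" (double null) is treated as missing and mapped to None.
DART_TO_DOSAGE = {"0": 0, "1": 2, "2": 1, "-": None}


def recode(calls):
    """Convert a list of DArT 1-row genotype codes to SNP-allele counts."""
    return [DART_TO_DOSAGE[c] for c in calls]


def summarise(calls):
    """Tally genotype classes for one marker.

    Frequencies are computed over non-missing calls; this denominator is an
    assumption and may differ from the one behind FreqHomRef, FreqHomSnp
    and FreqHets in the DArT report.
    """
    dosages = [d for d in recode(calls) if d is not None]
    n_called = len(dosages)
    if n_called == 0:
        return {"call_rate": 0.0, "hom_ref": None, "hom_snp": None, "het": None}
    return {
        "call_rate": n_called / len(calls),
        "hom_ref": dosages.count(0) / n_called,
        "hom_snp": dosages.count(2) / n_called,
        "het": dosages.count(1) / n_called,
    }


# Invented calls for a single marker across eight samples.
example = ["0", "0", "2", "1", "-", "0", "2", "-"]
print(recode(example))
print(summarise(example))
```

Swapping the "1" and "2" codes during recoding is the main pitfall to watch for when moving the raw report into tools that expect allele counts.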

Methods

Single Nucleotide Polymorphism (SNP) genotyping was carried out by Diversity Arrays Technology (DArT Pty Ltd, Canberra) using the DArTseq protocol. The DArT library was prepared using DNA from 93 birds and the restriction enzymes PstI and SphI. Loci were aligned to the genome assembly of Calonectris borealis (family Procellariidae; GCA_013401115.1).

The raw DArTseq data (67,095 SNPs) were filtered by the authors using the dartR v2.1.4 R package (see manuscript for details). Four genomic datasets (dataset 1 – dataset 4) were generated and used for the downstream analyses (see Text S1 and Fig. S8 for details).

Specimen metadata was collated during fieldwork.

Funding

FEDER Smac, Award: 2020-2022, N° RE0022954

LIFE + Petrels, Award: LIFE13 BIO/FR/000075

Fondation Bertarelli, Award: Project 822916