Microbial composition and diversity across invasive and native spiders in the Pacific Islands
Data files
Sep 25, 2025 version files 1.13 GB
- filtered_clustered_ASV.csv (1.34 MB)
- raw_sequencing_data.zip (1.13 GB)
- README.md (2.74 KB)
- spider_metadata_16s_only.csv (12.91 KB)
Abstract
This dataset comprises microbiome sequencing data from co-occurring native and invasive spider species sampled at three distinct locations in the Hawaiian Islands, alongside mainland reference populations. The data include gut microbial community profiles that were cleaned using QIIME2 and resolved into Amplicon Sequence Variants (ASVs), which were assigned bacterial taxonomy using the SILVA database. The final dataset consists of an ASV table, taxonomic table, metadata file, and phylogenetic tree files. These were processed downstream using phyloseq, with vertically transmitted endosymbionts and environmentally acquired microbes analyzed separately to observe differences across invasion status. The dataset was used to examine variation in microbial abundance and diversity across invasive and native taxa, supporting further investigation into the role of microbiota in arthropod adaptation to novel environments. It therefore has potential for reuse in studies of invasion biology, microbial ecology, and host-microbe interactions. Ethical considerations and site-specific sampling metadata are included to facilitate reproducibility and appropriate data reuse.
Authors
First Author: Madison J. Pfau, mapfau@ucsc.edu, University of California, Santa Cruz
Coauthors: See Dryad Submission for full author list
Corresponding Author: Rosemary Gillespie, gillespie@berkeley.edu, University of California, Berkeley
Description
This dataset contains high-throughput 16S rRNA sequencing data from the gut microbiomes of co-occurring native and invasive spider species sampled from three Hawaiian Island sites and mainland reference populations. It supports analyses of microbial diversity and host-microbe interactions related to invasion biology.
Contents
1. raw_sequencing_data.zip/
- Zipped folder containing demultiplexed FASTQ files for all samples (forward and reverse reads), including sequencing blanks
- Low-quality and filtered sequences have been removed
- Sequences were processed using QIIME2 and denoised with DADA2
- Taxonomic assignment was conducted using the SILVA 16S rRNA database (v138)
- ASVs were clustered at 97% similarity and compiled into filtered_clustered_ASV.csv
2. filtered_clustered_ASV.csv
- Amplicon Sequence Variant (ASV) Count Table
- Rows: ASV IDs (Feature IDs)
- Columns: Sample IDs
- Values: Abundance of each ASV per sample
3. spider_metadata_16S_only.csv
- Sample metadata file
- Columns include:
- SampleID
- Species Identification
- Site
- Region (island/mainland)
- Invasion_Status (native/invasive)
- I7_Index_ID / I5_Index_ID: Barcode identifiers for multiplex sequencing.
- index / index2: Specific nucleotide barcode sequences corresponding to the above index IDs.
- Additional sequencing preparation details
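As an illustration of how the ASV table and the metadata file fit together, the following is a minimal Python sketch that relies only on the layout described above (ASV IDs in rows, sample IDs in columns, and a `SampleID` and `Invasion_Status` column in the metadata). The function names, the `FeatureID` header, and the inline toy data are assumptions made for the example; they are not taken from the deposited files, and the real column headers should be checked before reuse.

```python
import csv
import io


def read_asv_table(handle):
    """Read an ASV count table whose first column is the ASV ID and whose
    remaining columns are samples. Returns {sample_id: {asv_id: count}}."""
    reader = csv.reader(handle)
    header = next(reader)
    sample_ids = header[1:]
    counts = {sid: {} for sid in sample_ids}
    for row in reader:
        asv_id = row[0]
        for sid, value in zip(sample_ids, row[1:]):
            # Counts may be written as "10" or "10.0"; accept both.
            counts[sid][asv_id] = int(float(value))
    return counts


def read_metadata(handle, id_column="SampleID"):
    """Read the metadata CSV into {sample_id: row_dict}."""
    return {row[id_column]: row for row in csv.DictReader(handle)}


def reads_by_group(counts, metadata, group_column="Invasion_Status"):
    """Sum total reads per metadata group, skipping samples without metadata."""
    totals = {}
    for sid, asvs in counts.items():
        if sid not in metadata:
            continue
        group = metadata[sid][group_column]
        totals[group] = totals.get(group, 0) + sum(asvs.values())
    return totals


# Toy data for illustration only (hypothetical IDs and counts).
TOY_ASV = """FeatureID,S1,S2,S3
asv_a,10,0,5
asv_b,0,20,5
"""
TOY_META = """SampleID,Invasion_Status
S1,native
S2,invasive
S3,invasive
"""

counts = read_asv_table(io.StringIO(TOY_ASV))
metadata = read_metadata(io.StringIO(TOY_META))
totals = reads_by_group(counts, metadata)
print(totals)
```

To apply the same functions to the deposited files, the `io.StringIO` objects can be replaced with handles from `open(path, newline="")`.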
Methods Summary
- DNA was extracted from dissected spider guts
- The 16S rRNA V4 region was amplified and sequenced on the Illumina MiSeq platform (2x250 bp)
- Sequences were demultiplexed and processed using QIIME2
- ASVs were generated with DADA2 and taxonomically assigned via the SILVA database
- Functional predictions were performed using PICRUSt2
- Downstream analyses were conducted in R using the phyloseq package, with comparisons of vertically transmitted endosymbionts and environmentally acquired microbes
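For readers who want a quick check of per-sample diversity outside R, the sketch below computes relative abundance and the Shannon index from one sample's ASV counts. It is an assumption that Shannon diversity is a suitable metric here; the original analyses were run in phyloseq, and this Python example is not the authors' code. The function names and sample values are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert {asv_id: count} to {asv_id: proportion of total reads}."""
    total = sum(counts.values())
    if total == 0:
        return {asv: 0.0 for asv in counts}
    return {asv: n / total for asv, n in counts.items()}


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over ASVs with nonzero counts."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical sample: two equally abundant ASVs and one absent ASV.
sample = {"asv_a": 50, "asv_b": 50, "asv_c": 0}
h = shannon_index(sample)
print(round(h, 4))
```

With two equally abundant ASVs the index equals ln 2, and a sample dominated by a single ASV gives 0, which makes the function easy to sanity-check before applying it to real counts.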
File Naming Conventions
- Sample names are coded as: X_[SampleNumber]_[Region]_[Site]_16S
- All taxonomic annotations follow SILVA v138 conventions
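The sample naming convention above can be split into its fields with a short parser. This Python sketch assumes that no field contains an underscore and that every sample name ends in `_16S`. The example name `X_12_HI_SiteA_16S` is hypothetical and only illustrates the pattern; names that do not match (for example sequencing blanks, if they are named differently) return `None`.

```python
import re
from typing import NamedTuple, Optional


class SampleName(NamedTuple):
    sample_number: str
    region: str
    site: str


# Pattern for X_[SampleNumber]_[Region]_[Site]_16S, assuming no underscores
# inside any individual field.
_PATTERN = re.compile(
    r"^X_(?P<number>[^_]+)_(?P<region>[^_]+)_(?P<site>[^_]+)_16S$"
)


def parse_sample_name(name: str) -> Optional[SampleName]:
    """Return the parsed fields, or None if the name does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    return SampleName(match["number"], match["region"], match["site"])


# Hypothetical sample name used only to show the expected shape.
parsed = parse_sample_name("X_12_HI_SiteA_16S")
print(parsed)
```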
Contact
For questions about the data, please contact: Madison J. Pfau, mapfau@ucsc.edu
