RNA-sequencing dataset of land snails collected in Australia
Data files
May 28, 2025 version files 120.29 GB
- Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_assembly_transcripts.fasta (344.84 MB)
- Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R1.fastq.gz (4.13 GB)
- Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R2.fastq.gz (4.10 GB)
- Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_assembly_transcripts.fasta (328.68 MB)
- Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R1.fastq.gz (5.98 GB)
- Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R2.fastq.gz (5.84 GB)
- Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_assembly_transcripts.fasta (434.89 MB)
- Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R1.fastq.gz (6.55 GB)
- Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R2.fastq.gz (6.36 GB)
- Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_assembly_transcripts.fasta (445.91 MB)
- Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R1.fastq.gz (7.19 GB)
- Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R2.fastq.gz (7.01 GB)
- Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_assembly_transcripts.fasta (478.36 MB)
- Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R1.fastq.gz (7.40 GB)
- Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R2.fastq.gz (7.33 GB)
- Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_assembly_transcripts.fasta (438.26 MB)
- Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R1.fastq.gz (6.18 GB)
- Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R2.fastq.gz (6.21 GB)
- Library36_L001_L002_L003_L004_L005_L006_L007_L008_R1.fastq.gz (5.51 GB)
- Library36_L001_L002_L003_L004_L005_L006_L007_L008_R2.fastq.gz (5.45 GB)
- Library36_L001_L002_L003_L004_L005_L006_L007_L008_transcripts.fasta (344.73 MB)
- Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_assembly_transcripts.fasta (475.96 MB)
- Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R1.fastq.gz (4.58 GB)
- Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R2.fastq.gz (4.63 GB)
- Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_assembly_transcripts.fasta (424.20 MB)
- Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R1.fastq.gz (4.84 GB)
- Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R2.fastq.gz (4.82 GB)
- Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_assembly_transcripts.fasta (336.52 MB)
- Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R1.fastq.gz (6.15 GB)
- Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R2.fastq.gz (5.98 GB)
- README.md (15.42 KB)
Abstract
RNA-Seq data from land snail specimens (n = 10) collected in Cairns (QLD), Darwin (NT), and Adelaide (SA), three main ports through which introduced snails could have entered Australia. RNA was extracted from all specimens using a modified protocol for the Maxwell® RSC simplyRNA Tissue Kit (Promega A1340) on the Maxwell® RSC Instrument.
https://doi.org/10.5061/dryad.xgxd254s0
Description of the data and file structure
Principal Investigator Contact Information
Name: Dr. Valerie Caron
Institution: CSIRO
Email: valerie.caron.enviro@gmail.com
Name: Mariana Hopper
Institution: CSIRO
Email: mariana.hopper@csiro.au
Alternate Contact Information
Name: Berenice Talamantes Becerra
Institution: CSIRO
Email: Berenice.TalamantesBecerra@csiro.au
Dataset Overview
This dataset comprises RNA-Seq data collected from land snails in Australia.
The primary objective of this study was to analyse the transcriptome profiles of invasive land snails to investigate their potential role as vectors. Live snails were collected from strategic locations in the Australian port cities of Darwin, Cairns, and Adelaide, targeting three key ports where introduced snails may have arrived.
Species identification was confirmed by Richard Willan from the Museum and Art Gallery of the Northern Territory. Snails were kept alive until storage at -80 °C for preservation.
The dataset includes:
- Raw RNA-Seq data in FASTQ format from whole snails collected in Australian port cities during 2024.
- Assembled transcripts for each library, which can be used for further downstream analyses.
The file names, snail species, and sampling locations are detailed below.
| File name | State | Latitude | Longitude | Species Collected | Native or Introduced |
|---|---|---|---|---|---|
| Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R1.fastq.gz | QLD | -16.902124 | 145.71036 | Leptopoma perlucida | Native |
| Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R2.fastq.gz | QLD | -16.902124 | 145.71036 | Leptopoma perlucida | Native |
| Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_assembly_transcripts.fasta | QLD | -16.902124 | 145.71036 | Leptopoma perlucida | Native |
| Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R1.fastq.gz | QLD | -16.881993 | 145.710589 | Paropeas achatinaceum | Introduced |
| Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R2.fastq.gz | QLD | -16.881993 | 145.710589 | Paropeas achatinaceum | Introduced |
| Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_assembly_transcripts.fasta | QLD | -16.881993 | 145.710589 | Paropeas achatinaceum | Introduced |
| Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R1.fastq.gz | QLD | -16.902124 | 145.71036 | Coneuplecta pampini | Native |
| Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R2.fastq.gz | QLD | -16.902124 | 145.71036 | Coneuplecta pampini | Native |
| Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_assembly_transcripts.fasta | QLD | -16.902124 | 145.71036 | Coneuplecta pampini | Native |
| Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R1.fastq.gz | SA | -34.926163 | 138.495329 | Theba pisana | Introduced |
| Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R2.fastq.gz | SA | -34.926163 | 138.495329 | Theba pisana | Introduced |
| Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_assembly_transcripts.fasta | SA | -34.926163 | 138.495329 | Theba pisana | Introduced |
| Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R1.fastq.gz | SA | -34.926163 | 138.495329 | Cochlicella acuta | Introduced |
| Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R2.fastq.gz | SA | -34.926163 | 138.495329 | Cochlicella acuta | Introduced |
| Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_assembly_transcripts.fasta | SA | -34.926163 | 138.495329 | Cochlicella acuta | Introduced |
| Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R1.fastq.gz | SA | -34.780746 | 138.481011 | Cernuella virgata | Introduced |
| Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R2.fastq.gz | SA | -34.780746 | 138.481011 | Cernuella virgata | Introduced |
| Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_assembly_transcripts.fasta | SA | -34.780746 | 138.481011 | Cernuella virgata | Introduced |
| Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R1.fastq.gz | NT | -12.432255 | 130.935729 | Allopeas gracile | Introduced |
| Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R2.fastq.gz | NT | -12.432255 | 130.935729 | Allopeas gracile | Introduced |
| Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_assembly_transcripts.fasta | NT | -12.432255 | 130.935729 | Allopeas gracile | Introduced |
| Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R1.fastq.gz | NT | -12.470201 | 130.991047 | Allopeas gracile | Introduced |
| Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R2.fastq.gz | NT | -12.470201 | 130.991047 | Allopeas gracile | Introduced |
| Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_assembly_transcripts.fasta | NT | -12.470201 | 130.991047 | Allopeas gracile | Introduced |
| Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R1.fastq.gz | NT | -12.526667 | 131.153172 | Gulella bicolor | Introduced |
| Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R2.fastq.gz | NT | -12.526667 | 131.153172 | Gulella bicolor | Introduced |
| Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_assembly_transcripts.fasta | NT | -12.526667 | 131.153172 | Gulella bicolor | Introduced |
| Library36_L001_L002_L003_L004_L005_L006_L007_L008_R1.fastq.gz | SA | -34.899758 | 138.48827 | Cochlicella sp. | Introduced |
| Library36_L001_L002_L003_L004_L005_L006_L007_L008_R2.fastq.gz | SA | -34.899758 | 138.48827 | Cochlicella sp. | Introduced |
| Library36_L001_L002_L003_L004_L005_L006_L007_L008_transcripts.fasta | SA | -34.899758 | 138.48827 | Cochlicella sp. | Introduced |
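As an illustration of how the file names map onto the table above, the following shell sketch extracts the leading `LibraryN` token from a file name and returns the species listed for that library. The function names `library_id` and `species_for` are hypothetical and not part of the dataset; the species mapping is copied from the table.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, the mapping mirrors the table above.

# Print the leading "LibraryN" token of a file name (directory parts are dropped).
library_id() {
    local name
    name=$(basename "$1")
    printf '%s\n' "${name%%_*}"
}

# Print the species recorded in the table for a given library ID.
species_for() {
    case "$1" in
        Library1)  echo "Leptopoma perlucida" ;;
        Library3)  echo "Paropeas achatinaceum" ;;
        Library4)  echo "Coneuplecta pampini" ;;
        Library6)  echo "Theba pisana" ;;
        Library7)  echo "Cochlicella acuta" ;;
        Library10) echo "Cernuella virgata" ;;
        Library18|Library19) echo "Allopeas gracile" ;;
        Library20) echo "Gulella bicolor" ;;
        Library36) echo "Cochlicella sp." ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

For example, `species_for "$(library_id Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R1.fastq.gz)"` prints `Theba pisana`.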
Funding
This project was funded by the Department of Agriculture, Fisheries and Forestry, Australian Government.
Ethics Approval
Not applicable
Raw Data Files
RNA sequencing was conducted by the Australian Genome Research Facility Ltd. (AGRF Ltd.) using the Illumina NovaSeq X platform.
A 300-cycle paired-end run configuration was used for Libraries 1 to 20, while Library 36 was sequenced with a 150-cycle paired-end configuration.
Library preparation was performed using the Stranded Total RNA with Ribo-Zero Plus Library Prep Kit.
Sequencing generated an average of at least 56 million reads per sample.
Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R1.fastq.gz
Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R2.fastq.gz
Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R1.fastq.gz
Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R2.fastq.gz
Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R1.fastq.gz
Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R2.fastq.gz
Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R1.fastq.gz
Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R2.fastq.gz
Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R1.fastq.gz
Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R2.fastq.gz
Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R1.fastq.gz
Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R2.fastq.gz
Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R1.fastq.gz
Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R2.fastq.gz
Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R1.fastq.gz
Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R2.fastq.gz
Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R1.fastq.gz
Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R2.fastq.gz
Library36_L001_L002_L003_L004_L005_L006_L007_L008_R1.fastq.gz
Library36_L001_L002_L003_L004_L005_L006_L007_L008_R2.fastq.gz
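After downloading, it may be useful to confirm that each R1 file has its R2 mate and that both hold the same number of reads. The sketch below is one possible check, not part of the original workflow; it assumes standard four-line FASTQ records, and the function names `mate_of`, `count_reads`, and `check_pair` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes four lines per FASTQ record and the _R1/_R2 naming used here.

# Print the R2 file name that belongs to a given R1 file name.
mate_of() {
    printf '%s\n' "${1%_R1.fastq.gz}_R2.fastq.gz"
}

# Count reads in a gzipped FASTQ file (number of lines divided by four).
count_reads() {
    gzip -cd "$1" | awk 'END { print NR / 4 }'
}

# Report whether an R1 file and its R2 mate contain the same number of reads.
check_pair() {
    local r1=$1 r2 n1 n2
    r2=$(mate_of "$r1")
    [ -f "$r2" ] || { echo "missing mate: $r2" >&2; return 1; }
    n1=$(count_reads "$r1")
    n2=$(count_reads "$r2")
    if [ "$n1" = "$n2" ]; then
        echo "OK $r1 $n1"
    else
        echo "MISMATCH $r1 $n1 $n2"
        return 1
    fi
}
```

Calling `check_pair` on an R1 file prints `OK`, the file name, and the read count when the pair is consistent. Decompressing files of this size takes time, so the check is best run once per library.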
All Files and Variables
File: Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R1.fastq.gz
Description: RNA data of Leptopoma perlucida R1 for metagenomic analyses
File: Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_R2.fastq.gz
Description: RNA data of Leptopoma perlucida R2 for metagenomic analyses
File: Library1_22NT3KLT3_CCACGCTGAA-TATTCCTCAG_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Leptopoma perlucida
File: Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R1.fastq.gz
Description: RNA data of Paropeas achatinaceum R1 for metagenomic analyses
File: Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_R2.fastq.gz
Description: RNA data of Paropeas achatinaceum R2 for metagenomic analyses
File: Library3_22NT3KLT3_ATGTCGTATT-TTCTTGCTGG_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Paropeas achatinaceum
File: Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R1.fastq.gz
Description: RNA data of Coneuplecta pampini R1 for metagenomic analyses
File: Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_R2.fastq.gz
Description: RNA data of Coneuplecta pampini R2 for metagenomic analyses
File: Library4_22NT3KLT3_GCAATATTCA-GGCGCCAATT_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Coneuplecta pampini
File: Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R1.fastq.gz
Description: RNA data of Theba pisana R1 for metagenomic analyses
File: Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_R2.fastq.gz
Description: RNA data of Theba pisana R2 for metagenomic analyses
File: Library6_22NT3KLT3_CTAGATTGCG-AGATATGGCG_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Theba pisana
File: Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R1.fastq.gz
Description: RNA data of Cochlicella acuta R1 for metagenomic analyses
File: Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_R2.fastq.gz
Description: RNA data of Cochlicella acuta R2 for metagenomic analyses
File: Library7_22NT3KLT3_CGATGCGGTT-CCTGCTTGGT_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Cochlicella acuta
File: Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R1.fastq.gz
Description: RNA data of Cernuella virgata R1 for metagenomic analyses
File: Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_R2.fastq.gz
Description: RNA data of Cernuella virgata R2 for metagenomic analyses
File: Library10_22NT3KLT3_AATTCCATCT-CTTCAGTTAC_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Cernuella virgata
File: Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R1.fastq.gz
Description: RNA data of Allopeas gracile R1 for metagenomic analyses
File: Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_R2.fastq.gz
Description: RNA data of Allopeas gracile R2 for metagenomic analyses
File: Library18_22NT3KLT3_TTAACCTTCG-AGGCCAGACA_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Allopeas gracile
File: Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R1.fastq.gz
Description: RNA data of Allopeas gracile R1 for metagenomic analyses
File: Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_R2.fastq.gz
Description: RNA data of Allopeas gracile R2 for metagenomic analyses
File: Library19_22NT3KLT3_CATATGCGAT-CCTTGAACGG_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Allopeas gracile
File: Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R1.fastq.gz
Description: RNA data of Gulella bicolor R1 for metagenomic analyses
File: Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_R2.fastq.gz
Description: RNA data of Gulella bicolor R2 for metagenomic analyses
File: Library20_22NT3KLT3_AGCCTATGAT-CACCACCTAC_L004_assembly_transcripts.fasta
Description: Assembled transcripts of Gulella bicolor
File: Library36_L001_L002_L003_L004_L005_L006_L007_L008_R1.fastq.gz
Description: RNA data of Cochlicella sp. R1 for metagenomic analyses
File: Library36_L001_L002_L003_L004_L005_L006_L007_L008_R2.fastq.gz
Description: RNA data of Cochlicella sp. R2 for metagenomic analyses
File: Library36_L001_L002_L003_L004_L005_L006_L007_L008_transcripts.fasta
Description: Assembled transcripts of Cochlicella sp.
Code/software
The sequencing provider delivered the data as raw FASTQ files, which are provided here in their original format.
Tools Used for Transcriptome Assembly
1. Quality Control (QC) and Adapter Removal
QC and adapter trimming were performed using Trim Galore! (v0.6.4) with default parameters:
🔗 Trim Galore! GitHub
Command:
trim_galore --paired --fastqc --output_dir path/to/output path/R1 path/R2
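To apply the same command to every library, one option is to generate one command line per R1/R2 pair. The sketch below prints the commands rather than running them, so they can be reviewed first; the function name `trim_commands` and its two directory arguments are hypothetical placeholders, not part of the original workflow.

```shell
#!/usr/bin/env bash
# Sketch only: prints one trim_galore command per R1/R2 pair found in a directory.
# The input and output directories are supplied by the caller.
trim_commands() {
    local in_dir=$1 out_dir=$2 r1 r2
    for r1 in "$in_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue      # skip the unexpanded pattern when nothing matches
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        echo "trim_galore --paired --fastqc --output_dir $out_dir $r1 $r2"
    done
}
```

For example, `trim_commands reads trimmed > trim_jobs.sh` writes the commands to a file for inspection. The printed lines are not quoted, so this approach assumes paths without spaces.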
2. RNA-Seq Assembly
Transcriptome assembly was performed using SPAdes (v4.0.0) with default parameters for RNA-Seq:
🔗 SPAdes GitHub
Command:
spades.py --tmp-dir path/tmp --rna -1 path/R1 -2 path/R2 -o path/output
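A quick sanity check on an assembled transcripts file is to count its sequences and total bases. The sketch below is an illustration rather than part of the original workflow; the function name `fasta_stats` is hypothetical, and it assumes a plain (uncompressed) FASTA file with Unix line endings.

```shell
#!/usr/bin/env bash
# Sketch only: summarise a FASTA file as "<n> sequences, <m> bases".
fasta_stats() {
    awk '
        /^>/ { n++; next }               # header line: one more sequence
        { total += length($0) }          # sequence line: add its length
        END { printf "%d sequences, %d bases\n", n, total }
    ' "$1"
}
```

Running `fasta_stats` on any of the `*_transcripts.fasta` files prints a one-line summary that can be compared across libraries.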
Users can employ other bioinformatics tools to visualize the data, using Illumina paired-end reads as input.
Access information
The dataset includes RNA sequencing results from terrestrial snails collected across Australia. The number of samples and species was limited by the sampling sites and the species found there. This dataset comprises genetic information derived from native and introduced Australian biodiversity. Users are advised that the data is provided for non-commercial research use. Any commercial use or bioprospecting of the data requires prior informed consent and mutually agreed terms with CSIRO.
Other publicly accessible locations of the data:
- Not applicable
Data was derived from the following sources:
- Not applicable
RNA-Seq data were generated from snail specimens (n = 10) collected in Cairns (QLD), Darwin (NT), and Adelaide (SA), targeting three major ports where introduced snails may have entered. RNA extractions were performed using the Maxwell® RSC simplyRNA Tissue Kit (Promega A1340) and the Maxwell® RSC Instrument. Sequencing was conducted at the Australian Genome Research Facility Ltd. (AGRF Ltd.) using the Illumina NovaSeq X platform. A 300-cycle run configuration was used for all samples, except Library36, which was sequenced with a 150-cycle run configuration. Library preparation was performed using the Stranded Total RNA with Ribo-Zero Plus Library Prep Kit. Raw sequencing reads are included in this dataset.
