Transcriptomes of six Streptanthoid Complex species
Data files
Mar 30, 2024 version files 2.22 GB
- C.amplexicaulis_transcriptome.fasta (159.94 MB)
- C.anceps_transcriptome.fasta (173.23 MB)
- C.inflatus_transcriptome.fasta (152.46 MB)
- OrthoFinder_Results.tar.gz (1.32 GB)
- README.md (1.44 KB)
- S.breweri_transcriptome.fasta (130.70 MB)
- S.glandulosus_transcriptome.fasta (142.10 MB)
- S.tortuosus_transcriptome.fasta (144.54 MB)
Abstract
To increase the number of genomic resources available for species of the Streptanthoid Complex, we created transcriptomes for six jewelflower species. Focal species from six branches were selected to cover a majority of the phylogenetic tree. The goal is for these transcriptome sequences to aid future studies that explore the evolution and adaptation of the Streptanthoid Complex as it radiated throughout the California Floristic Province.
README: Transcriptomes of six Streptanthoid Complex species
https://doi.org/10.5061/dryad.t1g1jwt99
Transcript sequences of Caulanthus anceps, Caulanthus amplexicaulis, Caulanthus inflatus, Streptanthus breweri, Streptanthus glandulosus, and Streptanthus tortuosus. Transcripts were generated using the isON transcript analysis pipeline. Orthogroup data were created using OrthoFinder.
Description of the data and file structure
Transcriptomes: The 6 transcriptome fasta files
- C. amplexicaulis = C.amplexicaulis_transcriptome.fasta
- C. anceps = C.anceps_transcriptome.fasta
- C. inflatus = C.inflatus_transcriptome.fasta
- S. breweri = S.breweri_transcriptome.fasta
- S. glandulosus = S.glandulosus_transcriptome.fasta
- S. tortuosus = S.tortuosus_transcriptome.fasta
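As a quick sanity check after download, each transcriptome can be summarized with a small FASTA reader. The sketch below is a minimal, dependency-free example and not part of the dataset; the function names `read_fasta` and `summarize` are hypothetical, and it assumes the files are plain (uncompressed) nucleotide FASTA with `>` header lines.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first whitespace-delimited token as the record ID.
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count records and bases in a FASTA stream."""
    lengths = [len(seq) for _, seq in read_fasta(lines)]
    return {
        "n_transcripts": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
    }


# Tiny in-memory demo with made-up records (not taken from the dataset).
demo = [">tx1 example", "ACGT", "AC", ">tx2", "GGG"]
print(summarize(demo))
```

To run it on a real file, pass an open file handle, for example `summarize(open("S.breweri_transcriptome.fasta"))`; the reader streams line by line, so the full file is never held in memory.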
OrthoFinder_Results.tar.gz = Output from an OrthoFinder run using default settings
- Citations.txt
- Phylogenetic_Hierarchical_Orthogroups
- Comparative_Genomics_Statistics
- Phylogenetically_Misplaced_Genes
- Gene_Duplication_Events
- Putative_Xenologs
- Gene_Trees
- Resolved_Gene_Trees
- Log.txt
- Single_Copy_Orthologue_Sequences
- Orthogroup_Sequences
- Species_Tree
- Orthogroups
- WorkingDirectory
- Orthologues
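The Orthogroups directory of a standard OrthoFinder run contains an `Orthogroups.tsv` table: one row per orthogroup, one column per species, and comma-separated gene IDs in each cell. The sketch below parses that layout and lists orthogroups with exactly one gene in every species. It is an illustrative sketch only: `parse_orthogroups` and `single_copy` are hypothetical helper names, the species columns and gene IDs in the demo are invented, and the exact path of the table inside the archive is not stated in this README.

```python
import csv
import io
from typing import Dict, List


def parse_orthogroups(handle) -> Dict[str, Dict[str, List[str]]]:
    """Parse an Orthogroups.tsv-style table into {orthogroup: {species: [genes]}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    species = header[1:]  # first column holds the orthogroup ID
    table: Dict[str, Dict[str, List[str]]] = {}
    for row in reader:
        if not row:
            continue
        # Pad short rows so every species gets a (possibly empty) cell.
        cells = row[1:] + [""] * (len(species) - len(row[1:]))
        table[row[0]] = {
            sp: [g.strip() for g in cell.split(",") if g.strip()]
            for sp, cell in zip(species, cells)
        }
    return table


def single_copy(table: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Return orthogroups that have exactly one gene in every species."""
    return [
        og for og, members in table.items()
        if all(len(genes) == 1 for genes in members.values())
    ]


# Demo with invented species columns and gene IDs.
demo = io.StringIO(
    "Orthogroup\tspA\tspB\n"
    "OG0000000\ta1, a2\tb1\n"
    "OG0000001\ta3\tb2\n"
)
print(single_copy(parse_orthogroups(demo)))
```

After extracting the archive (for example with Python's `tarfile` module), the same function can be given an open handle to the real table.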
Sharing/Access information
Sequence data can be accessed under BioProject: PRJNA992064
Methods
Seed Source
The six species chosen for transcriptome sequencing were Caulanthus anceps, Caulanthus amplexicaulis, Caulanthus inflatus, Streptanthus breweri, Streptanthus glandulosus, and Streptanthus tortuosus. Seeds for these species were produced in a screenhouse at UC Davis in 2019 from field-collected 2018 seeds and were stored in brown envelopes at room temperature. In 2021, RNA was extracted from material grown from these seeds under 10 different tissue and treatment combinations (Table 1).
Table 1. Tissue and treatment combinations

| Combination | Tissue | Treatment | Timepoints |
| --- | --- | --- | --- |
| 1 | Seed | Dry, 20 °C | 1 |
| 2 | Seed | Chilled, Imbibed, 4 °C | 1 |
| 3 | Leaf | Normal, 20 °C | 4 |
| 4 | Leaf | Cold, 4 °C | 4 |
| 5 | Leaf | Hot, 40 °C | 4 |
| 6 | Leaf | Dark, 20 °C | 4 |
| 7 | Leaf | Drought, 20 °C | 4 |
| 8 | Root | Normal, 20 °C | 1 |
| 9 | Flower | Normal, 20 °C | 1 |
| 10 | Silique | Normal, 20 °C | 1 |
Tissue Collection
For combination one, dry seeds were removed from their envelopes and immediately frozen in liquid nitrogen. For combination two, dry seeds were removed from their envelopes, placed on top of germination paper in two-inch petri dishes, and imbibed with 3 mL of water. The petri dishes were then placed in a 4 °C chamber with constant light for six hours. After six hours, the seeds were dried to remove excess water and frozen in liquid nitrogen.
Combinations three through ten were collected from young plants. Randomized cones containing a mixture of 50% Ron’s Mix and 50% sand were saturated with nutrient water. A small divot was then made in each cone, and 3-4 seeds were placed in it before being covered with a small amount of the soil mixture. The cones were then placed in a rack and covered with plastic wrap to prevent the top layer of the soil from drying out. The covered rack was then placed in a growth chamber set to 20 °C with a 12/12 light/dark cycle. Two weeks after the rack was placed in the growth chamber, the plastic wrap was removed, and the recently germinated seedlings were exposed to the air. The seedlings remained in the growth chamber and were given nutrient water every other day. Once the seedlings had attained, on average, approximately 8-10 true leaves, they were moved to their treatment conditions and/or had their tissue collected. Tissue from multiple replicates and different timepoints of the same tissue/treatment combination was pooled and frozen.
Combination 3 was collected 0, 6, 12, and 18 hours after lights on. Combination 4 was collected 3, 6, 12, and 24 hours after being moved to 4 °C with a 12/12 light/dark cycle. Combination 5 was collected 0.25, 0.5, 1, and 3 hours after being moved to 40 °C with a 12/12 light/dark cycle. Combination 6 was subjected to a full day without light, and the following day tissue was collected in the dark 0, 6, 12, and 18 hours after the usual lights-on time. Combination 7 was subjected to five days without watering, and on the sixth day tissue was collected 0, 6, 12, and 18 hours after lights on. Plants for combination 8 were removed from their pots, excess soil was washed from their roots, and the cleaned root tissue was collected. For combinations 9 and 10, flowers and siliques were collected once an adequate amount had formed.
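The collection schedule described above can be written down as a small lookup table and checked against the timepoint counts in Table 1. This is only a bookkeeping sketch: the names `SCHEDULE_HOURS`, `timepoints_per_combination`, and `total_collection_events` are hypothetical, and it assumes each listed timepoint counts as one collection event per species (replicates are ignored).

```python
# Hours at which leaf tissue was collected, by combination number.
# Combinations 3, 6, and 7 are timed from lights on; 4 and 5 are timed
# from the moment plants were moved to the treatment temperature.
SCHEDULE_HOURS = {
    3: [0, 6, 12, 18],
    4: [3, 6, 12, 24],
    5: [0.25, 0.5, 1, 3],
    6: [0, 6, 12, 18],
    7: [0, 6, 12, 18],
}

# Timepoint counts as listed in Table 1.
TABLE_1_TIMEPOINTS = {1: 1, 2: 1, 3: 4, 4: 4, 5: 4, 6: 4, 7: 4, 8: 1, 9: 1, 10: 1}


def timepoints_per_combination() -> dict:
    """Count timepoints for combinations 1-10; unscheduled ones have a single collection."""
    return {c: len(SCHEDULE_HOURS[c]) if c in SCHEDULE_HOURS else 1 for c in range(1, 11)}


def total_collection_events() -> int:
    """Total collection events per species under the one-event-per-timepoint assumption."""
    return sum(timepoints_per_combination().values())


print(timepoints_per_combination() == TABLE_1_TIMEPOINTS)
print(total_collection_events())
```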
RNA Extraction and sequencing
Total RNA was extracted from 50 mg of each sample using New England Biolabs' Monarch Total RNA Miniprep Kit. Total RNA was then quantified and quality checked on the Qubit 3. For each species, total RNA from each tissue/treatment was pooled equally, and 300 ng of total RNA was used with the SMRTbell Express Template Prep Kit 2.0 to create a total of 6 SMRTbell libraries. The 6 SMRTbell libraries were then sent to the University of California, Davis Genome Center for sequencing on the Sequel II. Following sequencing, the unaligned BAM files were processed using PacBio’s IsoSeq pipeline (PacificBiosciences/pbbioconda) to create high-quality, full-length non-concatemer reads.
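For readers reproducing the pooling step, the arithmetic can be sketched as follows. This assumes that "pooled equally" means equal RNA mass from each tissue/treatment sample and that the pool is sized to the 300 ng library input; neither detail is stated explicitly above. The function name `equal_mass_volumes` and the concentrations in the demo are hypothetical.

```python
def equal_mass_volumes(conc_ng_per_ul: dict, total_ng: float = 300.0) -> dict:
    """Return the volume (µL) of each sample needed for an equal-mass pool.

    conc_ng_per_ul maps sample name -> measured concentration in ng/µL.
    Each sample contributes total_ng / n_samples nanograms of RNA.
    """
    if not conc_ng_per_ul:
        raise ValueError("at least one sample is required")
    per_sample_ng = total_ng / len(conc_ng_per_ul)
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"concentration for {name!r} must be positive")
        volumes[name] = per_sample_ng / conc
    return volumes


# Invented concentrations for the 10 tissue/treatment samples of one species.
demo = {f"combination_{i}": 60.0 for i in range(1, 11)}
for sample, vol in equal_mass_volumes(demo).items():
    print(f"{sample}: {vol:.2f} µL")
```

With ten samples and a 300 ng pool, each sample contributes 30 ng under these assumptions.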
Transcripts, annotation, and orthogroups
Transcript identification for each species was completed using the isON transcript analysis pipeline (Petri and Sahlin, 2023; Sahlin and Medvedev, 2020). The pipeline works on PacBio sequence data in two steps: it first clusters reads based on sequence similarity and then generates isoforms from the clustered long reads. Following isoform identification, the transcripts were aligned to the UniProt database (The UniProt Consortium, 2023) using blastx (Sayers et al., 2022). Annotations were added to the transcripts based on their best match. To aid future investigations, orthogroups were generated using OrthoFinder (Emms and Kelly, 2019).
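The best-match annotation step can be illustrated with a short parser for BLAST tabular output. This is a sketch under stated assumptions, not the authors' script: it assumes the standard 12-column tabular format (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and defines "best" as the highest bit score, with ties broken by the lower e-value. The text above does not state which output format or criterion was used, and `best_hits` is a hypothetical name.

```python
from typing import Dict, Iterable, Tuple


def best_hits(lines: Iterable[str]) -> Dict[str, Tuple[str, float, float]]:
    """Map each query ID to (subject ID, e-value, bit score) of its best hit."""
    best: Dict[str, Tuple[str, float, float]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column row
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        current = best.get(query)
        # Prefer the higher bit score; on ties prefer the lower e-value.
        if current is None or (bitscore, -evalue) > (current[2], -current[1]):
            best[query] = (subject, evalue, bitscore)
    return best


# Invented rows for illustration only.
demo = [
    "tx1\tP1\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200.0",
    "tx1\tP2\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-60\t250.0",
    "tx2\tP3\t80.0\t50\t10\t0\t1\t150\t1\t50\t1e-10\t60.0",
]
for query, (subject, evalue, bitscore) in best_hits(demo).items():
    print(query, subject, evalue, bitscore)
```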