Cryptic CAM photosynthesis in Joshua tree (Yucca brevifolia, Y. jaegeriana)

Published Aug 15, 2025 on Dryad. https://doi.org/10.5061/dryad.7pvmcvf3d

Data files

Aug 15, 2025 version files (256.64 MB total)

2024-04-17_counts_norm.txt: 60.39 MB
2024-04-17_counts_raw.txt: 30.14 MB
2024-04-17_tpm_norm.txt: 60.93 MB
2024-04-17_tpm_raw.txt: 60.93 MB
full_RNA_metadata_dryad.csv: 6.71 KB
primaryYucca_jaegeriana_transcripts.fa: 44.24 MB
README.md: 5.29 KB

Abstract

Joshua trees are iconic plants of the Mojave Desert, but they face threats from changes in climate and land use. Here we uncover cryptic Crassulacean acid metabolism (CAM) photosynthesis in Joshua trees using a common garden, and use genomic data to understand other metabolic differences between populations and between the two species of Joshua tree. Our Dryad data package includes RNA sequencing data from a common garden experiment on both Joshua tree species (Yucca brevifolia and Yucca jaegeriana). Samples were taken in 2022 from a single garden and were combined with ecophysiological measurements and metabolomics. Our results indicate low-level CAM activity in all populations of Joshua tree, as well as strong differentiation in aspects of carbon metabolism between the two species.

Dataset DOI: 10.5061/dryad.7pvmcvf3d

Description of the data and file structure

To understand the presence, variability, and strength of Crassulacean acid metabolism (CAM) photosynthesis, we collected RNA-sequencing data in 2022 from plants of both Joshua tree species (Yucca brevifolia, Yucca jaegeriana) grown in a common garden. The data package includes only the computed read count and transcripts per million (TPM) values for samples collected in 2022, plus the reference used for read mapping. The 2021 data are available only as raw reads on SRA. For raw reads of both the 2021 and 2022 collections, please see SRA BioProject PRJNA1132710.

Files and variables

File: 2024-04-17_counts_raw.txt

Description: Raw count data as computed by sleuth from 2022 samples mapped to the primary transcript reference file.

Variables

Columns: Library ID. See "full_RNA_metadata_dryad.csv" for information.
Rows: primary transcript ID

File: 2024-04-17_tpm_raw.txt

Description: Raw TPM data as computed by sleuth from 2022 samples mapped to the primary transcript reference file.

Variables

Columns: Library ID. See "full_RNA_metadata_dryad.csv" for information.
Rows: primary transcript ID

File: 2024-04-17_counts_norm.txt

Description: Normalized count data as computed by sleuth from 2022 samples mapped to the primary transcript reference file.

Variables

Columns: Library ID. See "full_RNA_metadata_dryad.csv" for information.
Rows: primary transcript ID

File: 2024-04-17_tpm_norm.txt

Description: Normalized TPM data as computed by sleuth from 2022 samples mapped to the primary transcript reference file.

Variables

Columns: Library ID. See "full_RNA_metadata_dryad.csv" for information.
Rows: primary transcript ID
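
As a minimal sketch of how one of the four matrices might be read, the Python below assumes the files are delimited text (tab by default) with one header row of library IDs, the primary transcript ID in the first column, and numeric cells. Neither the delimiter nor the exact header layout is stated in this README, so both are assumptions; `read_matrix` and the demo IDs are hypothetical names used only for illustration.

```python
import csv
import io

def read_matrix(handle, delimiter="\t"):
    """Return (library_ids, {transcript_id: [values]}) from a delimited matrix."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    library_ids = None
    values = {}
    for row in reader:
        if not row:
            continue
        if library_ids is None:
            # Some writers (e.g. R's write.table) omit the corner cell, so the
            # header can be one field shorter than the data rows.
            library_ids = header if len(header) == len(row) - 1 else header[1:]
        values[row[0]] = [float(x) for x in row[1:]]
    if library_ids is None:
        library_ids = header[1:]
    return library_ids, values

# Hypothetical two-library, two-transcript matrix for illustration only.
demo = io.StringIO("transcript\tlib1\tlib2\ntx1\t10\t0\ntx2\t3.5\t7\n")
libs, matrix = read_matrix(demo)
print(libs, matrix["tx2"])
```

With the real data, the handle would come from something like `open("2024-04-17_tpm_norm.txt", newline="")`; the delimiter is worth checking against the first lines of the file before relying on the default.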

File: full_RNA_metadata_dryad.csv

Description: Metadata for each sequenced library (library IDs are the column headers in the TPM and count files).

Variables

sample: library ID (internal number)
sourcepop: full name of source population
populationcode: Abbreviation for the source population from which the individual came.
time: Time of day of sample, AM (morning) or PM (night)
species: Which of the two species the sample came from (Eastern = Y. jaegeriana, Western = Y. brevifolia, hybrid = hybrid of the two.)
tag: the unique tag identifier for the plant
matriline: matriline information for the plant
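
The variables above can be used to select libraries by species and sampling time. The sketch below groups library IDs by (species, time) using only the Python standard library. It assumes the CSV header names match the variable names listed above; `group_libraries` is a hypothetical helper, and the demo rows are invented for illustration rather than taken from the real file.

```python
import csv
import io
from collections import defaultdict

def group_libraries(handle):
    """Map (species, time) to the list of library IDs in the metadata table."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[(row["species"], row["time"])].append(row["sample"])
    return dict(groups)

# Hypothetical rows for illustration only; column names follow the list above.
demo = io.StringIO(
    "sample,sourcepop,populationcode,time,species,tag,matriline\n"
    "101,Example Site A,EXA,AM,Western,T1,M1\n"
    "102,Example Site A,EXA,PM,Western,T1,M1\n"
    "103,Example Site B,EXB,AM,Eastern,T2,M2\n"
)
groups = group_libraries(demo)
print(groups[("Western", "PM")])
```

For the real file, pass `open("full_RNA_metadata_dryad.csv", newline="")` instead of the in-memory demo. The resulting library IDs can then be used to pick columns from the count and TPM files.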

File: primaryYucca_jaegeriana_transcripts.fa

Description: Transcript file derived from a preliminary annotation of the Y. jaegeriana genome. Transcripts are only the primary isoforms annotated per locus.
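
A short sketch for reading the transcript FASTA without external dependencies is given below. It treats the first whitespace-delimited token of each header line as the transcript ID; whether that token matches the row IDs in the count and TPM files exactly is an assumption to verify. `read_fasta` and the demo records are hypothetical.

```python
import io

def read_fasta(handle):
    """Yield (transcript_id, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

# Hypothetical records for illustration only.
demo = io.StringIO(">tx1 example description\nACGT\nAC\n>tx2\nGGG\n")
lengths = {tid: len(seq) for tid, seq in read_fasta(demo)}
print(lengths)
```

For the real file, open `primaryYucca_jaegeriana_transcripts.fa` in text mode and pass the handle to `read_fasta`.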

Genome assembly:

A total of 136.8 Gb of PacBio CCS reads were generated using a PacBio Sequel 2 at the HudsonAlpha Institute of Biotechnology. PacBio HiFi libraries had insert sizes ranging from 17-23.6 Kb. An estimated 44.1X coverage of the genome was generated. Initial assembly was completed using hifiasm 0.19.8 with default parameters.
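
Because coverage is total sequenced bases divided by genome size, the two figures above imply the genome size used for the coverage estimate. That size is not reported in this README, so the value below is a back-calculation from the stated numbers, not a reported result.

```python
total_bases_gb = 136.8  # PacBio CCS yield stated above, in Gb
coverage = 44.1         # estimated fold coverage stated above

# coverage = total bases / genome size, so genome size = total bases / coverage
implied_genome_size_gb = total_bases_gb / coverage
print(f"{implied_genome_size_gb:.2f} Gb")
```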

Genome annotation:

The assembly was annotated using BRAKER v.3.0.4 with gene models from *Yucca aloifolia* (Ya24Inoko_839 v.2.1) and *Yucca filamentosa* (YfilamentosaC3pri_837_v.2.1) and evidence from transcriptomes generated in this manuscript. RepeatModeler v.2.0.3 (Flynn et al., 2020) and RepeatMasker v.4.1.2-p1 were run with default parameters to identify the consensus repeat families and to soft-mask the repeat regions in the genome, respectively. TSEBRA v.1.1.2.1 (Gabriel et al., 2021) was used to retrieve BRAKER's filtered single-exon genes. We ran InterProScan v.5.19-58.0 (Jones et al., 2014) to obtain protein evidence for the genes in the TSEBRA output. All single-exon genes lacking any protein evidence were filtered out of the gff3 file using AGAT v.0.7.0 (Dainat, 2021). The output gff3 file from AGAT was used as the final annotation and for downstream analysis.

Dainat J. 2021. AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format (v0.8.0). Zenodo.

Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. 2020. RepeatModeler2 for automated genomic discovery of transposable element families. Proceedings of the National Academy of Sciences of the United States of America 117: 9451–9457.

Gabriel L, Hoff KJ, Brůna T, Borodovsky M, Stanke M. 2021. TSEBRA: transcript selector for BRAKER. BMC Bioinformatics 22: 566.

Jones P, Binns D, Chang H-Y, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, et al. 2014. InterProScan 5: genome-scale protein function classification. Bioinformatics 30: 1236–1240.

Code/software

All data can be viewed in standard text editors.

Access information

Other publicly accessible locations of the data:

NCBI SRA BioProject PRJNA1132710
