High resolution diel transcriptomes of autotetraploid potato reveal expression and sequence conservation among rhythmic genes
Data files
Sep 25, 2025 version, files total 1.18 GB
- Dataset_S1_Allelic_groups.csv (3.51 MB)
- Dataset_S10_Pairwise_allelic_expression_correlations_in_Tissue_samples_from_the_Developmental_Gene_Expression_Atlas.csv (6.75 MB)
- Dataset_S11_Pairwise_allelic_expression_correlations_in_Stress_samples_from_the_Developmental_Gene_Expression_Atlas.csv (6.75 MB)
- Dataset_S12_Differential_expression_short_vs._long_day_determined_by_DEseq.csv (16.50 MB)
- Dataset_S13_Differential_expression_leaf_vs._tuber_under_short_days_determined_by_DEseq.csv (16.42 MB)
- Dataset_S14_Functional_annotation_of_Atlantic_using_MapMan.txt (31.83 MB)
- Dataset_S2_Diel_expression_rlog_long.csv (862.89 MB)
- Dataset_S3_Expression_of_Tissue_samples_from_the_Developmental_Gene_Expression_Atlas.csv (107.13 MB)
- Dataset_S4_Expression_of_Stress_samples_from_the_Developmental_Gene_Expression_Atlas.csv (84.17 MB)
- Dataset_S5_Leaf_short_day_cycling_parameters_as_determined_per_JTK.csv (10.30 MB)
- Dataset_S6_Leaf_long_day_cycling_parameters_as_determined_per_JTK.csv (11.56 MB)
- Dataset_S7_Tuber_short_day_cycling_parameters_as_determined_per_JTK.csv (7.93 MB)
- Dataset_S8_Pairwise_allelic_expression_correlations_in_short_days.csv (6.67 MB)
- Dataset_S9_Pairwise_allelic_expression_correlations_in_long_days.csv (6.67 MB)
- README.md (11.74 KB)
Abstract
Background
Photoperiodic changes in diel cycles of gene expression are pervasive in plants. The timing of circadian regulators, together with light signals, regulates multiple photoperiod-dependent responses such as growth, flowering or tuber formation. However, for most genes, the importance of cyclic mRNA levels is less clear. We analyzed the diel transcriptome of modern cultivated potato, a highly heterozygous autotetraploid. Clonal propagation and limited meiosis have led to the accumulation of deleterious alleles, making tetraploid potato an ideal model system to investigate the conservation of cyclic expression and cyclic genes during artificial selection and clonal propagation.
Results
Our results indicate that rhythmic alleles of cultivated potato were more highly expressed than non-rhythmic genes and were highly co-expressed not only under diel cycles but also across tissues, developmental stages, and stress conditions. Moreover, the smaller ratio of non-synonymous to synonymous differences within rhythmic versus non-rhythmic allelic groups indicates that cyclic genes, in general, have more conserved core functions than non-cyclic genes. In accordance with this observation, fully rhythmic allelic groups were highly enriched in photosynthesis and ribosome biogenesis genes, which have core functions in plants. Furthermore, we investigated differences in cyclic expression patterns between photoperiods, identifying potential regulators of the strong photoperiodic change in phase of expression for ribosome biogenesis and pathogen response genes. Finally, analyses of genes involved in tuber formation suggest that the regulation of CO gene transcription is not the only factor enabling tuberization under long days in modern cultivated potato.
Conclusions
This study not only provides high quality diel transcriptomic datasets of cultivated potato but also provides important insight into the role of allelic diversity in rhythmic expression in plants.
Principal Investigator Contact Information
Name: Eva M. Farre
Institution: Michigan State University
Email: farre@msu.edu
Dataset Overview
This repository contains the large datasets used in the manuscript titled 'High resolution diel transcriptomes of autotetraploid potato reveal expression and sequence conservation among rhythmic genes' by Feke et al., accepted in BMC Genomics. The code used to generate the plots in the publication can be found here: https://github.com/efarre/Autotetraploid_potato_diel_transcriptome/tree/main.
This repository contains the allelic groups for S. tuberosum cv. Atlantic used in this study (Dataset S1). It contains the normalized expression of S. tuberosum cv. Atlantic leaf tissue under short and long days and of tuber tissue under short days (Dataset S2), as well as the normalized expression of the Atlantic Developmental Gene Expression Atlas (Datasets S3 and S4). It also includes the rhythmic expression analysis results under diel conditions for leaf and tuber tissue (Datasets S5, S6, S7). The repository also contains the pairwise allelic expression correlations of the diel expression datasets (Datasets S8 and S9) and the Atlantic Developmental Gene Expression Atlas (Datasets S10 and S11), as well as the results of the differential expression analysis between photoperiods in leaf tissue (Dataset S12) and between leaf and tuber tissues (Dataset S13). Dataset S14 contains the MapMan functional annotation results.
Description of the data and file structure
File: Dataset_S1_Allelic_groups.csv
Description: Syntenic allelic groups in Solanum tuberosum cv. Atlantic v3 and S. tuberosum Group Phureja DM 1-3 516 R44 (DM) v6.1.
Variables
- Syntelog: allelic groups
- geneID: gene ID
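To show how the two columns relate, the following minimal sketch groups gene IDs by allelic group. It assumes the file is comma-delimited, has a header row with exactly the column names listed above, and holds one gene per row. The function name and the example IDs are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def load_allelic_groups(handle):
    """Map each Syntelog (allelic group) to the list of gene IDs it contains."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row["Syntelog"]].append(row["geneID"])
    return dict(groups)


# Hypothetical rows for illustration only; real IDs come from the CSV file.
example = io.StringIO(
    "Syntelog,geneID\n"
    "Synt_1,geneA_hap1\n"
    "Synt_1,geneA_hap2\n"
    "Synt_2,geneB_hap1\n"
)
groups = load_allelic_groups(example)

# Number of alleles recorded for each allelic group.
group_sizes = {name: len(ids) for name, ids in groups.items()}
```

To read the real file, pass `open("Dataset_S1_Allelic_groups.csv", newline="")` in place of the `io.StringIO` object.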
File: Dataset_S2_Diel_expression_rlog_long.csv
Description: Gene expression in Atlantic leaves under short (12 h light/12 h dark) and long days (16 h light/8 h dark) and in tubers under short days. Expression was normalized using DEseq and rlog values are provided. Original data from BioProjects PRJNA957457 (short day) and PRJNA1093480 (long day).
Variables
- geneID: based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- ZT: time after dawn
- SD: short day, light period ZT0-ZT12
- LD: long day, light period ZT0-ZT16
- Expression: in rlog (generated using DEseq)
File: Dataset_S3_Expression_of_Tissue_samples_from_the_Developmental_Gene_Expression_Atlas.csv
Description: Gene expression was normalized using DEseq and rlog values are provided. Atlantic Developmental Gene Expression Atlas data were obtained from NCBI under BioProject PRJNA753086. Plant growth and tissue harvest methods are described in doi: https://doi.org/10.1101/2025.06.26.661617.
Variables
- geneID: based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Sample column labels: Samplename_replicate. Abbreviations:
- R#: replicate number
- TuberS#: tuber developmental stages S1-S4 were collected for this experiment.
- YL: young leaf
- ImmFruit: immature fruit.
File: Dataset_S4_Expression_of_Stress_samples_from_the_Developmental_Gene_Expression_Atlas.csv
Description: Gene expression was normalized using DEseq and rlog values are provided. Atlantic Developmental Gene Expression Atlas data were obtained from NCBI under BioProject PRJNA753086. Plant growth and tissue harvest methods are described in doi: https://doi.org/10.1101/2025.06.26.661617.
Variables
- geneID: based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Sample column labels: SampleName_replicate. Abbreviations:
- R#: replicate number
- Meja: methyl jasmonate
- BTH: benzothiadiazole
File: Dataset_S5_Leaf_short_day_cycling_parameters_as_determined_per_JTK.csv
Description: Rhythmic parameters determined using JTK implemented in MetaCycle using rlog normalized data from short day time course in leaf tissue.
Variables
- CycID: gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- BH.Q: Benjamini–Hochberg q-value
- ADJ.P: Adjusted p-value
- PER: period
- LAG: phase
- AMP: amplitude
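A typical first step with the JTK output tables (Datasets S5 to S7) is to keep genes that pass a q-value cutoff. The sketch below assumes a comma-delimited file with the column names listed above. The 0.05 cutoff is an illustrative assumption, not necessarily the threshold used in the manuscript, and the function name and example rows are hypothetical.

```python
import csv
import io


def rhythmic_genes(handle, q_cutoff=0.05):
    """Return {CycID: (PER, LAG, AMP)} for rows with BH.Q below q_cutoff."""
    hits = {}
    for row in csv.DictReader(handle):
        try:
            q = float(row["BH.Q"])
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric q-value
        if q < q_cutoff:
            hits[row["CycID"]] = (
                float(row["PER"]),
                float(row["LAG"]),
                float(row["AMP"]),
            )
    return hits


# Hypothetical rows for illustration only.
example = io.StringIO(
    "CycID,BH.Q,ADJ.P,PER,LAG,AMP\n"
    "gene1,0.001,0.0005,24,6,1.2\n"
    "gene2,0.8,0.6,20,14,0.1\n"
    "gene3,NA,NA,NA,NA,NA\n"
)
hits = rhythmic_genes(example)
```

Here only `gene1` passes the cutoff; `gene3` is skipped because its q-value is not numeric.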
File: Dataset_S6_Leaf_long_day_cycling_parameters_as_determined_per_JTK.csv
Description: Rhythmic parameters determined using JTK implemented in MetaCycle using rlog normalized data from long day time course in leaf tissue.
Variables
- CycID: gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- BH.Q: Benjamini–Hochberg q-value
- ADJ.P: Adjusted p-value
- PER: period
- LAG: phase
- AMP: amplitude
File: Dataset_S7_Tuber_short_day_cycling_parameters_as_determined_per_JTK.csv
Description: Rhythmic parameters determined using JTK implemented in MetaCycle using rlog normalized data from short day time course in tuber tissue.
Variables
- CycID: gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- BH.Q: Benjamini–Hochberg q-value
- ADJ.P: Adjusted p-value
- PER: period
- LAG: phase
- AMP: amplitude
File: Dataset_S8_Pairwise_allelic_expression_correlations_in_short_days.csv
Description: The Pearson correlation of the z-scored expression values (rlog) was calculated for each pair of alleles. Leaf short day expression data from Dataset S2 was used to generate these correlations.
Variables
- Syntelog: allelic group
- Haplotype_1: allele 1, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Haplotype_2: allele 2, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Correlation: Pearson correlation. No correlation value is provided for pairs in which one haplotype is not expressed.
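The correlation described for Datasets S8 to S11 can be illustrated with a small, self-contained sketch. It is not the authors' code: the function names are hypothetical, and treating a flat (zero-variance) profile as "not expressed" is an assumption made here so that the example returns no value in that case. With population z-scores, the Pearson correlation equals the mean of the products of the paired z-scores.

```python
import math


def zscore(values):
    """Standardize values to mean 0 and population standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return None  # flat profile; assumed here to mean "not expressed"
    return [(v - mean) / sd for v in values]


def allelic_correlation(expr_1, expr_2):
    """Pearson correlation of two z-scored profiles, or None if undefined."""
    if len(expr_1) != len(expr_2):
        raise ValueError("profiles must cover the same samples")
    z1, z2 = zscore(expr_1), zscore(expr_2)
    if z1 is None or z2 is None:
        return None
    return sum(a * b for a, b in zip(z1, z2)) / len(z1)


# Hypothetical rlog profiles for two alleles across four time points.
in_phase = allelic_correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
anti_phase = allelic_correlation([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
undefined = allelic_correlation([1.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0, 5.0])
```

Because the Pearson correlation is unchanged by shifting and scaling each profile, z-scoring beforehand does not alter the coefficient; the two steps are kept separate here only to mirror the description.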
File: Dataset_S9_Pairwise_allelic_expression_correlations_in_long_days.csv
Description: The Pearson correlation of the z-scored expression values (rlog) was calculated for each pair of alleles. Leaf long day expression data from Dataset S2 was used to generate these correlations.
Variables
- Syntelog: allelic group
- Haplotype_1: allele 1, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Haplotype_2: allele 2, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Correlation: Pearson correlation. No correlation value is provided for pairs in which one haplotype is not expressed.
File: Dataset_S10_Pairwise_allelic_expression_correlations_in_Tissue_samples_from_the_Developmental_Gene_Expression_Atlas.csv
Description: The Pearson correlation of the z-scored expression values (rlog) was calculated for each pair of alleles. The rlog expression values are from Dataset S3. Atlantic Developmental Gene Expression Atlas data were obtained from NCBI under BioProject PRJNA753086. Plant growth and tissue harvest methods are described in doi: https://doi.org/10.1101/2025.06.26.661617.
Variables
- Syntelog: allelic group
- Haplotype_1: allele 1, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Haplotype_2: allele 2, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Correlation: Pearson correlation. No correlation value is provided for pairs in which one haplotype is not expressed.
File: Dataset_S11_Pairwise_allelic_expression_correlations_in_Stress_samples_from_the_Developmental_Gene_Expression_Atlas.csv
Description: The Pearson correlation of the z-scored expression values (rlog) was calculated for each pair of alleles. The rlog expression values are from Dataset S4. Atlantic Developmental Gene Expression Atlas data were obtained from NCBI under BioProject PRJNA753086. Plant growth and tissue harvest methods are described in doi: https://doi.org/10.1101/2025.06.26.661617.
Variables
- Syntelog: allelic group
- Haplotype_1: allele 1, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Haplotype_2: allele 2, gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- Correlation: Pearson correlation. No correlation value is provided for pairs in which one haplotype is not expressed.
File: Dataset_S12_Differential_expression_short_vs._long_day_determined_by_DEseq.csv
Description: Differential gene expression between short and long days in leaves as determined by DEseq. For this analysis, the time component of RNAseq samples was disregarded. Note that genes with counts below a threshold value or with outlier expression values have no output in DEseq (empty cells).
Variables
- V1: gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- baseMean: The average of the normalized count values, divided by size factors, taken over all samples.
- log2FoldChange: The effect size estimate. This value indicates how much the gene or transcript's expression seems to have changed between the comparison and control groups. This value is reported on a logarithmic scale to base 2.
- lfcSE: The standard error estimate for the log2 fold change estimate
- stat: The value of the test statistic for the gene or transcript.
- pvalue: P-value of the test for the gene or transcript.
- padj: Adjusted P-value for multiple testing for the gene or transcript.
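Because some genes have empty cells, code that reads the differential expression tables (Datasets S12 and S13) needs to skip rows without values. The following sketch assumes a comma-delimited file with the column names listed above. The cutoffs (`padj < 0.05`, absolute log2 fold change of at least 1) are illustrative assumptions, not values from the manuscript, and the function name and example rows are hypothetical.

```python
import csv
import io


def significant_genes(handle, padj_cutoff=0.05, min_abs_lfc=1.0):
    """Return (gene ID, log2FoldChange, padj) for rows passing both cutoffs."""
    hits = []
    for row in csv.DictReader(handle):
        try:
            padj = float(row["padj"])
            lfc = float(row["log2FoldChange"])
        except (TypeError, ValueError):
            continue  # empty cell: no DEseq output for this gene
        if padj < padj_cutoff and abs(lfc) >= min_abs_lfc:
            hits.append((row["V1"], lfc, padj))
    return hits


# Hypothetical rows for illustration only; g3 mimics a gene with empty cells.
example = io.StringIO(
    "V1,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj\n"
    "g1,100,2.0,0.3,6.6,1e-10,1e-8\n"
    "g2,50,0.2,0.3,0.6,0.5,0.7\n"
    "g3,1,,,,,\n"
    "g4,80,-1.5,0.4,-3.75,0.0002,0.01\n"
)
hits = significant_genes(example)
```

The sign of `log2FoldChange` depends on which condition was used as the reference, which is not stated here, so check the manuscript or the linked code before interpreting the direction.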
File: Dataset_S13_Differential_expression_leaf_vs._tuber_under_short_days_determined_by_DEseq.csv
Description: Differential gene expression between leaf and tuber in short days as determined by DEseq. For this analysis, the time component of RNAseq samples was disregarded. Note that genes with counts below a threshold value or with outlier expression values have no output in DEseq (empty cells).
Variables
- V1: gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- baseMean: The average of the normalized count values, divided by size factors, taken over all samples.
- log2FoldChange: The effect size estimate. This value indicates how much the gene or transcript's expression seems to have changed between the comparison and control groups. This value is reported on a logarithmic scale to base 2.
- lfcSE: The standard error estimate for the log2 fold change estimate
- stat: The value of the test statistic for the gene or transcript.
- pvalue: P-value of the test for the gene or transcript.
- padj: Adjusted P-value for multiple testing for the gene or transcript.
File: Dataset_S14_Functional_annotation_of_Atlantic_using_MapMan.txt
Description: Mercator4 v7.0 (www.plabipd.de/mercator_main.html) and the S. tuberosum cv. Atlantic v3 high confidence representative gene models from SpudDB were used.
Variables
- BINCODE: Code for functional group
- NAME: Name of functional group
- IDENTIFIER: gene ID based on Atlantic Genome Assembly (v3) (https://spuddb.uga.edu)
- DESCRIPTION: description of functional group
- TYPE: The type of the item (T = Transcript, M = Metabolite, P = Protein, E = Enzyme)
