
Co-expression networks in Chlamydomonas reveal significant rhythmicity in batch cultures and empower gene function discovery

Cite this dataset

Salomé, Patrice; Merchant, Sabeeha (2021). Co-expression networks in Chlamydomonas reveal significant rhythmicity in batch cultures and empower gene function discovery [Dataset]. Dryad.


The unicellular green alga Chlamydomonas reinhardtii is a choice reference system for the study of photosynthesis and chloroplast metabolism, cilium assembly and function, lipid and starch metabolism, and metal homeostasis. Despite decades of research, the functions of thousands of genes remain largely unknown, and new approaches are needed to categorically assign genes to cellular pathways. Growing collections of transcriptome and proteome data now allow a systematic approach based on integrative co-expression analysis. We used a dataset comprising 518 deep transcriptome samples derived from 58 independent experiments to identify potential co-expression relationships between genes. We visualized co-expression potential with the R package corrplot, to easily assess co-expression and anti-correlation between genes. We extracted several hundred high-confidence genes at the intersection of multiple curated lists involved in cilia, cell division, and photosynthesis, illustrating the power of our method. Surprisingly, Chlamydomonas experiments retained a significant rhythmic component across the transcriptome, suggesting an underappreciated variable during sample collection, even in samples collected in constant light. Our results therefore document substantial residual synchronization in batch cultures, contrary to assumptions of asynchrony. We provide step-by-step protocols for the analysis of co-expression across transcriptome data sets from Chlamydomonas and other species to help foster gene function discovery.


For Chlamydomonas, the data set consists of 518 RNA-seq samples derived from 58 independent experiments, most published.

For Arabidopsis, the data sets were downloaded from the AtGenExpress project site and collated into a single file that consisted of 34 Arabidopsis accessions, 16 sets of etiolated seedlings exposed to various light treatments, 36 sets of seedlings exposed to pathogens, 13 cell culture samples, 68 sets each for shoots and roots exposed to various abiotic stresses, 79 developmental samples (72 from shoots or leaves, 7 from roots), and 18 sets each for leaves and roots subjected to iron deficiency, with controls included.

Usage notes

For Chlamydomonas, all genes were retained, and the data were normalized by calculating log2(FPKM+1).
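
As an illustration of the log2(FPKM+1) normalization, the following is a minimal Python sketch rather than the authors' own protocol. The gene names and FPKM values are made up, the helper names `log2_fpkm` and `pearson` are hypothetical, and the use of Pearson correlation as the co-expression measure is an assumption that is not stated in this record.

```python
import math


def log2_fpkm(fpkm_by_gene):
    """Return log2(FPKM + 1) for every gene and sample.

    Adding 1 keeps genes with FPKM = 0 defined (they map to 0).
    """
    return {
        gene: [math.log2(value + 1.0) for value in values]
        for gene, values in fpkm_by_gene.items()
    }


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumption: used here only as one possible co-expression measure.
    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical FPKM values for three made-up genes across four samples.
fpkm = {
    "geneA": [0.0, 1.0, 3.0, 7.0],
    "geneB": [0.0, 1.0, 3.0, 7.0],
    "geneC": [7.0, 3.0, 1.0, 0.0],
}

normalized = log2_fpkm(fpkm)

# Identical profiles are perfectly correlated; mirrored profiles are
# perfectly anti-correlated.
r_ab = pearson(normalized["geneA"], normalized["geneB"])
r_ac = pearson(normalized["geneA"], normalized["geneC"])
```

In this toy example, `geneA` and `geneB` behave as co-expressed genes and `geneA` and `geneC` as anti-correlated genes, the two kinds of relationship that a correlation plot is meant to expose.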

For Arabidopsis, control probes were removed manually prior to normalization.