Data from: Analysis of the PEBP gene family and identification of a novel FLOWERING LOCUS T orthologue in sugarcane

Cite this dataset

Venail, Julien et al. (2022). Data from: Analysis of the PEBP gene family and identification of a novel FLOWERING LOCUS T orthologue in sugarcane [Dataset]. Dryad.


Sugarcane (Saccharum spp.) is an important economic crop for both sugar and biomass, the yields of which are negatively affected by flowering. The molecular mechanisms controlling flowering in sugarcane are nevertheless poorly understood. RNA-seq data analysis and database searches have enabled a comprehensive description of the PEBP gene family in sugarcane. It is shown to consist of at least 13 FLOWERING LOCUS T (FT)-like genes, two MOTHER OF FT AND TFL (MFT)-like genes, and four TERMINAL FLOWER (TFL)-like genes. As expected, these genes all show very high homology to their corresponding genes in Sorghum, and also to FT-like, MFT-like, and TFL-like genes in maize, rice, and Arabidopsis. Functional analysis in Arabidopsis showed that the sugarcane ScFT3 gene can rescue the late flowering phenotype of the Arabidopsis ft-10 mutant, whereas ScFT5 cannot. High expression levels of ScFT3 in leaves of short day-induced sugarcane plants coincided with initial stages of floral induction in the shoot apical meristem as shown by histological analysis of meristem dissections. This suggests that ScFT3 is likely to play a role in floral induction in sugarcane; however, other sugarcane FT-like genes may also be involved in the flowering process.


Sugarcane reference transcriptome

Sugarcane genotypes SP91-1049 and SP83-2847 were cultivated in the field for 9 months. Samples of the +1 leaf were collected at three different times (7 am, noon and 5 pm), and RNA was extracted using the RNeasy Plant Kit (Qiagen). RNA quality was checked on a 2100 Bioanalyzer (Agilent), and single-end libraries were prepared using an mRNA-Seq Sample Preparation kit following the manufacturer's instructions (Illumina Inc., San Diego, CA, USA). RNA sequencing was performed on an Illumina HiSeq 2500 platform. All reads were quality-checked using the FASTX-Toolkit, and bases at the read ends with quality below Q20 were trimmed. Reads with a BLAST alignment match against yeast, bacterial or ribosomal sequences were also excluded. In the end, a total of 630,770,178 reads were selected. The Trinity assembly pipeline was applied to create a non-redundant dataset of 44,558,361 reads (Grabherr et al., 2011. Nature Biotechnology 29, 644–652). These reads were assembled in several steps using the Velvet and Oases algorithms (Zerbino and Birney, 2008. Genome Research 18, 821–829; Schulz et al., 2012. Bioinformatics 28, 1086–1092). The assembly procedure was based on the concept that different k-mer sizes give different assembly sensitivities. We started with a k-mer of 55 and then re-assembled the transcripts using smaller k-mers (47, 39 and 31) in consecutive independent steps. The input for each step was the set of transcripts assembled in the previous step together with the sequences that step left unused. This approach resulted in 191,871 sugarcane transcript sequences.
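The read-end trimming and the stepwise k-mer schedule described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce this dataset. The function names, the Phred+33 quality encoding, trimming at the 3' end only, and the `assemble` callable (a stand-in for a Velvet/Oases run that returns transcripts and unused sequences) are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) quality encoding


def trim_read_end(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Drop trailing bases whose Phred score is below min_q (3' end only)."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
        end -= 1
    return seq[:end], qual[:end]


# Hypothetical assembler interface: (sequences, k) -> (transcripts, unused sequences).
# In practice this would wrap an external assembler run; here it is only a type.
Assembler = Callable[[List[str], int], Tuple[List[str], List[str]]]


def multi_k_assembly(
    reads: List[str],
    assemble: Assembler,
    kmers: Iterable[int] = (55, 47, 39, 31),
) -> List[str]:
    """Run one assembly per k-mer size, largest first.

    Each step receives the transcripts and the unused sequences
    produced by the previous step as its input.
    """
    pool = list(reads)
    transcripts: List[str] = []
    for k in kmers:
        transcripts, unused = assemble(pool, k)
        pool = transcripts + unused  # input for the next, smaller k-mer
    return transcripts
```

For example, `trim_read_end("ACGTAC", "IIII##")` returns `("ACGT", "IIII")`, because `#` encodes Q2 and `I` encodes Q40 under Phred+33. Passing the assembler in as a callable keeps the schedule independent of any particular tool or command line.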


Biotechnology and Biological Sciences Research Council, Award: 2018/08380-0

São Paulo Research Foundation, Award: 2018/08380-0

São Paulo Research Foundation, Award: 2016/02679-9

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Award: Finance Code 001

National Council for Scientific and Technological Development, Award: 140976/2013-2