Parasitic flowering plants are among the most destructive agricultural pests and have a major impact on crop yields throughout the world. Being dependent on finding a host plant for growth, parasitic plants penetrate their host using specialized organs called haustoria. Haustoria establish vascular connections with the host, which enable the parasite to steal nutrients and water. The underlying molecular and developmental basis of parasitism by plants is largely unknown. To investigate the process of parasitism, RNAs from different stages, i.e., seed, seedling, vegetative strand, prehaustoria, haustoria, and flowers, were used to de novo assemble and annotate the transcriptome of the obligate plant stem parasite Cuscuta pentagona (dodder). The assembled transcriptome was used to dissect transcriptional dynamics during dodder development and parasitism and to identify key gene categories involved in the process of plant parasitism. Host plant infection is accompanied by increased expression of parasite genes underlying transport and transporter categories, response to stress and stimuli, as well as genes encoding enzymes involved in cell wall modifications. By contrast, expression of photosynthetic genes is decreased in the dodder infective stages compared to normal stem. In addition, genes relating to biosynthesis, transport and response of phytohormones, such as auxin, gibberellins and strigolactone, were differentially expressed in the dodder infective stages compared to stem and seedlings. This analysis sheds light on the transcriptional changes that accompany plant parasitism and will aid in identifying potential gene targets for use in controlling infestation of crops by parasitic weeds.
Supplemental_figures S1 - S7
Supplemental Figure S1. Flow-chart showing steps in dodder transcriptome assembly and annotation, and downstream transcript clustering and differential expression analysis.
Supplemental Figure S2. Transcript size distribution for Dodder_all_transcriptome.
Supplemental Figure S3. Expression of non-annotated transcripts as detected by RT-PCR in dodder stems.
Supplemental Figure S4. Pie charts for multilevel GO distribution of annotated transcripts in three categories: biological processes (A), cellular components (B) and molecular function (C).
Supplemental Figure S5. Histogram representation of GOslim classification in three categories: biological processes (A), molecular function (B) and cellular components (C).
Supplemental Figure S6. Distribution of transcripts annotated as enzymes among different enzyme classes.
Supplemental Figure S7. Multidimensional scaling (MDS) plot of all replicates of each dodder tissue used for transcriptome assembly and, subsequently, transcript clustering and differential expression analysis.
Supplemental_figures.pdf
Supplemental_Tables I - III
Supplemental Table I. Distribution of percent length coverage for the top matching UniProt database entries.
Supplemental Table II. Size statistics and percentage of read mapping to annotated and un-annotated transcripts.
Supplemental Table III. Primers used in RT-PCR analysis.
Supplemental_Tables.pdf
Supplemental Dataset 1: Dodder_all_transcriptome.fa
Supplemental Dataset 1. Sequences of all transcripts of Dodder_all_transcriptome in FASTA format. The file can also be downloaded as a FASTA file at http://de.iplantcollaborative.org/dl/e00e9ea8-6aad-439b-a1fa-49dd3939693d. The transcripts were named as Cpent_contig plus a serial number. The Trinity identifiers and length of each transcript are also included in the header description. These transcripts have also been deposited at DDBJ/EMBL/GenBank under the accession GAON00000000.
Dodder_all_transcriptome.fa
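As an illustration of how the FASTA datasets might be read, the following is a minimal sketch in plain Python. It assumes only that the first whitespace-delimited token of each header is the transcript name and that the remainder is free-text description; the exact header layout is not specified here, and the records in the usage lines are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def split_header(header: str) -> Tuple[str, str]:
    """Split a FASTA header (without '>') into (identifier, description)."""
    parts = header.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, description, sequence) for each FASTA record."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield split_header(header) + ("".join(chunks),)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield split_header(header) + ("".join(chunks),)


# Hypothetical records; real headers may be laid out differently.
demo_lines = [
    ">Cpent_contig_1 hypothetical_trinity_id len=8",
    "ACGT",
    "ACGT",
    ">Cpent_contig_2",
    "GG",
]
demo_records = list(read_fasta(demo_lines))
```

To read the downloaded file instead of the demo lines, an open file handle can be passed directly, e.g. `read_fasta(open("Dodder_all_transcriptome.fa"))`.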
Supplemental Dataset 2: Dodder_final_transcriptome.fa
Supplemental Dataset 2. Sequences of all transcripts of Dodder_final_transcriptome in FASTA format, obtained after filtering and clustering of transcripts from Dodder_all_transcriptome.fa as described in materials and methods. The FASTA file can be downloaded at http://de.iplantcollaborative.org/dl/a85e3682-4315-43e8-9f9d-597908616b4a. The Trinity identifiers and length of each transcript are included in header description.
Dodder_final_transcriptome.fa
Supplemental Dataset 3: Dodder_predicted_CDS.fa
Supplemental Dataset 3. Sequences of all predicted ORFs in FASTA format from Dodder_final_transcriptome. The FASTA file can also be downloaded at http://de.iplantcollaborative.org/dl/f3eb4e4a-05b8-4db4-94f2-87fb77c1622a. The ORFs were named as Cpent_putative_CDS plus a serial number. The Trinity identifiers and type of ORFs are also included in the header description.
Dodder_predicted_CDS.fa
Supplementary Dataset 4: Annotation of transcripts from Dodder_final_transcriptome
Supplemental Dataset 4. Combined annotation of dodder transcripts obtained from BLASTX against nr and TAIR10 database. Column titles: (1) Contig_id, (2) Trinity identifier, (3) length of contig, (4) sequence description from Blast2GO, (5) top hit sequence ID from nr database, (6) top hit gene ID from TAIR10, (7) top hit gene name from TAIR10 and (8) gene description from TAIR10.
Supplementary Dataset 4.txt
Supplementary Dataset 5: GO chart data
Supplemental Dataset 5. GO chart data, i.e., number of annotated transcripts in each GO-category along with GO-level, score and parent GO-terms for each category under biological process (sheet 1), molecular function (sheet 2) and cellular component (sheet 3). Column titles: (1) GO-level, (2) GO-term (Accession), (3) GO-term (Name), (4) Number of contigs belonging to the GO-term, (5) Score, (6) Parent GO-term (Accession) and (7) Parent GO-term (Name).
Supplementary Dataset 5.xls
Supplementary Dataset 6: GO-ids for annotated dodder transcripts
Supplemental Dataset 6. GO-ids for all annotated dodder transcripts: all GO-ids (sheet1) and GO-slim ids (sheet2). Column titles: (1) Contig ID, (2) Trinity identifier and (3) GO-term (Accession).
Supplementary Dataset 6.xlsx.zip
Supplementary Dataset 7: Enzyme code distribution from KEGG
Supplemental Dataset 7. Enzyme code distribution from KEGG for all annotated dodder transcripts. Column titles: (1) Enzymatic Pathway, (2) Number of sequences in the pathway, (3) Enzyme class, (4) Enzyme ID/EC (Enzyme Code) number, (5) Number of annotated contigs in the class, (6) Trinity identifiers of the contigs and (7) Enzymatic Pathway ID.
Supplementary Dataset 7.txt
Supplementary Dataset 8: Normalized RSEM-estimated counts
Supplemental Dataset 8. Normalized RSEM-estimated counts for all replicates of each dodder tissue used for transcript clustering and differential expression analysis. There were eight replicates for stems, prehaustoria, haustoria and flowers (four replicates each from dodder grown on two host plants, tomato and tobacco) and four replicates for seeds and seedlings. Column titles: (1) Contig ID, (2) Trinity identifier, (3 - 10) eight replicates of flowers, (11 - 18) eight replicates of haustoria, (19 - 26) eight replicates of prehaustoria, (27 - 30) four replicates of seedlings, (31 - 34) four replicates of seeds and (35 - 42) eight replicates of stem.
Supplementary Dataset 8.txt
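The column layout of Supplemental Dataset 8 can be written down as a small lookup table. The sketch below, in plain Python, maps each tissue to its 1-based column range as listed in the description and averages the replicates for one row. It assumes the row has already been split into its 42 fields (the delimiter of the .txt file is not stated here), and the function name and demo row are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence

# 1-based, inclusive column ranges taken from the dataset description.
TISSUE_COLUMNS = {
    "flowers": (3, 10),
    "haustoria": (11, 18),
    "prehaustoria": (19, 26),
    "seedlings": (27, 30),
    "seeds": (31, 34),
    "stem": (35, 42),
}


def tissue_means(fields: Sequence[str]) -> Dict[str, float]:
    """Return the mean normalized count per tissue for one 42-field row."""
    if len(fields) != 42:
        raise ValueError(f"expected 42 fields, got {len(fields)}")
    result = {}
    for tissue, (first, last) in TISSUE_COLUMNS.items():
        # Convert the 1-based inclusive range to a Python slice.
        values = [float(x) for x in fields[first - 1:last]]
        result[tissue] = mean(values)
    return result


# Made-up row: two identifier columns followed by 40 replicate values.
demo_row = (
    ["contig_id", "trinity_id"]
    + ["1"] * 8 + ["2"] * 8 + ["3"] * 8
    + ["4"] * 4 + ["5"] * 4 + ["6"] * 8
)
demo_means = tissue_means(demo_row)
```

Keeping the ranges in one table makes the eight-versus-four replicate difference between tissues explicit and avoids hard-coded offsets elsewhere.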
Supplementary Dataset 9: Enriched GO-categories for PCA-SOM clusters
Supplemental Dataset 9. Enriched GO-categories, both overall GO and GOslim, for all clusters generated from principal component analysis with self-organizing map. Clusters 1 to 12 are shown sequentially on sheets 1 to 12.
Supplementary Dataset 9.xls
Supplementary Dataset 10: Differentially expressed transcripts for each pair-wise comparison
Supplemental Dataset 10. Differentially expressed transcripts (logFC ≥ 1; FDR < 0.05) for each pair-wise comparison for all dodder tissues: flowers_vs_haustoria (sheet 1); flowers_vs_prehaustoria (sheet 2); flowers_vs_seedlings (sheet 3); flowers_vs_seeds (sheet 4); flowers_vs_stems (sheet 5); haustoria_vs_prehaustoria (sheet 6); haustoria_vs_seedlings (sheet 7); haustoria_vs_seeds (sheet 8); haustoria_vs_stems (sheet 9); prehaustoria_vs_seedlings (sheet 10); prehaustoria_vs_seeds (sheet 11); prehaustoria_vs_stems (sheet 12); seedlings_vs_seeds (sheet 13); seedlings_vs_stems (sheet 14) and seeds_vs_stems (sheet 15). Column titles: (1) Contig ID, (2) Trinity identifier, (3) logFC (FoldChange), (4) logCPM (Counts Per Million), (5) PValue, (6) FDR (False discovery rate), (7) Sequence description from Blast2GO, (8) Top hit sequence ID from nr database, (9) Top hit gene ID from TAIR10, (10) Top hit gene name from TAIR10 and (11) Top hit gene description from TAIR10.
Supplementary Dataset 10.xlsx.zip
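A cutoff of this kind could be re-applied to rows exported from one of the sheets roughly as follows. This is a sketch under stated assumptions, not the procedure used to build the dataset: it assumes the logFC threshold of 1 is meant as "at least 1" and is applied to the absolute value (so that both up- and down-regulated transcripts pass), and that each row has been exported as a list of the 11 columns in the order given above. The function names and demo rows are hypothetical.

```python
from typing import Iterable, List, Sequence

LOGFC_COLUMN = 3  # 1-based position of logFC in the sheet
FDR_COLUMN = 6    # 1-based position of FDR in the sheet


def passes_cutoff(log_fc: float, fdr: float,
                  min_abs_log_fc: float = 1.0,
                  max_fdr: float = 0.05) -> bool:
    """Assumed rule: |logFC| >= threshold and FDR strictly below the limit."""
    return abs(log_fc) >= min_abs_log_fc and fdr < max_fdr


def filter_rows(rows: Iterable[Sequence[str]]) -> List[Sequence[str]]:
    """Keep only rows whose logFC and FDR columns pass the cutoff."""
    kept = []
    for row in rows:
        log_fc = float(row[LOGFC_COLUMN - 1])
        fdr = float(row[FDR_COLUMN - 1])
        if passes_cutoff(log_fc, fdr):
            kept.append(row)
    return kept


def make_row(name: str, log_fc: str, fdr: str) -> List[str]:
    """Build a made-up 11-column row; unused columns are left empty."""
    row = [""] * 11
    row[0] = name
    row[LOGFC_COLUMN - 1] = log_fc
    row[FDR_COLUMN - 1] = fdr
    return row


demo_rows = [
    make_row("up", "1.5", "0.01"),
    make_row("down", "-2.0", "0.04"),
    make_row("small_change", "0.5", "0.001"),
    make_row("not_significant", "3.0", "0.2"),
]
demo_kept = [row[0] for row in filter_rows(demo_rows)]
```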
Supplementary Dataset 11: Up-regulated and down-regulated transcripts at prehaustorial stage
Supplemental Dataset 11. Shared up-regulated and down-regulated transcripts at prehaustorial stage compared to seedlings and stems and associated enriched GO categories: up-regulated transcripts (sheet 1); down-regulated transcripts (sheet 2); enriched GO-terms in up-regulated transcripts (sheet 3) and enriched GO-terms in down-regulated transcripts (sheet 4). Column titles for sheet 1 and sheet 2: (1) Contig ID, (2) Trinity identifier, (3) Sequence description from Blast2GO, (4) gene name from TAIR10 and (5) gene description from TAIR10.
Supplementary Dataset 11.xls
Supplementary Dataset 12: Up-regulated and down-regulated transcripts at haustorial stage
Supplemental Dataset 12. Shared up-regulated and down-regulated transcripts at haustorial stage compared to prehaustoria, seedlings and stems and associated enriched GO-categories: up-regulated transcripts (sheet 1); down-regulated transcripts (sheet 2); enriched GO-terms in up-regulated transcripts (sheet 3) and enriched GO-terms in down-regulated transcripts (sheet 4). Column titles for sheet 1 and sheet 2: (1) Contig ID, (2) Trinity identifier, (3) Sequence description from Blast2GO, (4) gene name from TAIR10 and (5) gene description from TAIR10.
Supplementary Dataset 12.xls
Supplementary Dataset 13: Enriched GO terms for differentially expressed transcripts
Supplemental Dataset 13. GO terms enriched for shared transcripts up-regulated or down-regulated for each dodder developmental stage compared to all other stages: seeds (sheet 1); seedlings (sheet 2); stem (sheet 3); prehaustoria (sheet 4); haustoria (sheet 4) and flowers (sheet 5).
Supplementary Dataset 13.xls
Supplementary Dataset 14: Non-annotated transcripts differentially expressed in dodder infective stages
Supplemental Dataset 14. List of non-annotated transcripts showing differential expression (logFC ≥ 1; FDR < 0.05) in prehaustorial and haustorial stages compared to seedlings and stems. Column titles: (1) Contig ID, (2) Trinity identifier, (3) length of contig and (4) ORF prediction.
Supplementary Dataset 14.xls
Supplementary Dataset 15: Transcripts underlying GO-terms transport (GO:0006810) and transporter (GO:0005215) showing increased expression in prehaustoria and haustoria
Supplemental Dataset 15. Transcripts underlying GO-terms transport (GO:0006810) and transporter (GO:0005215) represented among up-regulated genes in prehaustorial stage compared to seedlings and stems (sheet 1), and in haustorial stage compared to prehaustoria, seedlings and stems (sheet 2).
Supplementary Dataset 15.xls