Data from: Horizontal transfer of an adaptive chimeric photoreceptor from bryophytes to ferns

 

Files in this package

Content in the Dryad Digital Repository is offered "as is." By downloading files, you agree to the Dryad Terms of Service. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to this data (CC0).

Title IGPD_alignment
Downloaded 23 times
Description Alignment of land plant imidazoleglycerol-phosphate dehydratase (IGPD). Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download IGPD_alignment.fas (33.96Kb)
Title IGPD_ML_tree_raxml
Downloaded 10 times
Description The maximum likelihood tree inferred from "IGPD_alignment.fas", using RAxML with 100 random starting trees. We partitioned the data by codon position, with each partition given a GTR+Γ+I model as suggested by PartitionFinder under the Akaike Information Criterion. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download IGPD_ML_tree_raxml.tre (4.045Kb)
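Partitioning by codon position, as described for the RAxML analyses, amounts to assigning every third alignment column to the same partition. The following is a minimal sketch of that idea, assuming the alignment is in frame and starts at codon position 1 (this listing does not confirm that); `split_by_codon_position` and the toy sequences are hypothetical and are not taken from the deposited scripts.

```python
def split_by_codon_position(alignment):
    """Return three sub-alignments holding codon positions 1, 2 and 3.

    `alignment` maps a sequence label to an aligned nucleotide string.
    Assumption: column 0 of the alignment is a first codon position.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [
        {name: seq[offset::3] for name, seq in alignment.items()}
        for offset in range(3)
    ]


# Toy data, not real IGPD sequences.
toy = {"taxonA": "ATGGCTAAA", "taxonB": "ATGGCCAAG"}
pos1, pos2, pos3 = split_by_codon_position(toy)

# First and second positions interleaved, i.e. the third position excluded.
pos12 = {
    name: "".join(a + b for a, b in zip(pos1[name], pos2[name]))
    for name in toy
}
print(pos1, pos2, pos3, pos12)
```

The same split gives the "third position excluded" and "first and second positions excluded" data sets used for some of the neochrome trees.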
Title IGPD_MLBS_tree_raxml
Downloaded 5 times
Description The maximum likelihood bootstrapping trees from "IGPD_alignment.fas", using RAxML (1000 replicates). We partitioned the data by codon position, with each partition given a GTR+Γ+I model as suggested by PartitionFinder under the Akaike Information Criterion. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download IGPD_MLBS_tree_raxml.tre (1.764Mb)
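Each bootstrap replicate behind a file like this one is built from a pseudo-alignment in which the columns of the original alignment are resampled with replacement. The sketch below illustrates only that resampling step, under the assumption of a standard nonparametric bootstrap; it is not RAxML's implementation, and `bootstrap_alignment` and the toy data are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy data, not real sequences; the seed is fixed only for reproducibility.
toy_alignment = {"taxonA": "ATGGCTAAA", "taxonB": "ATGGCCAAG"}
rng = random.Random(1)
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

A tree would then be inferred from every replicate, and the resulting set of trees is what a bootstrap tree file contains.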
Title PHOT_alignment
Downloaded 11 times
Description Alignment of plant phototropin and neochrome, containing 163 sequences from 106 species. We only included the conserved domains (i.e., LOV1, LOV2 and STK); the domain boundaries were identified by querying each scaffold against the NCBI Conserved Domain Database. Each domain was separately aligned (based on the amino acid sequences) using Muscle, and then concatenated. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_alignment.fas (285.3Kb)
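The "align each domain separately, then concatenate" step can be sketched as follows. This is an illustration only: `concatenate_domains` is a hypothetical helper, the toy blocks are not real LOV1/LOV2/STK data, and padding a missing domain with gaps is an assumption, since the description does not say how incomplete sequences were handled.

```python
def concatenate_domains(domain_alignments, domain_order):
    """Join per-domain alignments into one alignment, domain by domain.

    `domain_alignments` maps a domain name to {label: aligned sequence}.
    Assumption: a sequence lacking a domain is padded with gap characters.
    """
    taxa = sorted({t for d in domain_order for t in domain_alignments[d]})
    concatenated = {t: "" for t in taxa}
    for domain in domain_order:
        block = domain_alignments[domain]
        width = len(next(iter(block.values())))
        for taxon in taxa:
            concatenated[taxon] += block.get(taxon, "-" * width)
    return concatenated


# Toy per-domain alignments (already aligned within each domain).
toy_domains = {
    "LOV1": {"seqA": "MK-", "seqB": "MKL"},
    "LOV2": {"seqA": "GG"},
    "STK": {"seqA": "W", "seqB": "W"},
}
combined = concatenate_domains(toy_domains, ["LOV1", "LOV2", "STK"])
print(combined)
```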
Title PHOT_ML_tree_garli
Downloaded 4 times
Description The maximum likelihood tree inferred from "PHOT_alignment.fas", using Garli with genthreshfortopoterm set to 1,000,000 and 8 independent runs. We partitioned the data by codon position, with each partition given a GTR+Γ+I model as suggested by PartitionFinder under the Akaike Information Criterion. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_ML_tree_garli.tre (11.29Kb)
Title PHOT_ML_tree_codonPhyML
Downloaded 7 times
Description The maximum likelihood tree inferred from "PHOT_alignment.fas", using CodonPhyML. We used the GY model with four categories of non-synonymous/synonymous substitution rate ratios drawn from the discrete gamma distribution, and codon frequencies were estimated from the data under the F3X4 model. The tree topology search was done using the NNI approach, and branch support was estimated using the SH-like aLRT method. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_ML_tree_codonPhyML.tre (11.61Kb)
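Under the F3X4 model, the equilibrium frequency of a codon is taken as proportional to the product of the observed nucleotide frequencies at its three codon positions, renormalized over the sense codons. The sketch below follows that common definition and assumes the standard genetic code and an in-frame alignment; `f3x4_codon_frequencies` and the toy sequences are hypothetical, and this is not CodonPhyML's code.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def f3x4_codon_frequencies(sequences):
    """Estimate F3X4 codon frequencies from in-frame nucleotide sequences."""
    # One nucleotide count table per codon position.
    counts = [dict.fromkeys("ACGT", 0) for _ in range(3)]
    for seq in sequences:
        usable = len(seq) - len(seq) % 3
        for i in range(0, usable, 3):
            codon = seq[i:i + 3].upper()
            if set(codon) <= set("ACGT"):  # skip gaps and ambiguity codes
                for pos, base in enumerate(codon):
                    counts[pos][base] += 1
    freqs = []
    for table in counts:
        total = sum(table.values())
        freqs.append({base: n / total for base, n in table.items()})
    # Product of position-specific frequencies, sense codons only.
    raw = {}
    for codon in map("".join, product("ACGT", repeat=3)):
        if codon not in STOP_CODONS:
            raw[codon] = (
                freqs[0][codon[0]] * freqs[1][codon[1]] * freqs[2][codon[2]]
            )
    norm = sum(raw.values())
    return {codon: value / norm for codon, value in raw.items()}


# Toy data, not real phototropin sequences.
codon_freqs = f3x4_codon_frequencies(["ATGGCTAAA", "ATGGCCAAG"])
print(len(codon_freqs), codon_freqs["ATG"])
```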
Title PHOT_ML_tree_nhPhyml
Downloaded 3 times
Description The maximum likelihood tree inferred from "PHOT_alignment.fas", using nhPhyML. The analysis was carried out with ten discrete categories of GC equilibrium frequencies, and the required starting tree was the best tree from the Garli analysis ("PHOT_ML_tree_garli.tre"). Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_ML_tree_nhPhyml.tre (13.11Kb)
Title PHOT_MLBS_tree_raxml
Downloaded 2 times
Description The maximum likelihood bootstrapping trees from "PHOT_alignment.fas", using RAxML (1000 replicates). We partitioned the data by codon position, with each partition given a GTR+Γ+I model as suggested by PartitionFinder under the Akaike Information Criterion. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_MLBS_tree_raxml.tre (5.5Mb)
Title PHOT_MLBS_tree_nhPhyml
Downloaded 5 times
Description The maximum likelihood bootstrapping trees from "PHOT_alignment.fas", using nhPhyML (1000 replicates). The analysis was carried out with ten discrete categories of GC equilibrium frequencies, and for each replicate, RAxML was used to input the starting tree. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_MLBS_tree_nhPhyml.tre (12.12Mb)
Title PHOT_BI_con_tree_MrBayes
Downloaded 3 times
Description The 50% majority-rule consensus tree from the MrBayes run (25% of the total generations were discarded as burn-in), based on "PHOT_alignment.fas". The analysis was carried out with two independent MCMC runs, four chains each, and trees sampled every 1000 generations. Substitution parameters were unlinked and the rate prior was set to vary among partitions. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_BI_con_tree_MrBayes.tre (112.7Kb)
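Discarding 25% of the generations as burn-in means that only the later three quarters of the sampled trees contribute to the consensus. A minimal sketch of that bookkeeping follows; the run length of 1,000,000 generations is a made-up figure for illustration (the listing does not state it), `discard_burnin` is a hypothetical helper, and MrBayes may round the cut-off differently.

```python
def discard_burnin(samples, burnin_fraction=0.25):
    """Drop the leading fraction of MCMC samples and return the rest."""
    if not 0 <= burnin_fraction < 1:
        raise ValueError("burnin_fraction must be in [0, 1)")
    n_discard = int(len(samples) * burnin_fraction)
    return samples[n_discard:]


# Hypothetical run: one tree sampled every 1000 generations.
sampled_generations = list(range(0, 1_000_000, 1000))
kept = discard_burnin(sampled_generations)
print(len(kept), kept[0])
```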
Title PHOT_BI_chronogram_BEAST
Downloaded 2 times
Description The chronogram of plant phototropin and neochrome, inferred from "PHOT_alignment.fas", using BEAST. A total of 15 tmrca priors were employed as the calibration points (see SI Appendix), and a birth-death speciation prior was used as the tree prior. We used the uncorrelated relaxed-clock model with rates drawn from a lognormal distribution. A starting tree was first estimated by r8s and provided to BEAST to initiate the run. Two independent MCMC runs were carried out and the output was inspected in Tracer to ensure convergence and mixing (effective sample sizes all > 200). The trees from the two runs were combined in LogCombiner with a 25% burn-in and summarized in TreeAnnotator. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHOT_BI_chronogram_BEAST.tre (181.1Kb)
Title PHY_alignment
Downloaded 9 times
Description Alignment of plant phytochrome and neochrome, containing 139 sequences from 76 species. We only included the conserved domains (i.e., PAS, GAF, PHY, PAS repeats, HisKA and HATPase); the domain boundaries were identified by querying each scaffold against the NCBI Conserved Domain Database. Each domain was separately aligned (based on the amino acid sequences) using Muscle, and then concatenated. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHY_alignment.fas (394.0Kb)
Title PHY_ML_tree_garli
Downloaded 5 times
Description The maximum likelihood tree inferred from "PHY_alignment.fas" (translated into amino acids), using Garli with genthreshfortopoterm set to 1,000,000 and 8 independent runs. Using ProtTest (65), JTT + F was found to be the best empirical substitution model under the Akaike Information Criterion. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHY_ML_tree_garli.tre (36.96Kb)
Title PHY_ML_tree_codonPhyML
Downloaded 3 times
Description The maximum likelihood tree inferred from "PHY_alignment.fas", using CodonPhyML. We used the GY model with four categories of non-synonymous/synonymous substitution rate ratios drawn from the discrete gamma distribution, and codon frequencies were estimated from the data under the F3X4 model. The tree topology search was done using the NNI approach, and branch support was estimated using the SH-like aLRT method. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHY_ML_tree_codonPhyML.tre (9.731Kb)
Title PHY_MLBS_tree_raxml
Downloaded 5 times
Description The maximum likelihood bootstrapping trees from "PHY_alignment.fas" (translated into amino acids), using RAxML (1000 replicates) under JTT + F model. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHY_MLBS_tree_raxml.tre (4.524Mb)
Title PHY_BI_con_tree_MrBayes
Downloaded 4 times
Description The 50% majority-rule consensus tree from the MrBayes run (25% of the total generations were discarded as burn-in), based on "PHY_alignment.fas" (translated into amino acids). The analysis was carried out with two independent MCMC runs, four chains each, trees sampled every 1000 generations, and the JTT + F model. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download PHY_BI_con_tree_MrBayes.tre (93.36Kb)
Title NEO_alignment
Downloaded 9 times
Description Alignment of fern and hornwort neochrome.
Download NEO_alignment.fas (200.2Kb)
Title NEO_ML_tree_garil
Downloaded 3 times
Description The maximum likelihood tree inferred from "NEO_alignment.fas", using Garli with genthreshfortopoterm set to 1,000,000 and 8 independent runs. We partitioned the data by codon position, and GTR+Γ+I, GTR+Γ+I, GTR+I models were applied to each codon position respectively. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_ML_tree_garil.tre (3.878Kb)
Title NEO_ML_tree_pos12_garli
Downloaded 3 times
Description The maximum likelihood tree inferred from "NEO_alignment.fas" (third codon position excluded), using Garli with genthreshfortopoterm set to 1,000,000 and 8 independent runs. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_ML_tree_pos12_garli.tre (14.65Kb)
Title NEO_ML_tree_pos3_garli
Downloaded 4 times
Description The maximum likelihood tree inferred from "NEO_alignment.fas" (first and second codon positions excluded), using Garli with genthreshfortopoterm set to 1,000,000 and 8 independent runs. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_ML_tree_pos3_garli.tre (13.57Kb)
Title NEO_ML_tree_codonPhyML
Downloaded 3 times
Description The maximum likelihood tree inferred from "NEO_alignment.fas", using CodonPhyML. We used the GY model with four categories of non-synonymous/synonymous substitution rate ratios drawn from the discrete gamma distribution, and codon frequencies were estimated from the data under the F3X4 model. The tree topology search was done using the NNI approach, and branch support was estimated using the SH-like aLRT method. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_ML_tree_codonPhyML.tre (3.444Kb)
Title NEO_MLBS_tree_pos12_raxml
Downloaded 9 times
Description The maximum likelihood bootstrapping trees from "NEO_alignment.fas", using RAxML (1000 replicates; third codon position excluded). Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_MLBS_tree_pos12_raxml.tre (1.657Mb)
Title NEO_MLBS_tree_pos3_raxml
Downloaded 2 times
Description The maximum likelihood bootstrapping trees from "NEO_alignment.fas", using RAxML (1000 replicates; first and second codon positions excluded). Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_MLBS_tree_pos3_raxml.tre (1.657Mb)
Title NEO_BI_con_tree_MrBayes
Downloaded 4 times
Description The 50% majority-rule consensus tree from the MrBayes run (25% of the total generations were discarded as burn-in), based on "NEO_alignment.fas". The analysis was carried out with two independent MCMC runs, four chains each, and trees sampled every 1000 generations. We partitioned the data by codon position, and GTR+Γ+I, GTR+Γ+I, GTR+I models were applied to the first, second and third codon positions, respectively. Substitution parameters were unlinked and the rate prior was set to vary among partitions. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_BI_con_tree_MrBayes.tre (32.82Kb)
Title BlueDevil
Downloaded 13 times
Description A Python script to extract gene homologs from 1KP transcriptomes. Sequences for the gene of interest are used as tBLASTn queries, and the significant hits to transcriptome scaffolds are extracted. For each scaffold, the best open reading frame is identified, and the sequence is translated into amino acids and then queried by BLASTp against the NCBI non-redundant protein database (nr). Scaffolds are discarded if they do not match the homologs in the nr database at an e-value threshold of <0.001. The filtered scaffolds from the SOAPdenovo and SOAPdenovo-Trans assemblies are then merged using CAP3.
Download BlueDevil.py (34.66Kb)
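The open-reading-frame and e-value steps described for BlueDevil can be illustrated with a short, self-contained sketch. It is not the code in BlueDevil.py: here "best" is assumed to mean the longest stop-free stretch over the six reading frames under the standard genetic code, and `translate`, `best_orf` and `passes_evalue_filter` are hypothetical names.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate in frame 0; codons with non-ACGT characters become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def best_orf(scaffold):
    """Longest stop-free peptide over the six reading frames (assumption)."""
    scaffold = scaffold.upper()
    reverse_complement = scaffold.translate(COMPLEMENT)[::-1]
    best = ""
    for strand in (scaffold, reverse_complement):
        for frame in range(3):
            for peptide in translate(strand[frame:]).split("*"):
                if len(peptide) > len(best):
                    best = peptide
    return best


def passes_evalue_filter(evalue, threshold=0.001):
    """Keep a scaffold only if its best nr hit has an e-value below the threshold."""
    return evalue < threshold


# Toy scaffold, not a real transcriptome sequence.
peptide = best_orf("ATGGCTCTAAAATGG")
print(peptide, passes_evalue_filter(1e-20), passes_evalue_filter(0.5))
```

Searching both strands matters because assembled transcriptome scaffolds can be reported in either orientation.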
Title bluedevil_settings
Downloaded 4 times
Description The configuration file for BlueDevil.py
Download bluedevil_settings.txt (2.325Kb)
Title DomainDivider_phot
Downloaded 11 times
Description A Python script to build the phototropin alignment based on conserved domains (i.e., LOV1, LOV2 and STK). The search results from the NCBI Conserved Domain Database are parsed to identify domain boundaries and extract domain sequences. Each domain is separately aligned (based on the amino acid sequences) using Muscle, and the domain alignments are then concatenated.
Download DomainDivider_phot.py (27.06Kb)
Title DomainDivider_phy
Downloaded 1 time
Description A Python script to build the phytochrome alignment based on conserved domains (i.e., PAS, GAF, PHY, PAS repeats, HisKA and HATPase). The search results from the NCBI Conserved Domain Database are parsed to identify domain boundaries and extract domain sequences. Each domain is separately aligned (based on the amino acid sequences) using Muscle, and the domain alignments are then concatenated.
Download DomainDivider_phy.py (27.65Kb)
Title DomainDivider_neo
Downloaded 8 times
Description A Python script to extract conserved domains in neochrome (i.e., PAS, GAF, PHY, LOV1, LOV2 and STK). The search results from the NCBI Conserved Domain Database are parsed to identify domain boundaries and extract domain sequences.
Download DomainDivider_neo.py (20.00Kb)
Title Readme
Downloaded 7 times
Download readme.txt (12.38Kb)
Title NEO_MLBS_tree_raxml
Downloaded 3 times
Description The maximum likelihood bootstrapping trees from "NEO_alignment.fas", using RAxML (1000 replicates). We partitioned the data by codon position, and GTR+Γ+I, GTR+Γ+I, GTR+I models were applied to each codon position respectively. Alphanumeric codes following species names are the four-letter 1KP transcriptome identifiers, Genbank accessions or both.
Download NEO_MLBS_tree_raxml.tre (1.657Mb)

When using this data, please cite the original publication:

Li F, Villarreal JC, Kelly S, Rothfels CJ, Melkonian M, Frangedakis E, Ruhsam M, Sigel EM, Der JP, Pittermann J, Burge DO, Pokorny L, Larsson A, Chen T, Weststrand S, Thomas P, Carpenter E, Zhang Y, Tian Z, Chen L, Yan Z, Ying Z, Sun X, Wang J, Stevenson DW, Crandall-Stotler BJ, Shaw AJ, Deyholos MK, Soltis DE, Graham SW, Windham MD, Langdale JA, Wong GK-S, Mathews S, Pryer KM (2014) Horizontal transfer of an adaptive chimeric photoreceptor from bryophytes to ferns. Proceedings of the National Academy of Sciences of the United States of America 111(18): 6672–6677. http://dx.doi.org/10.1073/pnas.1319929111

Additionally, please cite the Dryad data package:

Li F (2014) Data from: Horizontal transfer of an adaptive chimeric photoreceptor from bryophytes to ferns. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.fn2rg