CANNABACEAE DATA PACKAGE FROM FU ET AL. (JSE)

Title: Phylogenomic analysis of the hemp family (Cannabaceae) reveals deep cyto-nuclear discordance and provides new insights into generic relationships

This package contains datasets and software outputs (e.g., data processing results, FASTA files, alignments, and trees):

#########################################

Data_Processing. This folder contains the output of "paralog_retriever.py", paralog statistics, and the result files from the two strategies applied to paralog processing.

   strategy_1. For each gene with possible paralogs, we aligned the sequences using MAFFT and inspected the results in GENEIOUS. Specifically, we visualized each gene alignment (containing possible paralogs), compared the sequences of each sample with those of its closely related species, and retained as the putative ortholog the sequence with the highest percent identity to the sequences of its relatives. After paralog processing, each gene alignment was further pruned with trimAl to remove poorly aligned columns.

   strategy_2. We directly excluded from each gene any species/sample with two or more copies. This is probably the most certain way to exclude paralogs, and it does not result in too much missing data (on average, possible paralogs affected only two species per gene and two genes per species). After paralog processing, each gene alignment was further pruned with trimAl to remove poorly aligned columns.

   (NOTE: Because the concatenation and species-tree methods gave similar phylogenetic results for these two nuclear datasets, only the results from the strategy_1 dataset are presented below and were used for subsequent analyses.)

1_alignments. This folder contains 82 aligned chloroplast genes (*.fasta) and 90 nuclear alignments cleaned with trimAl (*.fas).

2_trees. This folder contains all trees used in this study (*.tre).

   1) nuc_genetrees_rr.
   Rooted gene trees of 90 nuclear loci inferred by RAxML with the GTR-GAMMA model and 200 rapid bootstraps (RAxML_bipartitions.*).
   2) nuc_iqtree_concatenation. The concatenated ML tree (90 nuclear genes) inferred by IQ-TREE under the partition model.
   3) nuc_raxml_concatenation. Concatenated ML trees (90 nuclear genes) inferred by RAxML under (A) an unpartitioned GTR-GAMMA model and (B) a partitioned GTR-GAMMA model. An additional concatenated ML tree (49 nuclear genes, each containing more than 90% of the sampled species) was also inferred by RAxML under a partitioned GTR-GAMMA model.
   4) nuc_astral. ASTRAL species trees inferred (A) directly from the gene trees inferred by RAxML and (B) from gene trees in which poorly supported branches (i.e., BS support < 10%) were collapsed prior to the analysis.
   5) cp_raxml_concatenation. Concatenated ML tree (82 chloroplast loci) inferred by RAxML under a partitioned GTR-GAMMA model.
   6) The concatenated ML tree (82 chloroplast loci) inferred by IQ-TREE under the partition model.
   7) dated_tree_treePL. Time-calibrated tree inferred by treePL.
   8) dated_tree_BEAST. Time-calibrated tree inferred by BEAST.

3_treePL. This folder contains the determination of the optimal smoothing value and the configuration file used for the treePL analyses.

4_BEAST. This folder contains the XML input for the BEAST analysis.

5_phyparts. Two phyparts analyses mapping the 90 nuclear gene trees against (A) the nuclear ASTRAL species tree and (B) the chloroplast ML tree.

6_coalescent simulations. This folder contains the pruned nuclear ASTRAL species tree and chloroplast tree, nuclear ASTRAL species trees with branch lengths rescaled by factors of two and four, and 1,000 simulated chloroplast trees.

7_PhyloNet. This folder contains three PhyloNet analyses based on the “11-taxon dataset”, “Random1-11-taxon dataset” and “Random2-11-taxon dataset”.
Each run contains 13-taxon RAxML trees (RAxML_bipartitions.*) for the network analysis and the results of MPL analyses with 1-5 hybridization events.

#########################################

If you have any questions about the datasets, analyses, or results in Cannabaceae_Data_Package.tar.gz, please contact Xiao-Gang Fu (fuxiaogang@mail.kib.ac.cn).
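
Illustrative note on strategy_2. The strategy_2 filter described under Data_Processing (exclude from a gene any sample with two or more copies) can be sketched as below. This is only a minimal sketch, not the script used to build this package. It assumes FASTA headers of the form ">sample" or ">sample.<copy_label>", which is an assumed naming convention; the helper names (parse_fasta, sample_of, drop_multicopy_samples) and the demo sequences are hypothetical. If the headers in a given file follow another convention, pass a different key function.

```python
from collections import Counter
from typing import Callable, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(text: str) -> List[Record]:
    """Return (header, sequence) pairs from FASTA-formatted text."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def sample_of(header: str) -> str:
    """ASSUMED convention: 'sample' or 'sample.<copy_label>'.

    Takes the first whitespace-delimited token and strips everything
    after its last dot, so sample names must not contain dots.
    """
    tokens = header.split()
    name = tokens[0] if tokens else ""
    return name.rsplit(".", 1)[0]


def drop_multicopy_samples(
    records: List[Record],
    key: Callable[[str], str] = sample_of,
) -> List[Record]:
    """Keep only records whose sample occurs exactly once in this gene."""
    counts = Counter(key(header) for header, _ in records)
    return [(h, s) for h, s in records if counts[key(h)] == 1]


if __name__ == "__main__":
    # Made-up demo data: one sample has two copies and is dropped entirely.
    demo = (
        ">Cannabis_sativa\nACGT\n"
        ">Celtis_sinensis.main\nACGA\n"
        ">Celtis_sinensis.0\nACGC\n"
        ">Humulus_lupulus\nAC\nGG\n"
    )
    for header, seq in drop_multicopy_samples(parse_fasta(demo)):
        print(f">{header}\n{seq}")
```

The filter removes every copy of a multi-copy sample rather than choosing one, which matches the description of strategy_2. Choosing a single copy by percent identity, as in strategy_1, was done by manual inspection and is not covered by this sketch.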