A supermatrix phylogeny of the world’s bees (Hymenoptera: Anthophila)
Data files
Jun 27, 2023 version files 244.76 MB
- BEE_GENE_ALIGNS.zip (1.86 MB)
- BEE_IQTufbs_mat6b_tplo_1001bin.zip (71.29 MB)
- BEE_IQTufbs_mat6bgen_tempI_tpl10_m.tre (331.18 KB)
- BEE_mat6b_10TCT.phy (169.30 MB)
- BEE_mat6b_fulltree_tplo_tp10.nwk (265.74 KB)
- BEE_mat6b_fulltree.nwk (301.40 KB)
- BEE_mat6b_genera_p8pmAa.treefile (28.56 KB)
- BEE_script_v8.sh (75.91 KB)
- BEE_Supp_file1.xlsx (135.69 KB)
- BEE_Supp_file3.xlsx (1.11 MB)
- BSFPPsu94_TV74c7v2_nn.treefile.pdf (50.58 KB)
- README.md (5.33 KB)
Sep 25, 2023 version files 100.20 MB
- BEE_mat7_520BT.zip (7.40 MB)
- BEE_mat7_fulltree_tplo35_sf20lp.nwk (262.51 KB)
- BEE_mat7_fulltree.nwk (297.34 KB)
- BEE_MAT7_GENE_ALIGNS.zip (2.32 MB)
- BEE_mat7_IQTufbs_tplo_1001bin.nwk (89.54 MB)
- BEE_mat7gen_IQTufbs_tplo_med.tre (334.68 KB)
- BEE_mat7gen_p8pmAa_fst.nwk (35.58 KB)
- README.md (5.48 KB)
Oct 05, 2023 version files 81.19 MB
- BEE_mat7_520BT.zip (7.40 MB)
- BEE_mat7_fulltree_tplo35_sf20lp.nwk (262.51 KB)
- BEE_mat7_fulltree.nwk (297.34 KB)
- BEE_MAT7_GENE_ALIGNS.zip (2.32 MB)
- BEE_mat7_IQTufbs_tplo_1001bin.zip (70.54 MB)
- BEE_mat7gen_IQTufbs_tplo_med.tre (334.68 KB)
- BEE_mat7gen_p8pmAa_fst.nwk (35.58 KB)
- README.md (5.48 KB)
Oct 19, 2023 version files 81.19 MB
- BEE_mat7_520BT.zip (7.40 MB)
- BEE_mat7_fulltree_tplo35_sf20lp.nwk (262.51 KB)
- BEE_mat7_fulltree.nwk (297.34 KB)
- BEE_MAT7_GENE_ALIGNS.zip (2.32 MB)
- BEE_mat7_IQTufbs_tplo_1001bin.zip (70.54 MB)
- BEE_mat7gen_IQTufbs_tplo_med.tre (334.68 KB)
- BEE_mat7gen_p8pmAa_fst.nwk (35.58 KB)
- README.md (5.48 KB)
Nov 22, 2023 version files 82.57 MB
- Almeida75p_TV74j90c7_RAxML_bipartitions.result.pdf (17.16 KB)
- BEE_mat7_520BT.zip (7.40 MB)
- BEE_mat7_fulltree_tplo35_sf20lp.nwk (262.51 KB)
- BEE_mat7_fulltree.nwk (297.34 KB)
- BEE_MAT7_GENE_ALIGNS.zip (2.32 MB)
- BEE_mat7_IQTufbs_tplo_1001bin.zip (70.54 MB)
- BEE_mat7gen_IQTufbs_tplo_med.tre (334.68 KB)
- BEE_mat7gen_p8pmAa_fst.nwk (35.58 KB)
- BEE_script_v9.sh (77.54 KB)
- BEE_Supp_file1.xlsx (138.16 KB)
- BEE_Supp_file3.xlsx (1.10 MB)
- BSFPPsu94_TV74c7v2_nn.treefile.pdf (50.58 KB)
- README.md (5.52 KB)
Nov 29, 2023 version files 82.57 MB
- Almeida75p_TV74j90c7_RAxML_bipartitions.result.pdf (17.16 KB)
- BEE_mat7_520BT.zip (7.40 MB)
- BEE_mat7_fulltree_tplo35_sf20lp.nwk (262.51 KB)
- BEE_mat7_fulltree.nwk (297.34 KB)
- BEE_MAT7_GENE_ALIGNS.zip (2.32 MB)
- BEE_mat7_IQTufbs_tplo_1001bin.zip (70.54 MB)
- BEE_mat7gen_IQTufbs_tplo_med.tre (334.68 KB)
- BEE_mat7gen_p8pmAa_fst.nwk (35.58 KB)
- BEE_script_v9.sh (77.54 KB)
- BEE_Supp_file1.xlsx (138.16 KB)
- BEE_Supp_file3.xlsx (1.10 MB)
- BSFPPsu94_TV74c7v2_nn.treefile.pdf (50.58 KB)
- README.md (5.58 KB)
Abstract
The increasing availability of large molecular phylogenies has provided new opportunities to study the evolution of species traits, their origins and diversification, and biogeography; yet there are limited attempts to synthesise existing phylogenetic information for major insect groups. Bees (Hymenoptera: Anthophila) are a large group of insect pollinators that have a worldwide distribution, and a wide variation in ecology, morphology, and life-history traits, including sociality. For these reasons, as well as their major economic importance as pollinators, numerous molecular phylogenetic studies of family and genus-level relationships have been published, providing an opportunity to assemble a bee ‘tree-of-life’. We used publicly available genetic sequence data, including phylogenomic data, reconciled to a taxonomic database, to produce a concatenated supermatrix phylogeny for the Anthophila comprising 4,586 bee species, representing 23% of species and 82% of genera. At family, subfamily, and tribe levels, support for expected relationships was robust, but between and within some genera, relationships remain uncertain. Within families, sampling of genera ranged from 67–100% but species coverage was lower (17–41%). Our phylogeny mostly reproduces the relationships found in recent phylogenomic studies with a few exceptions. We provide a summary of these differences and the current state of molecular data available and its gaps. We discuss the advantages and limitations of this bee supermatrix phylogeny (available online at beetreeoflife.org), which may enable new insights into long-standing questions about evolutionary drivers in bees, and potentially insects more generally.
METADATA
A supermatrix phylogeny of the world’s bees (Hymenoptera: Anthophila)
Patricia Henriquez-Piskulich, Andrew F. Hugall, Devi Stuart-Fox
*A.F.H. and P.H.-P. contributed equally to this work.
File: Almeida75p_TV74j90c7_RAxML_bipartitions.result.pdf
Description: PDF figure of the RAxML tree of our subset ‘stub’ of the Almeida_75p_completeness UCE data matrix (with bootstrap support values).
File: BEE_mat7_520BT.zip
Description: Full supermatrix of 4,591 species and 44,780 sites in PHYLIP format (compressed). Includes outgroups. Species labels also give the number of nuclear and mitochondrial genes (n#m#) and the family. Information on the data and partitions is in BEE_Supp_file3.
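A quick sanity check after unzipping is to compare the PHYLIP header against the dimensions quoted above. The sketch below assumes the usual PHYLIP convention of a first line holding the number of taxa and the number of sites; the function names are illustrative and not part of BEE_script_v9.sh.

```python
def parse_phylip_header(first_line: str) -> tuple[int, int]:
    """Return (number of taxa, number of sites) from a PHYLIP header line."""
    fields = first_line.split()
    if len(fields) < 2:
        raise ValueError(f"not a PHYLIP header: {first_line!r}")
    return int(fields[0]), int(fields[1])


def check_dimensions(first_line: str, expected_taxa: int, expected_sites: int) -> bool:
    """True if the header matches the expected matrix dimensions."""
    return parse_phylip_header(first_line) == (expected_taxa, expected_sites)


# A header line as it would look for a matrix with the dimensions
# given in the description (4,591 species, 44,780 sites).
print(check_dimensions(" 4591 44780", 4591, 44780))
```

In practice the first line would be read from the extracted file, for example with `open(path).readline()`, where `path` is wherever the archive was unpacked.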
File: BEE_mat7_fulltree.nwk
Description: Tree of all 4,591 species with ultrafast bootstrap (ufbs) node support values, assembled from IQ-Tree analyses of the three family subsets. Includes outgroups. Species labels include gene numbers (n#m#) and family.
File: BEE_mat7_fulltree_tplo35_sf20lp.nwk
Description: Tree of all 4,586 bee species converted to a dated chronogram with treePL, including ufbs node support values. Outgroups removed.
File: BEE_MAT7_GENE_ALIGNS.zip
Description: Folder with the 13 individual ‘gene’ alignments for all taxa in our database, prior to trimming of sites (compressed). FASTA format, with the consensus as the first sequence.
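Because the first record in each alignment is a consensus rather than a taxon, it usually needs to be set aside before counting or analysing sequences. The following is a minimal sketch using only the standard library; the record names in the example are invented placeholders, not labels from the archive.

```python
def read_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (name, sequence) pairs, joining wrapped lines."""
    records: list[tuple[str, str]] = []
    name = None
    chunks: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def split_consensus(records):
    """Separate the first record (assumed to be the consensus) from the taxa."""
    if not records:
        raise ValueError("empty alignment")
    return records[0], records[1:]


# Toy alignment with placeholder names; taxon_B is wrapped over two lines.
example = ">consensus\nACGT\n>taxon_A\nAC-T\n>taxon_B\nA\nCGT\n"
consensus, taxa = split_consensus(read_fasta(example))
print(consensus[0], len(taxa))
```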
File: BEE_mat7_IQTufbs_tplo_1001bin.zip
Description: Chronogram of all 4,586 bee species plus 1,000 treePL-transformed bootstrap samples. The first tree in the file is the maximum likelihood version. Outgroups removed. Labels only show species_genus. This tree set is also available online at beetreeoflife.org.
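Since the first tree in the set differs in kind from the other 1,000, it is worth separating them before any downstream summary. The sketch below assumes the extracted file holds one Newick tree per `;`-terminated statement and that no label contains a semicolon; the toy trees are placeholders.

```python
def split_tree_set(text: str) -> tuple[str, list[str]]:
    """Split multi-tree Newick text into (first tree, remaining trees)."""
    trees = [t.strip() + ";" for t in text.split(";") if t.strip()]
    if not trees:
        raise ValueError("no trees found")
    return trees[0], trees[1:]


# Three toy trees standing in for the ML tree plus two bootstrap samples.
toy = "((a:1,b:1):1,c:2);\n((a:1,c:1):1,b:2);\n((b:1,c:1):1,a:2);\n"
ml_tree, bootstrap_trees = split_tree_set(toy)
print(ml_tree)
print(len(bootstrap_trees))
```

For the real file, `len(bootstrap_trees)` would be expected to equal 1,000 if the description above holds. A dedicated tree library would be the safer choice when labels are quoted or contain comments.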
File: BEE_mat7gen_IQTufbs_tplo_med.tre
Description: TreeAnnotator MCC summary tree of the treePL-transformed bootstrap samples of the genus representative tree. Outgroups removed.
File: BEE_mat7gen_p8pmAa_fst.nwk
Description: The 428 bee genus representative tree, with ufbs node support values. Includes outgroups. Bee taxa labels show genus_speciesfamilysubfamily~tribe.
File: BEE_script_v9.sh
Description: Assorted Bash and R scripts used to assist in supermatrix construction and checking.
File: BEE_Supp_file1.xlsx
Description: Taxonomic database to reconcile nomenclature of molecular data.
Column headers sheet taxonomy_database:
- previous_binom: previous recognised species name
- final_binom: current species name
- ref_comment: reference for nomenclature decision
Column headers sheet systematics:
- Family: family taxonomic rank
- Subfamily: subfamily taxonomic rank
- Tribe: tribe taxonomic rank
- Genus: genus taxonomic rank
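The taxonomy_database sheet described above is effectively a lookup from previous_binom to final_binom. A minimal sketch of applying it is given below, assuming the two columns have already been exported as pairs of strings (reading the .xlsx itself would need a third-party package). The species names are invented placeholders, not entries from the database.

```python
def build_lookup(rows):
    """Map previous_binom -> final_binom, skipping incomplete rows."""
    return {prev.strip(): final.strip() for prev, final in rows if prev and final}


def reconcile(name: str, lookup: dict) -> str:
    """Return the current name, or the input unchanged if it is not listed."""
    name = name.strip()
    return lookup.get(name, name)


# Placeholder rows in (previous_binom, final_binom) order.
rows = [
    ("Oldgenus alpha", "Newgenus alpha"),
    ("Oldgenus beta", "Newgenus beta"),
]
lookup = build_lookup(rows)
print(reconcile("Oldgenus alpha", lookup))
print(reconcile("Othergenus gamma", lookup))
```

Returning unlisted names unchanged is a design choice for illustration; a stricter workflow might flag them for manual checking instead.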
File: BEE_Supp_file3.xlsx
Description: Supermatrix data summary.
Column headers:
- BEE_mat7_520BT: species cumulative count
- label: working label in trees, showing nuclear/mito genes, and family
- final-taxonomy: final species taxonomy
- family: family of the species
- subfamily: subfamily of the species
- genus: genus of the species
- genes: number of data elements (“genes”)
- bases: total number of aligned nucleotide base characters (IUPAC code)
- sites: total length of the aligned supermatrix
- p-sites: proportion of sites with bases
- ex-stub: proportion of sites with bases, ignoring the PG stub
DATA SOURCE->: columns to the right indicate data source
- Phylogenomic stub: source data of the phylogenomic stub
- UCE STUB: source data of the UCE stub
- ArgK: accession number ArgK gene
- CAD: accession number CAD gene
- NaK: accession number NaK gene
- Pol II: accession number Pol II gene
- Wnt-1: accession number Wnt-1 gene
- LW Rh: accession number LW Rh gene
- EF-1a: accession number EF-1a gene
- 28S rDNA: accession number 28S rDNA gene
- 16S rDNA: accession number 16S rDNA gene
- COI: accession number COI gene
- CYTB: accession number CYTB gene
BASES->: columns to the right indicate number of aligned nucleotide base characters (IUPAC code) per source
- PG stub: number of aligned nucleotide base characters (IUPAC code) of the phylogenomic stub
- UCE STUB: number of aligned nucleotide base characters (IUPAC code) of the UCE stub
- ArgK: number of aligned nucleotide base characters (IUPAC code) of the ArgK gene
- CAD: number of aligned nucleotide base characters (IUPAC code) of the CAD gene
- NaK: number of aligned nucleotide base characters (IUPAC code) of the NaK gene
- Pol II: number of aligned nucleotide base characters (IUPAC code) of the Pol II gene
- Wnt-1: number of aligned nucleotide base characters (IUPAC code) of the Wnt-1 gene
- LW Rh: number of aligned nucleotide base characters (IUPAC code) of the LW Rh gene
- EF-1a: number of aligned nucleotide base characters (IUPAC code) of the EF-1a gene
- 28S rDNA: number of aligned nucleotide base characters (IUPAC code) of the 28S rDNA gene
- 16S rDNA: number of aligned nucleotide base characters (IUPAC code) of the 16S rDNA gene
- COI: number of aligned nucleotide base characters (IUPAC code) of the COI gene
- CYTB: number of aligned nucleotide base characters (IUPAC code) of the CYTB gene
- NOTES: notes regarding column headers
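The bases, sites, and p-sites columns listed above are related by a simple ratio: p-sites is bases divided by sites. The sketch below recomputes them for a single aligned sequence. It assumes that any IUPAC nucleotide code (including N) counts as a base and that gaps (`-`) and missing data (`?`) do not; the exact counting rule used for the spreadsheet is not stated here, so treat this as an approximation.

```python
# IUPAC nucleotide codes; whether N was counted in the spreadsheet is an assumption.
IUPAC_CODES = set("ACGTURYSWKMBDHVN")


def count_bases(seq: str) -> int:
    """Number of characters in seq that are IUPAC nucleotide codes."""
    return sum(1 for c in seq.upper() if c in IUPAC_CODES)


def proportion_with_bases(seq: str) -> float:
    """Bases divided by total aligned length (0.0 for an empty sequence)."""
    sites = len(seq)
    return count_bases(seq) / sites if sites else 0.0


row = "ACGT--RY??"  # toy aligned sequence: 6 bases over 10 sites
print(count_bases(row), len(row), proportion_with_bases(row))
```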
BEE_mat7 gene site table: gene site summary of file BEE_mat7 gene site table
- part: section of the file
- sites: total length of the aligned supermatrix
- start: where the section starts
- end: where the section ends
- gene: gene or stub belonging to the section
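The start and end columns of the gene site table can be used to cut one partition out of a supermatrix row. The sketch below assumes the coordinates are 1-based and inclusive, which is a common convention for partition tables but is not confirmed here; the table rows and sequence are toy placeholders.

```python
def slice_partition(seq: str, start: int, end: int) -> str:
    """Return sites start..end of seq, assuming 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"partition {start}-{end} outside sequence of length {len(seq)}")
    return seq[start - 1:end]


# Placeholder rows in (part, start, end, gene) order.
table = [
    ("p1", 1, 4, "geneA"),
    ("p2", 5, 10, "geneB"),
]
row = "AAAACCCCCC"  # toy supermatrix row of 10 sites
for part, start, end, gene in table:
    print(part, gene, slice_partition(row, start, end))
```

If the real coordinates turn out to be 0-based or half-open, only the slice expression needs to change.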
File: BSFPPsu94_TV74c7v2_nn.treefile.pdf
Description: PDF figure of IQ-Tree tree of our composite UCE stub, with bootstrap support. Labels denote taxa from the five original datasets.